Lucene Scoring

Description

Lucene scoring is the heart of the Lucene library. It is blazingly fast and it hides almost all of the complexity from the user. In a nutshell, it works. At least, that is, until it doesn't work, or doesn't work as one would expect. At that point, users are left digging into Lucene internals or asking for help on java-user@lucene.apache.org to figure out why a document with five of the query terms scores lower than a different document with only one of the query terms. While this lesson won't answer specific scoring issues, it will, hopefully, point the reader to the places that help explain the what and why of Lucene scoring.

Description

Scoring is very much dependent on the way documents are indexed, so it is important to understand indexing. It is also assumed that readers know how to use the Searcher.

Following is the syntax declaration for the document. [java]public final class Document extends Object implements Serializable[/java]

Documents are the unit of indexing and search. A Document is a set of fields. Each field has a name and a textual value. A field may be stored with the document, in which case it is returned with search hits on the document. Thus, each document should typically contain one or more stored fields that uniquely identify it.

Following is the syntax declaration for the field. [java]public final class Field extends AbstractField implements Fieldable, Serializable[/java]

A field is a section of a Document. Each field has two parts, a name and a value. Values may be free text, provided as a String or as a Reader, or they may be atomic keywords, which are not further processed. Such keywords may be used to represent dates, URLs, and so on. Fields are optionally stored in the index, so that they may be returned with hits on the document.
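The relationship between documents, fields, and stored values can be sketched in plain Java. This is not the Lucene API: `SimpleDocument`, `SimpleField`, and `storedFields` are hypothetical names used only to illustrate that a document is a set of named fields and that only stored fields come back with a search hit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for Lucene's Document and Field classes.
final class DocumentModelSketch {

    /** A field: a name, a textual value, and a flag saying whether the value is stored. */
    static final class SimpleField {
        final String name;
        final String value;
        final boolean stored;

        SimpleField(String name, String value, boolean stored) {
            this.name = name;
            this.value = value;
            this.stored = stored;
        }
    }

    /** A document: a collection of fields. */
    static final class SimpleDocument {
        private final List<SimpleField> fields = new ArrayList<>();

        SimpleDocument add(SimpleField field) {
            fields.add(field);
            return this;
        }

        /** Returns only the stored fields, i.e. what would be returned with a hit. */
        List<SimpleField> storedFields() {
            List<SimpleField> result = new ArrayList<>();
            for (SimpleField f : fields) {
                if (f.stored) {
                    result.add(f);
                }
            }
            return result;
        }
    }

    /** Builds a document with a stored identifier and a body that is not stored. */
    static SimpleDocument exampleDocument() {
        return new SimpleDocument()
            .add(new SimpleField("id", "doc-42", true))
            .add(new SimpleField("body", "lucene scoring explained", false));
    }

    public static void main(String[] args) {
        for (SimpleField f : exampleDocument().storedFields()) {
            System.out.println(f.name + " = " + f.value);
        }
    }
}
```

Only the `id` field is returned here, which is why a document should carry at least one stored field that uniquely identifies it.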

Description

Lucene scoring uses a combination of the Vector Space Model (VSM) of Information Retrieval and the Boolean model to determine how relevant a given Document is to a user's query. In general, the idea behind the VSM is that the more times a query term appears in a document relative to the number of times the term appears in all the documents in the collection, the more relevant that document is to the query. Following are the levels of boosting.

Document level boosting
Query level boosting
Document's Field level boosting

Document-level boosting is controlled through the setBoost and getBoost methods. Following is the syntax for setBoost. [java]public void setBoost(float boost)[/java] Sets a boost factor for hits on any field of this document. This value is multiplied into the score of all hits on this document. Following is the syntax for getBoost. [java]public float getBoost()[/java] Returns the boost factor set by setBoost.
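As a rough illustration of what "multiplied into the score" means, the sketch below applies a boost directly to a raw score. `BoostedDocument` and `boostedScore` are hypothetical names, not Lucene classes, and the sketch is a simplification: Lucene itself folds the document boost into index-time normalization values rather than multiplying it in at search time.

```java
// Hypothetical sketch of document-level boosting; not the Lucene implementation.
final class BoostSketch {

    /** Stand-in for a document that carries a boost factor. */
    static final class BoostedDocument {
        private float boost = 1.0f; // neutral boost: scores are left unchanged

        void setBoost(float boost) {
            this.boost = boost;
        }

        float getBoost() {
            return boost;
        }
    }

    /** Multiplies the raw score of a hit by the boost of the document it belongs to. */
    static float boostedScore(float rawScore, BoostedDocument doc) {
        return rawScore * doc.getBoost();
    }

    public static void main(String[] args) {
        BoostedDocument doc = new BoostedDocument();
        System.out.println("default boost: " + boostedScore(0.5f, doc));

        doc.setBoost(2.0f); // hits on this document now count twice as much
        System.out.println("boost of 2.0:  " + boostedScore(0.5f, doc));
    }
}
```

A boost above 1.0 raises every hit on the document, and a boost below 1.0 lowers it.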

Description

Creating a CustomScoreQuery is much easier than implementing a complete query. There are a lot of complex details involved in implementing a full-blown Lucene query. So when custom matching behavior is not the goal and you are simply rescoring another Lucene query, CustomScoreQuery is a clear winner, especially considering how often Lucene-based technologies are used for "fuzzy" analysis.
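The idea of rescoring another query without changing what it matches can be sketched without Lucene on the classpath. Everything here is hypothetical: `Rescorer`, `rescore`, and the hit map are illustrative stand-ins, not the CustomScoreQuery API. The point is only that the wrapped query decides which documents match and the custom code decides their final scores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of rescoring the hits of an existing query.
final class RescoreSketch {

    /** Callback that maps a document id and its original score to a new score. */
    interface Rescorer {
        float customScore(int docId, float subQueryScore);
    }

    /** Applies the rescorer to every hit; the set of matching documents is unchanged. */
    static Map<Integer, Float> rescore(Map<Integer, Float> hits, Rescorer rescorer) {
        Map<Integer, Float> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Float> hit : hits.entrySet()) {
            result.put(hit.getKey(), rescorer.customScore(hit.getKey(), hit.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Scores as they might come back from the wrapped query (made-up values).
        Map<Integer, Float> hits = new LinkedHashMap<>();
        hits.put(1, 0.5f);
        hits.put(2, 0.25f);

        // Illustrative rule: double the score of documents with an even id.
        Map<Integer, Float> rescored =
            rescore(hits, (docId, score) -> docId % 2 == 0 ? score * 2.0f : score);

        System.out.println(rescored);
    }
}
```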

Description

Following are the factors used in the scoring algorithm.

Factor	Description
tf (term frequency)	Measure of how often a term appears in the document.
coord	Number of terms in the query that were found in the document.
idf (inverse document frequency)	Measure of how often the term appears across the index.
lengthNorm	Measure of the importance of a term according to the total number of terms in the field.
queryNorm	Normalization factor so that queries can be compared.
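The factors in the table can be combined as in the sketch below. It assumes the formulas of Lucene's classic default similarity (square root for tf, a logarithmic idf, inverse square root for lengthNorm and queryNorm) and leaves out all boosts, so treat it as an approximation of the practical scoring function rather than the library's exact code. The class and parameter names are hypothetical.

```java
// Hypothetical sketch of a tf-idf style score built from the factors in the table.
final class ScoringSketch {

    /** tf: grows with the number of times the term occurs in the document. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** idf: terms that occur in fewer documents of the index weigh more. */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /** lengthNorm: a match in a short field counts more than one in a long field. */
    static double lengthNorm(int numTermsInField) {
        return 1.0 / Math.sqrt(numTermsInField);
    }

    /** coord: fraction of the query terms that were found in the document. */
    static double coord(int overlap, int maxOverlap) {
        return (double) overlap / maxOverlap;
    }

    /** queryNorm: makes scores from different queries comparable. */
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /**
     * Scores one document against a query.
     * termFreqs[i] is how often query term i occurs in the document (0 if absent);
     * docFreqs[i] is how many documents in the index contain query term i.
     */
    static double score(int[] termFreqs, int[] docFreqs, int numDocs, int numTermsInField) {
        double sumOfSquaredWeights = 0.0;
        for (int df : docFreqs) {
            double w = idf(df, numDocs);
            sumOfSquaredWeights += w * w;
        }

        int overlap = 0;
        double sum = 0.0;
        for (int i = 0; i < termFreqs.length; i++) {
            if (termFreqs[i] > 0) {
                overlap++;
                double w = idf(docFreqs[i], numDocs);
                sum += tf(termFreqs[i]) * w * w * lengthNorm(numTermsInField);
            }
        }
        return coord(overlap, termFreqs.length) * queryNorm(sumOfSquaredWeights) * sum;
    }

    public static void main(String[] args) {
        // Two-term query against a 100-term field in a 10-document index (made-up numbers).
        double oneMatch = score(new int[] {1, 0}, new int[] {1, 1}, 10, 100);
        double threeMatches = score(new int[] {3, 0}, new int[] {1, 1}, 10, 100);
        System.out.println("term occurs once:        " + oneMatch);
        System.out.println("term occurs three times: " + threeMatches);
    }
}
```

Because every factor is a plain multiplication, a document matching many common terms in a long field can still score below a document matching one rare term in a short field.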

Key Points

Scoring is very much dependent on the way documents are indexed.
Creating a CustomScoreQuery is much easier than implementing an entire query.
The formula used for scoring is known as the practical scoring function. ... score(q,d).
