Lucene Search Operations

Lucene Search Operation

Searching is one of the core functions that Lucene provides, and its flow resembles that of indexing. A basic search is built from a small set of foundation classes that all search-related operations rely on. IndexSearcher is the most important of them and the core component of the searching process. The following diagram illustrates the searching procedure.

The indexing process

Indexing is the process of converting text data into a format that supports rapid searching. A simple analogy is the index you find at the end of a book: it points you to the location of the topics that appear in the book. Lucene stores the input data in a data structure called an inverted index, which is kept on the file system or in memory as a set of index files. Most web search engines use an inverted index. It lets users perform fast keyword look-ups and find the documents that match a given query. Before text data is added to the index, it is processed by an analyzer.
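The inverted index idea can be sketched in plain Java. This is a conceptual toy, not Lucene's implementation or file format: the class name InvertedIndexSketch and its methods are made up for illustration, and splitting on whitespace plus lowercasing stands in for a real analyzer.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Conceptual sketch only: a toy in-memory inverted index, not Lucene's own structure.
class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Adds a document; whitespace splitting and lowercasing stand in for an analyzer.
    void addDocument(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Keyword look-up: a single map access instead of a scan over every document.
    SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addDocument(1, "Java search library");
        index.addDocument(2, "Search engines use an inverted index");
        index.addDocument(3, "Java mailing list");
        System.out.println(index.search("java"));   // prints [1, 3]
        System.out.println(index.search("search")); // prints [1, 2]
    }
}
```

The look-up cost depends on the number of distinct terms rather than on the number of documents, which is why this layout suits keyword search.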

Analysis

Analysis converts text data into the fundamental unit of searching, which is called a term. During analysis, the text goes through several operations: extracting the words, removing common words, ignoring punctuation, reducing words to their root form, changing words to lowercase, and so on. Analysis happens just before indexing and before query parsing. It converts text data into tokens, and these tokens are added as terms to the Lucene index.
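A few of these analysis steps can be sketched in plain Java. This is only an illustration of the idea, not one of Lucene's analyzers: the class AnalyzerSketch and its stop-word list are assumptions made up for this example, and reducing words to root form (stemming) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Conceptual sketch only: tokenizing, lowercasing and stop-word removal in plain Java.
class AnalyzerSketch {
    // Hypothetical stop-word list chosen for this example.
    private static final Set<String> STOP_WORDS =
            Set.of("a", "an", "and", "is", "of", "the", "to");

    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        // Splitting on anything that is not a letter or digit drops punctuation.
        for (String token : text.split("[^\\p{L}\\p{Nd}]+")) {
            String term = token.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Java Language: an Introduction to Lucene!"));
        // prints [java, language, introduction, lucene]
    }
}
```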

TermQuery

TermQuery is the most basic query type for searching an index. It is constructed from a single term. The term value is matched as given, so it is case-sensitive, although in practice this depends on how the documents were analyzed. Note that the terms passed for searching should be consistent with the terms produced by the analysis of the documents, because analyzers perform many operations on the original text before building an index. For example, if the analyzer lowercases tokens at index time, the search term must be lowercase as well. The following is an example.

[java]
// Search mails having the word "java" in the subject field
Searcher indexSearcher = new IndexSearcher(indexDirectory);
Term term = new Term("subject", "java");
Query termQuery = new TermQuery(term);
// Fetch the 10 highest-scoring matches
TopDocs topDocs = indexSearcher.search(termQuery, 10);
[/java]
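Why the search term has to agree with the analyzed form can be shown with a small plain-Java sketch. It does not use Lucene: TermMatchSketch, indexSubject and matches are hypothetical names, and the "analysis" here is just lowercasing whitespace-separated tokens.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Conceptual sketch only: exact term matching against terms produced by a lowercasing step.
class TermMatchSketch {
    // Terms stored for a hypothetical "subject" field after analysis.
    static Set<String> indexSubject(String subject) {
        Set<String> terms = new HashSet<>();
        for (String token : subject.split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Exact comparison with no analysis of the query term, in the spirit of a single-term query.
    static boolean matches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        Set<String> terms = indexSubject("Java User Group meeting");
        System.out.println(matches(terms, "java")); // prints true
        System.out.println(matches(terms, "Java")); // prints false
    }
}
```

The second look-up fails only because the stored term was lowercased while the query term was not, which is the inconsistency the paragraph above warns about.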

RangeQuery

All the terms are arranged lexicographically in the index. Lucene's RangeQuery lets users search for terms within a range. The range is specified with a starting term and an ending term, which may be either included or excluded. (RangeQuery belongs to the older Lucene 2.x API; later releases replace it with TermRangeQuery.)

[java]
/* RangeQuery example: search mails from 01/06/2009 to 06/06/2009, both inclusive */
Term begin = new Term("date", "20090601");
Term end = new Term("date", "20090606");
// The third argument makes both end points inclusive
Query query = new RangeQuery(begin, end, true);
[/java]
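The effect of lexicographic ordering on a range search can be sketched with a sorted set in plain Java. This is an illustration rather than Lucene's implementation; TermRangeSketch and termsInRange are made-up names. Dates written as yyyyMMdd sort lexicographically in the same order as chronologically, which is why the example above stores them in that form.

```java
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Conceptual sketch only: a range look-up over lexicographically sorted terms.
class TermRangeSketch {
    // Returns the terms between begin and end; 'inclusive' applies to both end points.
    static NavigableSet<String> termsInRange(NavigableSet<String> sortedTerms,
                                             String begin, String end, boolean inclusive) {
        return sortedTerms.subSet(begin, inclusive, end, inclusive);
    }

    public static void main(String[] args) {
        NavigableSet<String> dates = new TreeSet<>(
                List.of("20090531", "20090601", "20090603", "20090606", "20090607"));
        System.out.println(termsInRange(dates, "20090601", "20090606", true));
        // prints [20090601, 20090603, 20090606]
        System.out.println(termsInRange(dates, "20090601", "20090606", false));
        // prints [20090603]
    }
}
```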

Displaying search results

IndexSearcher returns an array of references to ranked search results, that is, to the documents that match a given query. You decide how many of the top search results are retrieved by specifying that number in the IndexSearcher's search method. Customized paging can be built on top of this. A custom Web application or desktop application can then display the search results. The primary classes involved in retrieving the search results are ScoreDoc and TopDocs.

[java]
/* First parameter is the query to be executed and
   second parameter indicates the number of search results to fetch */
TopDocs topDocs = indexSearcher.search(query, 20);
System.out.println("Total hits " + topDocs.totalHits);

// Get an array of references to matched documents
ScoreDoc[] scoreDocsArray = topDocs.scoreDocs;
for (ScoreDoc scoreDoc : scoreDocsArray) {
    // Retrieve the matched document and show relevant details
    Document doc = indexSearcher.doc(scoreDoc.doc);
    System.out.println("\nSender: " + doc.getField("sender").stringValue());
    System.out.println("Subject: " + doc.getField("subject").stringValue());
    System.out.println("Email file location: " + doc.getField("emailDoc").stringValue());
}
[/java]
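Paging on top of a ranked result list might look like the following plain-Java sketch. It does not call Lucene; PagingSketch and page are hypothetical names, and the list of integers stands in for the ranked hits a search returns. To serve a given page, the search itself would need to request at least (pageIndex + 1) * pageSize top results.

```java
import java.util.List;

// Conceptual sketch only: slicing one page out of an already ranked list of hits.
class PagingSketch {
    // Returns the hits for a zero-based page, or an empty list when the page is past the end.
    static <T> List<T> page(List<T> rankedHits, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long arithmetic avoids int overflow
        if (from >= rankedHits.size()) {
            return List.of();
        }
        int to = (int) Math.min(from + pageSize, rankedHits.size());
        return rankedHits.subList((int) from, to);
    }

    public static void main(String[] args) {
        List<Integer> hits = List.of(42, 7, 19, 3, 88); // stand-in document ids, best first
        System.out.println(page(hits, 0, 2)); // prints [42, 7]
        System.out.println(page(hits, 2, 2)); // prints [88]
        System.out.println(page(hits, 3, 2)); // prints []
    }
}
```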

Removing documents from an index

Applications often need to update the index with the latest data and remove older data. For example, in the case of Web search engines, the index has to be updated regularly as new Web pages are added, and pages that no longer exist have to be removed. Lucene provides the IndexReader class, which lets you perform these operations on an index. In later Lucene releases, deletions are performed through IndexWriter instead.

[java]
// Delete all the mails from the index received in May 2009.
// Note: this matches every document whose "month" field is "05".
IndexReader indexReader = IndexReader.open(indexDirectory);
indexReader.deleteDocuments(new Term("month", "05"));
// Close associated index files and save deletions to disk
indexReader.close();
[/java]
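Deleting by term amounts to removing every document whose field holds a given value. The plain-Java sketch below illustrates that behaviour with a toy document store; it is not Lucene code, and DeleteByTermSketch with its methods is invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual sketch only: deleting every stored document that matches a field/value pair.
class DeleteByTermSketch {
    // document id -> field name -> field value
    private final Map<Integer, Map<String, String>> docs = new HashMap<>();

    void add(int docId, Map<String, String> fields) {
        docs.put(docId, fields);
    }

    // Removes all documents whose 'field' equals 'value' and returns how many were removed.
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.values().removeIf(fields -> value.equals(fields.get(field)));
        return before - docs.size();
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        DeleteByTermSketch store = new DeleteByTermSketch();
        store.add(1, Map.of("month", "05", "subject", "java"));
        store.add(2, Map.of("month", "06", "subject", "lucene"));
        store.add(3, Map.of("month", "05", "subject", "search"));
        System.out.println(store.deleteDocuments("month", "05")); // prints 2
        System.out.println(store.size());                         // prints 1
    }
}
```

As in the snippet above, the match is on the field value alone, so a month-only term removes that month's mail for every year unless a year field is checked as well.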

Key Points

Lucene provides search capabilities for the Eclipse IDE, Nutch, and companies such as IBM and HP.
Lucene supports parsing of human-entered rich query expressions.
Lucene supports PhraseQuery, WildcardQuery, RangeQuery, FuzzyQuery, etc.
