Lucene Query Programming

Description

Searching is another of the core functionalities provided by Lucene. Its flow is similar to that of the indexing process. A basic search in Lucene is built on a small set of classes that serve as the foundation for all search-related operations. IndexSearcher is the most important, central component of the searching process. Lucene uses IndexSearcher to run a search, and IndexSearcher takes as input a Query object, which can be created by QueryParser. In this chapter, Splessons will discuss the different types of Query objects and the ways to create them programmatically. Creating a specific type of Query object gives control over the kind of search to be made. Following is the list of Query types.

TermQuery
TermRangeQuery
PrefixQuery
BooleanQuery
PhraseQuery
FuzzyQuery
MatchAllDocsQuery

org.apache.lucene.search.TermQuery class

Description

TermQuery is the most commonly used Query object and is the foundation of many complex queries that Lucene can make use of. Following are the various methods of the class.

Methods	Description
void extractTerms(Set<Term> terms)	Adds all terms occurring in this query to the terms set.
boolean equals(Object o)	Returns true iff o is equal to this.
Term getTerm()	Returns the term of this query.
int hashCode()	Returns a hash code value for this object.
String toString(String field)	Prints a user-readable version of this query.
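
To make the idea of matching on a single term concrete, the following is a minimal plain-Java sketch. It does not use Lucene: TinyIndex, addDocument and termQuery are hypothetical names invented for this illustration, and the whitespace tokenizer is an assumption. It only shows that a term query is an exact lookup of one term, with no analysis applied to the search term itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustration only: TinyIndex is a hypothetical class, not part of Lucene.
class TinyIndex {
    // term -> ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Indexes a document by lower-casing its text and splitting it on whitespace.
    void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Exact lookup of a single term; the argument is used as-is, without lower-casing.
    SortedSet<Integer> termQuery(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.addDocument(1, "Lucene query programming");
        index.addDocument(2, "Lucene indexing process");
        System.out.println(index.termQuery("lucene")); // both documents contain the indexed term
        System.out.println(index.termQuery("Lucene")); // empty: the indexed term is lower-case
    }
}
```

The second lookup finds nothing because the sketch indexes lower-cased tokens but looks up the search term unchanged, which is the usual pitfall when a term is built by hand instead of going through a parser and analyzer.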

org.apache.lucene.search.TermRangeQuery class

Description

TermRangeQuery is used when a range of textual terms is to be searched. Following are the various methods of the class.

Methods	Description
Collator getCollator()	Returns the collator used to determine range inclusion, if any.
String getField()	Returns the field name for this query.
String getUpperTerm()	Returns the upper value of this range query.
boolean includesUpper()	Returns true if the upper endpoint is inclusive.
String toString(String field)	Prints a user-readable version of this query.
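
The effect of the upper and lower endpoints and their inclusive flags can be sketched in plain Java. This is an illustration only, not Lucene code: TermRange and inRange are hypothetical names, and the sketch assumes terms are compared with String.compareTo (that is, no Collator is in use).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Illustration only: TermRange is a hypothetical helper, not part of Lucene.
class TermRange {
    // Returns the terms lying between lower and upper in String.compareTo order.
    static List<String> inRange(Collection<String> terms, String lower, String upper,
                                boolean includeLower, boolean includeUpper) {
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            int lo = term.compareTo(lower);
            int hi = term.compareTo(upper);
            boolean aboveLower = includeLower ? lo >= 0 : lo > 0;
            boolean belowUpper = includeUpper ? hi <= 0 : hi < 0;
            if (aboveLower && belowUpper) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
                "record1.txt", "record10.txt", "record2.txt", "record5.txt");
        // Both endpoints inclusive.
        System.out.println(inRange(terms, "record2.txt", "record5.txt", true, true));
        // Upper endpoint exclusive.
        System.out.println(inRange(terms, "record2.txt", "record5.txt", true, false));
    }
}
```

Note that the comparison is textual, not numeric: "record10.txt" sorts before "record2.txt", so it falls outside the range in both calls.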

org.apache.lucene.search.PrefixQuery class

Description

PrefixQuery is used to match documents that contain terms starting with a specified string. Following are the various methods of the class.

Methods	Description
protected FilteredTermEnum getEnum(IndexReader reader)	Constructs the enumeration to be used, expanding the pattern term.
Term getPrefix()	Returns the prefix of this query.
int hashCode()	Returns a hash code value for this object.
String toString(String field)	Prints a user-readable version of this query.
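
Prefix matching over a sorted set of terms can be sketched in plain Java as follows. This is not Lucene's implementation: PrefixMatch and withPrefix are hypothetical names, and the sketch simply relies on the fact that all terms sharing a prefix sit next to each other in sorted order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustration only: PrefixMatch is a hypothetical helper, not part of Lucene.
class PrefixMatch {
    // Collects every term that starts with the given prefix.
    static List<String> withPrefix(SortedSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet(prefix) starts at the first term >= prefix; matching terms are contiguous from there.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term with this prefix
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("index", "indexing", "info", "query"));
        System.out.println(withPrefix(terms, "ind"));
    }
}
```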

org.apache.lucene.search.BooleanQuery class

Description

BooleanQuery is used to search for documents that match a combination of multiple queries joined with AND, OR, or NOT operators. Following are the various methods of the class.

Methods	Description
void add(BooleanClause clause)	Adds a clause to the Boolean query.
Object clone()	Returns a clone of this query.
int hashCode()	Returns a hash code value for this object.
String toString(String field)	Prints a user-readable version of this query.
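
The AND, OR and NOT combinations can be pictured as set operations on the document ids matched by each sub-query. The plain-Java sketch below illustrates only that idea; BooleanCombine and its methods are hypothetical names, and scoring is ignored entirely.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: BooleanCombine is a hypothetical helper, not part of Lucene.
class BooleanCombine {
    // AND: documents matched by both sub-queries.
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // OR: documents matched by at least one sub-query.
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // NOT: documents matched by a, excluding those matched by b.
    static Set<Integer> andNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> first = new TreeSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> second = new TreeSet<>(Arrays.asList(2, 3, 4));
        System.out.println(and(first, second));
        System.out.println(or(first, second));
        System.out.println(andNot(first, second));
    }
}
```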

org.apache.lucene.search.PhraseQuery class

Description

PhraseQuery is used to search for documents that contain a particular sequence of terms. Following are the various methods of the class.

Methods	Description
void add(Term term)	Adds a term to the end of the query phrase.
Weight createWeight(Searcher searcher)	Constructs an appropriate Weight implementation for this query.
boolean equals(Object o)	Returns true iff o is equal to this.
int[] getPositions()	Returns the relative positions of terms in this phrase.
int getSlop()	Returns the slop.
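
What "a particular sequence of terms" means can be sketched in plain Java. The example covers only the exact case, where the terms must appear next to each other and in order; slop is not modelled. PhraseMatch and containsPhrase are hypothetical names, not Lucene API.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: PhraseMatch is a hypothetical helper, not part of Lucene.
class PhraseMatch {
    // Returns true if the phrase terms occur consecutively and in order in the document tokens.
    static boolean containsPhrase(List<String> docTokens, List<String> phrase) {
        if (phrase.isEmpty()) {
            return false; // an empty phrase is treated as matching nothing in this sketch
        }
        for (int start = 0; start + phrase.size() <= docTokens.size(); start++) {
            if (docTokens.subList(start, start + phrase.size()).equals(phrase)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("lucene", "query", "programming", "guide");
        System.out.println(containsPhrase(doc, Arrays.asList("query", "programming")));
        System.out.println(containsPhrase(doc, Arrays.asList("programming", "query")));
    }
}
```

Both terms occur in the document in each call, but only the first call has them in the indexed order, which is the difference between a phrase match and an AND of two term matches.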

org.apache.lucene.search.FuzzyQuery class

Description

FuzzyQuery is used to search for documents using a fuzzy implementation, that is, an approximate search based on the edit distance algorithm. Following are the various methods of the class.

Methods	Description
float getMinSimilarity()	Returns the minimum similarity that is required for this query to match.
int getPrefixLength()	Returns the non-fuzzy prefix length.
int hashCode()	Returns a hash code value for this object.
Term getTerm()	Returns the pattern term.
String toString(String field)	Prints a query to a string, with field assumed to be the default field and omitted.
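
The edit distance mentioned above counts the single-character insertions, deletions and substitutions needed to turn one string into another. Below is a minimal plain-Java sketch of the classic Levenshtein computation. EditDistance is a hypothetical class name, and the way Lucene converts a distance into its minimum-similarity score is not reproduced here.

```java
// Illustration only: EditDistance is a hypothetical helper, not part of Lucene.
class EditDistance {
    // Levenshtein distance computed row by row with two rolling arrays.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("lucene", "lucine")); // one substitution
        System.out.println(levenshtein("kitten", "sitting"));
    }
}
```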

org.apache.lucene.search.MatchAllDocsQuery class

Description

MatchAllDocsQuery, as the name suggests, matches all the documents. Following are the various methods of the class.

Methods	Description
Weight createWeight(Searcher searcher)	Constructs an appropriate Weight implementation for this query.
void extractTerms(Set terms)	Adds all terms occurring in this query to the terms set.
int hashCode()	Returns a hash code value for this object.

Following is an example for the MatchAllDocsQuery class.

LuceneConstants.java

[java]
package com.splessons;

public class LuceneConstants {
   public static final String CONTENTS = "contents";
   public static final String FILE_NAME = "filename";
   public static final String FILE_PATH = "filepath";
   // maximum number of hits returned by a search
   public static final int MAX_SEARCH = 10;
}
[/java]

Searcher.java

[java]
package com.splessons;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Searcher {

   IndexSearcher indexSearcher;
   QueryParser queryParser;
   Query query;

   public Searcher(String indexDirectoryPath) throws IOException {
      Directory indexDirectory = FSDirectory.open(new File(indexDirectoryPath));
      indexSearcher = new IndexSearcher(indexDirectory);
      queryParser = new QueryParser(Version.LUCENE_36,
         LuceneConstants.CONTENTS,
         new StandardAnalyzer(Version.LUCENE_36));
   }

   // parse a query string and search with the resulting Query
   public TopDocs search(String searchQuery) throws IOException, ParseException {
      query = queryParser.parse(searchQuery);
      return indexSearcher.search(query, LuceneConstants.MAX_SEARCH);
   }

   // search with a Query object that was created programmatically
   public TopDocs search(Query query) throws IOException, ParseException {
      return indexSearcher.search(query, LuceneConstants.MAX_SEARCH);
   }

   public Document getDocument(ScoreDoc scoreDoc)
      throws CorruptIndexException, IOException {
      return indexSearcher.doc(scoreDoc.doc);
   }

   public void close() throws IOException {
      indexSearcher.close();
   }
}
[/java]

The Searcher.java class is used to read the indexes made on raw data and to search that data using the Lucene library.

LuceneTester.java

[java]
package com.splessons;

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

public class LuceneTester {

   String indexDir = "E:\\Lucene\\Index";
   String dataDir = "E:\\Lucene\\Data";
   Searcher searcher;

   public static void main(String[] args) {
      LuceneTester tester;
      try {
         tester = new LuceneTester();
         tester.searchUsingMatchAllDocsQuery("");
      } catch (IOException e) {
         e.printStackTrace();
      } catch (ParseException e) {
         e.printStackTrace();
      }
   }

   private void searchUsingMatchAllDocsQuery(String searchQuery)
      throws IOException, ParseException {
      searcher = new Searcher(indexDir);
      long startTime = System.currentTimeMillis();
      // create the MatchAllDocsQuery object
      Query query = new MatchAllDocsQuery(searchQuery);
      // do the search
      TopDocs hits = searcher.search(query);
      long endTime = System.currentTimeMillis();

      System.out.println(hits.totalHits +
         " documents found. Time :" + (endTime - startTime) + "ms");
      for (ScoreDoc scoreDoc : hits.scoreDocs) {
         Document doc = searcher.getDocument(scoreDoc);
         System.out.print("Score: " + scoreDoc.score + " ");
         System.out.println("File: " + doc.get(LuceneConstants.FILE_PATH));
      }
      searcher.close();
   }
}
[/java]

The LuceneTester.java class is used to test the searching capability of the Lucene library. An index directory should be created at E:\Lucene\Index. After running the indexing program from the chapter Lucene - Indexing Process, you can see the list of index files created in that folder. Now compile and run the code; the result will be as follows.

[java]
12 documents found. Time :9ms
Score: 1.0 File: E:\Lucene\Data\record1.txt
Score: 1.0 File: E:\Lucene\Data\record10.txt
Score: 1.0 File: E:\Lucene\Data\record2.txt
Score: 1.0 File: E:\Lucene\Data\record3.txt
Score: 1.0 File: E:\Lucene\Data\record4.txt
Score: 1.0 File: E:\Lucene\Data\record5.txt
Score: 1.0 File: E:\Lucene\Data\record6.txt
Score: 1.0 File: E:\Lucene\Data\record7.txt
Score: 1.0 File: E:\Lucene\Data\record8.txt
Score: 1.0 File: E:\Lucene\Data\record9.txt
[/java]

Although 12 documents match, only ten files are listed, because Searcher limits the returned hits to LuceneConstants.MAX_SEARCH (10).

Key Points

WildcardQuery is used to search for documents using wildcards like "*".
MatchAllDocsQuery is used to match all the available documents.
TokenStream is the output of the analysis process and consists of a series of tokens.
