Lucene - Analyzer

The Analyzer class is responsible for analyzing a document and extracting the tokens (words) from the text that is to be indexed. Without analysis, the IndexWriter cannot create an index.

Class Declaration

Following is the declaration for the org.apache.lucene.analysis.Analyzer class −

public abstract class Analyzer
   extends Object
      implements Closeable

Class Constructors

The following table shows the class constructor −

S.No.	Constructor & Description
1	protected Analyzer()

Class Methods

The following table shows the different class methods −

S.No.	Method & Description
1	void close() Frees persistent resources used by the Analyzer.
2	int getOffsetGap(Fieldable field) This is similar to getPositionIncrementGap(java.lang.String), except for Token offsets.
3	int getPositionIncrementGap(String fieldName) This is invoked before indexing a Fieldable instance if terms have already been added to that field.
4	protected Object getPreviousTokenStream() Used by Analyzers that implement reusableTokenStream to retrieve previously saved TokenStreams for re-use by the same thread.
5	TokenStream reusableTokenStream(String fieldName, Reader reader) Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method.
6	protected void setPreviousTokenStream(Object obj) Used by Analyzers that implement reusableTokenStream to save a TokenStream for later re-use by the same thread.
7	abstract TokenStream tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader.
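
The effect of getPositionIncrementGap is easiest to see with numbers. The following is a minimal sketch in plain Java, not Lucene's indexing code: PositionGapSketch and positions are hypothetical names, and the model assumes every token advances the position by exactly one. It applies the gap only when terms have already been added to the field, as described in the table above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model for illustration only; this is not a Lucene class.
class PositionGapSketch {

    // Assigns a position to every token of every value of a single field.
    // Each token advances the position by 1 (an assumption of this sketch).
    // Before a new value is processed, the gap is added only if terms have
    // already been added for this field.
    static List<Integer> positions(String[][] values, int gap) {
        List<Integer> result = new ArrayList<>();
        int next = 0;
        for (String[] value : values) {
            if (!result.isEmpty()) {
                next += gap;
            }
            for (String token : value) {
                result.add(next);
                next++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two values stored under the same field name.
        String[][] values = { {"quick", "fox"}, {"lazy", "dog"} };
        System.out.println(positions(values, 0));   // prints [0, 1, 2, 3]
        System.out.println(positions(values, 100)); // prints [0, 1, 102, 103]
    }
}
```

With a gap of 0 the two values behave like one continuous text, so a phrase query for "fox lazy" could match across the boundary. A larger gap keeps the values apart.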

Methods Inherited

This class inherits methods from the following classes −

java.lang.Object

As an example of a concrete subclass, WhitespaceAnalyzer splits the text in a document on whitespace.
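
Splitting text on whitespace can be sketched in plain Java as follows. This is only a conceptual stand-in: WhitespaceSplitSketch and tokenize are hypothetical names and are not part of Lucene, and a real analyzer returns a TokenStream rather than a list of strings. Like tokenStream, the sketch reads its input from a Reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; this is not a Lucene class.
class WhitespaceSplitSketch {

    // Reads characters from the Reader and emits one token for each run of
    // non-whitespace characters.
    static List<String> tokenize(Reader reader) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    // Whitespace ends the current token, if there is one.
                    if (current.length() > 0) {
                        tokens.add(current.toString());
                        current.setLength(0);
                    }
                } else {
                    current.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Flush the last token when the text does not end in whitespace.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        Reader reader = new StringReader("Lucene is  a search\tlibrary");
        System.out.println(tokenize(reader)); // prints [Lucene, is, a, search, library]
    }
}
```

Note that nothing else is done to the tokens: case and punctuation are left as they appear in the text.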
