Lucene - Analyzer



Introduction

The Analyzer class is responsible for analyzing a document and extracting the tokens (words) from the text that is to be indexed. Without this analysis, IndexWriter cannot create an index.
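To make the idea of "getting tokens from text" concrete, the following is a minimal plain-Java sketch of what analysis amounts to. It does not use any Lucene class: the class name SimpleAnalysisSketch and the method tokenize are hypothetical, and the rule used here (split on anything that is not a letter or digit, then lowercase) is only an assumption chosen for illustration. Real Lucene analyzers produce a TokenStream and may apply different rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: this class is NOT part of Lucene.
final class SimpleAnalysisSketch {

    // Splits the text on every character that is not a letter or digit
    // and lowercases each resulting token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A separator ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            // Flush the last token when the text does not end with a separator.
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Lucene is a Java search library."));
    }
}
```

Running main prints [lucene, is, a, java, search, library]. These are the kind of terms an index would store for that sentence under the simple rule assumed above.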

Class declaration

Following is the declaration for the org.apache.lucene.analysis.Analyzer class:

public abstract class Analyzer
   extends Object
      implements Closeable

Class constructors

S.N.  Constructor & Description

1     protected Analyzer()

Class methods

S.N.  Method & Description

1     void close()
      Frees persistent resources used by this Analyzer.

2     int getOffsetGap(Fieldable field)
      Just like getPositionIncrementGap(String), except that it applies to token offsets instead of positions.

3     int getPositionIncrementGap(String fieldName)
      Invoked before indexing a Fieldable instance if terms have already been added to that field.

4     protected Object getPreviousTokenStream()
      Used by Analyzers that implement reusableTokenStream to retrieve previously saved TokenStreams for re-use by the same thread.

5     TokenStream reusableTokenStream(String fieldName, Reader reader)
      Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method.

6     protected void setPreviousTokenStream(Object obj)
      Used by Analyzers that implement reusableTokenStream to save a TokenStream for later re-use by the same thread.

7     abstract TokenStream tokenStream(String fieldName, Reader reader)
      Creates a TokenStream which tokenizes all the text in the provided Reader.
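
The role of getPositionIncrementGap is easier to see with numbers. The plain-Java sketch below is a simplified model, not Lucene code: the class PositionGapSketch and the method positions are hypothetical names, and the model assumes every token advances the position by exactly one. It shows how a gap could be added between two instances of the same field so that a phrase does not match across the boundary between them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: this class is NOT part of Lucene.
final class PositionGapSketch {

    // Assigns a position to every token of a multi-valued field.
    // Each inner list holds the tokens of one field instance; `gap` stands in
    // for the value an analyzer would return from getPositionIncrementGap.
    static List<Integer> positions(List<List<String>> fieldValues, int gap) {
        List<Integer> result = new ArrayList<>();
        int position = -1;          // so that the very first token lands on 0
        boolean termsAdded = false; // true once the field already has terms
        for (List<String> tokens : fieldValues) {
            if (termsAdded) {
                // The gap is applied only before a later instance of the field.
                position += gap;
            }
            for (int i = 0; i < tokens.size(); i++) {
                position += 1;      // assumption: each token advances by one
                result.add(position);
                termsAdded = true;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> authors = Arrays.asList(
                Arrays.asList("john", "smith"),
                Arrays.asList("jane", "doe"));
        System.out.println(positions(authors, 0));
        System.out.println(positions(authors, 100));
    }
}
```

With a gap of 0 the four tokens receive positions 0, 1, 2, 3, so "smith" and "jane" are adjacent. With a gap of 100 they receive 0, 1, 102, 103, which keeps the two values well apart. getOffsetGap plays the same role for character offsets.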

Methods inherited

This class inherits methods from the following classes:

  • java.lang.Object

