
Lucene - Token
A Token represents an occurrence of a term (a word or other unit of text) in a document, along with metadata about it: its start offset, end offset, token type, and position increment.
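
To make the metadata concrete, here is a minimal sketch of a whitespace tokenizer that records offsets for each word. It does not use Lucene: `SketchToken` and `TokenMetadataSketch` are hypothetical stand-ins written for illustration, and the fixed position increment of 1 and the type string "word" are assumptions chosen to mirror Lucene's usual defaults.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT org.apache.lucene.analysis.Token.
class SketchToken {
    final String text;
    final int startOffset;        // index of the first character in the source text
    final int endOffset;          // one greater than the index of the last character
    final int positionIncrement;  // distance from the previous token's position
    final String type;            // lexical type of the token

    SketchToken(String text, int startOffset, int endOffset, int positionIncrement, String type) {
        this.text = text;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        this.positionIncrement = positionIncrement;
        this.type = type;
    }
}

class TokenMetadataSketch {
    // Splits the source on whitespace and records the offsets of every word.
    static List<SketchToken> tokenize(String source) {
        List<SketchToken> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            // Skip whitespace between words.
            while (i < source.length() && Character.isWhitespace(source.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume the word itself.
            while (i < source.length() && !Character.isWhitespace(source.charAt(i))) {
                i++;
            }
            if (i > start) {
                // Every word directly follows the previous one, so the increment is 1.
                tokens.add(new SketchToken(source.substring(start, i), start, i, 1, "word"));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (SketchToken t : tokenize("quick brown fox")) {
            System.out.println(t.text + " start=" + t.startOffset + " end=" + t.endOffset
                    + " posInc=" + t.positionIncrement + " type=" + t.type);
        }
    }
}
```

Because the end offset is exclusive, `source.substring(startOffset, endOffset)` recovers the token text; for "quick brown fox" the word "brown" has a start offset of 6 and an end offset of 11.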
Class Declaration
Following is the declaration for the org.apache.lucene.analysis.Token class:
public class Token extends TermAttributeImpl implements TypeAttribute, PositionIncrementAttribute, FlagsAttribute, OffsetAttribute, PayloadAttribute, PositionLengthAttribute
Fields
Following are the fields for the org.apache.lucene.analysis.Token class −
static AttributeSource.AttributeFactory TOKEN_ATTRIBUTE_FACTORY − Convenience factory that returns Token as the implementation for the basic attributes and the default implementation (with "Impl" appended) for all other attributes.
Class Constructors
The following table shows the different class constructors −
S.No. | Constructor & Description |
---|---|
1 | Token() Constructs a Token with null text. |
2 | Token(char[] startTermBuffer, int termBufferOffset, int termBufferLength, int start, int end) Constructs a Token with the given term buffer (offset & length) and start and end offsets. |
3 | Token(int start, int end) Constructs a Token with null text and start and end offsets. |
4 | Token(int start, int end, int flags) Constructs a Token with null text and start and end offsets, plus flags. |
5 | Token(int start, int end, String typ) Constructs a Token with null text and start and end offsets, plus the Token type. |
6 | Token(String text, int start, int end) Constructs a Token with the given term text and start and end offsets. |
7 | Token(String text, int start, int end, int flags) Constructs a Token with the given text, start and end offsets, and flags. |
8 | Token(String text, int start, int end, String typ) Constructs a Token with the given text, start and end offsets, and type. |
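
The overloads in the table differ only in which details the caller supplies. The sketch below shows one way such overloads can delegate to a single full constructor. `CtorSketchToken` is a hypothetical stand-in, not Lucene's class; it assumes that "null text" can be modelled as an empty string, that the default type is "word", and that the default flags value is 0. The `char[]` overload is left out for brevity.

```java
// Hypothetical stand-in showing how overloaded constructors can share defaults.
// This is NOT org.apache.lucene.analysis.Token.
class CtorSketchToken {
    static final String DEFAULT_TYPE = "word"; // assumed default lexical type

    final String text;
    final int startOffset;
    final int endOffset;
    final int flags;
    final String type;

    // Full constructor that every overload delegates to.
    private CtorSketchToken(String text, int start, int end, int flags, String type) {
        this.text = text;
        this.startOffset = start;
        this.endOffset = end;
        this.flags = flags;
        this.type = type;
    }

    // No text, no offsets.
    CtorSketchToken() {
        this("", 0, 0, 0, DEFAULT_TYPE);
    }

    // Offsets only.
    CtorSketchToken(int start, int end) {
        this("", start, end, 0, DEFAULT_TYPE);
    }

    // Offsets plus flags.
    CtorSketchToken(int start, int end, int flags) {
        this("", start, end, flags, DEFAULT_TYPE);
    }

    // Offsets plus type.
    CtorSketchToken(int start, int end, String typ) {
        this("", start, end, 0, typ);
    }

    // Text plus offsets.
    CtorSketchToken(String text, int start, int end) {
        this(text, start, end, 0, DEFAULT_TYPE);
    }

    // Text, offsets, and flags.
    CtorSketchToken(String text, int start, int end, int flags) {
        this(text, start, end, flags, DEFAULT_TYPE);
    }

    // Text, offsets, and type.
    CtorSketchToken(String text, int start, int end, String typ) {
        this(text, start, end, 0, typ);
    }
}
```

In this sketch, `new CtorSketchToken("fox", 12, 15)` yields a token of type "word" with flags 0, while `new CtorSketchToken("fox", 12, 15, 4)` keeps the default type and sets only the flags.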
Class Methods
The following table shows the different class methods −
S.No. | Method & Description |
---|---|
1 | void clear() Resets the term text, payload, flags, positionIncrement, startOffset, endOffset, and token type to their defaults. |
2 | Object clone() This is a shallow clone. |
3 | Token clone(char[] newTermBuffer, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset) Makes a clone, but replaces the term buffer & start/end offset in the process. |
4 | void copyTo(AttributeImpl target) Copies the values from this Attribute into the passed-in target attribute. |
5 | int endOffset() Returns the Token's ending offset; one greater than the position of the last character corresponding to this token in the source text. |
6 | boolean equals(Object obj) Compares this Token with the given object for equality. |
7 | int getFlags() Gets the bitset for any bits that have been set. |
8 | Payload getPayload() Returns this Token's payload. |
9 | int getPositionIncrement() Returns the position increment of this Token. |
10 | int getPositionLength() Gets the position length. |
11 | int hashCode() Returns a hash code for this Token. |
12 | void reflectWith(AttributeReflector reflector) This method is for introspection of attributes; it should simply add the key/value pairs this attribute holds to the given AttributeReflector. |
13 | Token reinit(char[] newTermBuffer, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset) Shorthand for calling clear(), CharTermAttributeImpl.copyBuffer(char[], int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String) with Token.DEFAULT_TYPE. |
14 | Token reinit(char[] newTermBuffer, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset, String newType) Shorthand for calling clear(), CharTermAttributeImpl.copyBuffer(char[], int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String). |
15 | Token reinit(String newTerm, int newStartOffset, int newEndOffset) Shorthand for calling clear(), CharTermAttributeImpl.append(CharSequence), setStartOffset(int), setEndOffset(int), and setType(java.lang.String) with Token.DEFAULT_TYPE. |
16 | Token reinit(String newTerm, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset) Shorthand for calling clear(), CharTermAttributeImpl.append(CharSequence, int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String) with Token.DEFAULT_TYPE. |
17 | Token reinit(String newTerm, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset, String newType) Shorthand for calling clear(), CharTermAttributeImpl.append(CharSequence, int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String). |
18 | Token reinit(String newTerm, int newStartOffset, int newEndOffset, String newType) Shorthand for calling clear(), CharTermAttributeImpl.append(CharSequence), setStartOffset(int), setEndOffset(int), and setType(java.lang.String). |
19 | void reinit(Token prototype) Copies the prototype token's fields into this one. |
20 | void reinit(Token prototype, char[] newTermBuffer, int offset, int length) Copies the prototype token's fields into this one, with a different term. |
21 | void reinit(Token prototype, String newTerm) Copies the prototype token's fields into this one, with a different term. |
22 | void setEndOffset(int offset) Sets the ending offset. |
23 | void setFlags(int flags) Sets the flag bits. |
24 | void setOffset(int startOffset, int endOffset) Sets the starting and ending offset. |
25 | void setPayload(Payload payload) Sets this Token's payload. |
26 | void setPositionIncrement(int positionIncrement) Sets the position increment. |
27 | void setPositionLength(int positionLength) Sets the position length. |
28 | void setStartOffset(int offset) Sets the starting offset. |
29 | void setType(String type) Sets the lexical type. |
30 | int startOffset() Returns this Token's starting offset, the position of the first character corresponding to this token in the source text. |
31 | String type() Returns this Token's lexical type. |
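
The `reinit` methods are described above as shorthand for `clear()` followed by a series of setters, which lets one Token instance be reused for successive terms. The following sketch shows that pattern on a hypothetical stand-in class, `ReusableSketchToken`; it is not Lucene's implementation, and the default values (position increment 1, flags 0, type "word") are assumptions.

```java
// Hypothetical mutable stand-in showing the clear-then-set reuse pattern.
// This is NOT org.apache.lucene.analysis.Token.
class ReusableSketchToken {
    static final String DEFAULT_TYPE = "word"; // assumed default lexical type

    private final StringBuilder term = new StringBuilder();
    private int startOffset;
    private int endOffset;
    private int flags;
    private int positionIncrement = 1;
    private String type = DEFAULT_TYPE;

    // Resets every field to its default value.
    void clear() {
        term.setLength(0);
        startOffset = 0;
        endOffset = 0;
        flags = 0;
        positionIncrement = 1;
        type = DEFAULT_TYPE;
    }

    // Shorthand for clear() followed by setting the term, offsets, and type.
    ReusableSketchToken reinit(String newTerm, int newStartOffset, int newEndOffset, String newType) {
        clear();
        term.append(newTerm);
        startOffset = newStartOffset;
        endOffset = newEndOffset;
        type = newType;
        return this; // the same instance is returned so it can be reused
    }

    // Same as above, but with the default type.
    ReusableSketchToken reinit(String newTerm, int newStartOffset, int newEndOffset) {
        return reinit(newTerm, newStartOffset, newEndOffset, DEFAULT_TYPE);
    }

    void setFlags(int flags) { this.flags = flags; }
    void setPositionIncrement(int positionIncrement) { this.positionIncrement = positionIncrement; }

    String term() { return term.toString(); }
    int startOffset() { return startOffset; }
    int endOffset() { return endOffset; }
    int getFlags() { return flags; }
    int getPositionIncrement() { return positionIncrement; }
    String type() { return type; }
}
```

Because `reinit` begins with `clear()`, values set on the previous term, such as flags or a custom position increment, do not leak into the next one.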
Methods Inherited
This class inherits methods from the following classes −
- org.apache.lucene.analysis.tokenattributes.TermAttributeImpl
- org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl
- org.apache.lucene.util.AttributeImpl
- java.lang.Object