Lucene - IndexWriter



This class is the core component that creates and updates indexes during the indexing process.

Class declaration

Following is the declaration for org.apache.lucene.index.IndexWriter class −

public class IndexWriter
   extends Object
      implements Closeable, TwoPhaseCommit
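
The two interfaces in the declaration hint at how the class is meant to be used: Closeable means a writer should be closed when indexing is done, and TwoPhaseCommit means a commit can be split into prepareCommit() followed by commit(). The following is a minimal sketch, assuming Lucene 3.6 on the classpath (the version that appears to match this API listing). The class name, the in-memory RAMDirectory, and the field name are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class TwoPhaseCommitSketch {

   // Adds one document, commits it in two phases, and returns how many
   // documents a freshly opened reader can see afterwards.
   static int indexAndCommit() throws IOException {
      Directory dir = new RAMDirectory();   // assumption: in-memory index for the demo
      IndexWriter writer = new IndexWriter(dir,
         new IndexWriterConfig(Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36)));
      try {
         Document doc = new Document();
         doc.add(new Field("title", "hello lucene", Field.Store.YES, Field.Index.ANALYZED));
         writer.addDocument(doc);

         writer.prepareCommit();   // phase 1: flush and sync the pending changes
         // Other transactional resources could be prepared here; if one of
         // them failed, writer.rollback() would discard the prepared changes.
         writer.commit();          // phase 2: finish the commit so readers can see it
      } finally {
         writer.close();           // Closeable: releases the write lock and open files
      }

      IndexReader reader = IndexReader.open(dir);
      try {
         return reader.numDocs();
      } finally {
         reader.close();
      }
   }

   public static void main(String[] args) throws IOException {
      System.out.println("Documents visible after commit: " + indexAndCommit());
   }
}
```

Calling commit() alone performs both phases; the explicit prepareCommit() step is only useful when the index commit has to be coordinated with another resource.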

Fields

Following are the fields for the org.apache.lucene.index.IndexWriter class −

  • static int DEFAULT_MAX_BUFFERED_DELETE_TERMS − Deprecated. Use IndexWriterConfig.DEFAULT_MAX_BUFFERED_DELETE_TERMS instead.

  • static int DEFAULT_MAX_BUFFERED_DOCS − Deprecated. Use IndexWriterConfig.DEFAULT_MAX_BUFFERED_DOCS instead.

  • static int DEFAULT_MAX_FIELD_LENGTH − Deprecated. See IndexWriterConfig.

  • static double DEFAULT_RAM_BUFFER_SIZE_MB − Deprecated. Use IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB instead.

  • static int DEFAULT_TERM_INDEX_INTERVAL − Deprecated. Use IndexWriterConfig.DEFAULT_TERM_INDEX_INTERVAL instead.

  • static int DISABLE_AUTO_FLUSH − Deprecated. Use IndexWriterConfig.DISABLE_AUTO_FLUSH instead.

  • static int MAX_TERM_LENGTH − Absolute maximum length for a term.

  • static String WRITE_LOCK_NAME − Name of the write lock in the index.

  • static long WRITE_LOCK_TIMEOUT − Deprecated. Use IndexWriterConfig.WRITE_LOCK_TIMEOUT instead.
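
Since most of these constants have moved to IndexWriterConfig, the sketch below reads the defaults from there and sets explicit values through the config setters. It assumes Lucene 3.6; the class name and the values 48.0, 10000 and 2000 are arbitrary illustrations, not recommendations.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;

public class WriterDefaultsSketch {

   // Returns a config whose buffering limits and lock timeout are set explicitly.
   static IndexWriterConfig tunedConfig() {
      IndexWriterConfig conf = new IndexWriterConfig(
         Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36));
      conf.setRAMBufferSizeMB(48.0);     // flush once buffered documents use about 48 MB
      conf.setMaxBufferedDocs(10000);    // or once 10000 documents are buffered
      conf.setWriteLockTimeout(2000L);   // wait up to 2000 ms for the write lock
      return conf;
   }

   public static void main(String[] args) {
      // Defaults now live on IndexWriterConfig rather than on IndexWriter.
      System.out.println("Default RAM buffer (MB): " + IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB);
      System.out.println("Default max buffered docs: " + IndexWriterConfig.DEFAULT_MAX_BUFFERED_DOCS);
      System.out.println("Auto-flush disabled marker: " + IndexWriterConfig.DISABLE_AUTO_FLUSH);

      // These two constants are still defined on IndexWriter itself.
      System.out.println("Max term length: " + IndexWriter.MAX_TERM_LENGTH);
      System.out.println("Write lock file: " + IndexWriter.WRITE_LOCK_NAME);

      IndexWriterConfig conf = tunedConfig();
      System.out.println("Configured RAM buffer (MB): " + conf.getRAMBufferSizeMB());
   }
}
```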

Class Constructors

Following table shows the class constructors for IndexWriter −

S.No. Constructor & Description
1

IndexWriter(Directory d, Analyzer a, boolean create, IndexDeletionPolicy deletionPolicy, IndexWriter.MaxFieldLength mfl)

Deprecated. Use IndexWriter(Directory, IndexWriterConfig) instead.

2

IndexWriter(Directory d, Analyzer a, boolean create, IndexWriter.MaxFieldLength mfl)

Deprecated. Use IndexWriter(Directory, IndexWriterConfig) instead.

3

IndexWriter(Directory d, Analyzer a, IndexDeletionPolicy deletionPolicy, IndexWriter.MaxFieldLength mfl)

Deprecated. Use IndexWriter(Directory, IndexWriterConfig) instead.

4

IndexWriter(Directory d, Analyzer a, IndexDeletionPolicy deletionPolicy, IndexWriter.MaxFieldLength mfl, IndexCommit commit)

Deprecated. Use IndexWriter(Directory, IndexWriterConfig) instead.

5

IndexWriter(Directory d, Analyzer a, IndexWriter.MaxFieldLength mfl)

Deprecated. Use IndexWriter(Directory, IndexWriterConfig) instead.

6

IndexWriter(Directory d, IndexWriterConfig conf)

Constructs a new IndexWriter per the settings given in conf.
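
As a sketch of the one non-deprecated constructor, the example below builds an IndexWriterConfig and passes it to IndexWriter(Directory, IndexWriterConfig). It assumes Lucene 3.6; the class name, the RAMDirectory and the choice of StandardAnalyzer are illustrative. The open mode on the config plays the role that the boolean create argument played in the deprecated constructors.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class OpenWriterSketch {

   // Opens a writer that always starts a new, empty index in the given directory.
   static IndexWriter openFresh(Directory dir) throws IOException {
      IndexWriterConfig conf = new IndexWriterConfig(
         Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36));
      // CREATE overwrites any existing index; APPEND and CREATE_OR_APPEND
      // are the other available modes.
      conf.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
      return new IndexWriter(dir, conf);
   }

   public static void main(String[] args) throws IOException {
      Directory dir = new RAMDirectory();   // assumption: in-memory index for the demo
      IndexWriter writer = openFresh(dir);
      try {
         System.out.println("Documents in the new index: " + writer.numDocs());
      } finally {
         writer.close();
      }
   }
}
```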

Class Methods

S.No. Method & Description
1

void addDocument(Document doc)

Adds a document to this index.

2

void addDocument(Document doc, Analyzer analyzer)

Adds a document to this index, using the provided analyzer instead of the value of getAnalyzer().

3

void addDocuments(Collection<Document> docs)

Atomically adds a block of documents with sequentially-assigned document IDs, such that an external reader will see all or none of the documents.

4

void addDocuments(Collection<Document> docs, Analyzer analyzer)

Atomically adds a block of documents, analyzed using the provided analyzer, with sequentially assigned document IDs, such that an external reader will see all or none of the documents.

5

void addIndexes(Directory... dirs)

Adds all segments from an array of indexes into this index.

6

void addIndexes(IndexReader... readers)

Merges the provided indexes into this index.

7

void addIndexesNoOptimize(Directory... dirs)

Deprecated. Use addIndexes(Directory...) instead.

8

void close()

Commits all changes to an index and closes all associated files.

9

void close(boolean waitForMerges)

Closes the index with or without waiting for currently running merges to finish.

10

void commit()

Commits all pending changes (added & deleted documents, segment merges, added indexes, etc.) to the index, and syncs all referenced index files, such that a reader will see the changes and the index updates will survive an OS or machine crash or power loss.

11

void commit(Map<String,String> commitUserData)

Commits all changes to the index, specifying a commitUserData Map (String -> String).

12

void deleteAll()

Deletes all documents in the index.

13

void deleteDocuments(Query... queries)

Deletes the document(s) matching any of the provided queries.

14

void deleteDocuments(Query query)

Deletes the document(s) matching the provided query.

15

void deleteDocuments(Term... terms)

Deletes the document(s) containing any of the terms.

16

void deleteDocuments(Term term)

Deletes the document(s) containing term.

17

void deleteUnusedFiles()

Expert: remove the index files that are no longer used.

18

protected void doAfterFlush()

A hook for extending classes to execute operations after pending added and deleted documents have been flushed to the Directory but before the change is committed (new segments_N file written).

19

protected void doBeforeFlush()

A hook for extending classes to execute operations before pending added and deleted documents are flushed to the Directory.

20

protected void ensureOpen()

21

protected void ensureOpen(boolean includePendingClose)

Used internally to throw an AlreadyClosedException if this IndexWriter has been closed.

22

void expungeDeletes()

Deprecated.

23

void expungeDeletes(boolean doWait)

Deprecated.

24

protected void flush(boolean triggerMerge, boolean applyAllDeletes)

Flushes all in-memory buffered updates (adds and deletes) to the Directory.

25

protected void flush(boolean triggerMerge, boolean flushDocStores, boolean flushDeletes)

NOTE: flushDocStores is ignored now (hardwired to true); this method is only here for backwards compatibility.

26

void forceMerge(int maxNumSegments)

Forces the merge policy to merge segments until there are <= maxNumSegments.

27

void forceMerge(int maxNumSegments, boolean doWait)

Just like forceMerge(int), except you can specify whether the call should block until all merging completes.

28

void forceMergeDeletes()

Forces merging of all segments that have deleted documents.

29

void forceMergeDeletes(boolean doWait)

Just like forceMergeDeletes(), except you can specify whether the call should block until the operation completes.

30

Analyzer getAnalyzer()

Returns the analyzer used by this index.

31

IndexWriterConfig getConfig()

Returns the private IndexWriterConfig, cloned from the IndexWriterConfig passed to IndexWriter(Directory, IndexWriterConfig).

32

static PrintStream getDefaultInfoStream()

Returns the current default infoStream for newly instantiated IndexWriters.

33

static long getDefaultWriteLockTimeout()

Deprecated. Use IndexWriterConfig.getDefaultWriteLockTimeout() instead.

34

Directory getDirectory()

Returns the Directory used by this index.

35

PrintStream getInfoStream()

Returns the current infoStream in use by this writer.

36

int getMaxBufferedDeleteTerms()

Deprecated. Use IndexWriterConfig.getMaxBufferedDeleteTerms() instead.

37

int getMaxBufferedDocs()

Deprecated. Use IndexWriterConfig.getMaxBufferedDocs() instead.

38

int getMaxFieldLength()

Deprecated. Use LimitTokenCountAnalyzer to limit number of tokens.

39

int getMaxMergeDocs()

Deprecated. Use LogMergePolicy.getMaxMergeDocs() directly.

40

IndexWriter.IndexReaderWarmer getMergedSegmentWarmer()

Deprecated. Use IndexWriterConfig.getMergedSegmentWarmer() instead.

41

int getMergeFactor()

Deprecated. Use LogMergePolicy.getMergeFactor() directly.

42

MergePolicy getMergePolicy()

Deprecated. Use IndexWriterConfig.getMergePolicy() instead.

43

MergeScheduler getMergeScheduler()

Deprecated. Use IndexWriterConfig.getMergeScheduler() instead.

44

Collection<SegmentInfo> getMergingSegments()

Expert: to be used by a MergePolicy to avoid selecting merges for segments already being merged.

45

MergePolicy.OneMerge getNextMerge()

Expert: the MergeScheduler calls this method to retrieve the next merge requested by the MergePolicy.

46

PayloadProcessorProvider getPayloadProcessorProvider()

Returns the PayloadProcessorProvider that is used during segment merges to process payloads.

47

double getRAMBufferSizeMB()

Deprecated. Use IndexWriterConfig.getRAMBufferSizeMB() instead.

48

IndexReader getReader()

Deprecated. Use IndexReader.open(IndexWriter,boolean) instead.

49

IndexReader getReader(int termInfosIndexDivisor)

Deprecated. Use IndexReader.open(IndexWriter,boolean) instead. Furthermore, this method cannot guarantee the reader (and its sub-readers) will be opened with the termInfosIndexDivisor setting because some of them may already have been opened according to IndexWriterConfig.setReaderTermsIndexDivisor(int). You should set the requested termInfosIndexDivisor through IndexWriterConfig.setReaderTermsIndexDivisor(int) and use getReader().

50

int getReaderTermsIndexDivisor()

Deprecated. Use IndexWriterConfig.getReaderTermsIndexDivisor() instead.

51

Similarity getSimilarity()

Deprecated. Use IndexWriterConfig.getSimilarity() instead.

52

int getTermIndexInterval()

Deprecated. Use IndexWriterConfig.getTermIndexInterval().

53

boolean getUseCompoundFile()

Deprecated. Use LogMergePolicy.getUseCompoundFile().

54

long getWriteLockTimeout()

Deprecated. Use IndexWriterConfig.getWriteLockTimeout() instead.

55

boolean hasDeletions()

56

static boolean isLocked(Directory directory)

Returns true if the index in the named directory is currently locked.

57

int maxDoc()

Returns total number of docs in this index, including docs not yet flushed (still in the RAM buffer), not counting deletions.

58

void maybeMerge()

Expert: Asks the mergePolicy whether any merges are necessary now and, if so, runs the requested merges and then iterates (tests again whether merges are needed) until no more merges are returned by the mergePolicy.

59

void merge(MergePolicy.OneMerge merge)

Merges the indicated segments, replacing them in the stack with a single segment.

60

void message(String message)

Prints a message to the infoStream (if non-null), prefixed with the identifying information for this writer and the thread that's calling it.

61

int numDeletedDocs(SegmentInfo info)

Obtains the number of deleted docs for a pooled reader.

62

int numDocs()

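Returns total number of docs in this index, including docs not yet flushed (still in the RAM buffer), and taking deletions into account. Buffered deletions are not counted until they are applied, for example by calling commit().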

63

int numRamDocs()

Expert: Returns the number of documents currently buffered in RAM.

64

void optimize()

Deprecated.

65

void optimize(boolean doWait)

Deprecated.

66

void optimize(int maxNumSegments)

Deprecated.

67

void prepareCommit()

Expert: prepare for commit.

68

void prepareCommit(Map<String,String> commitUserData)

Expert: Prepare for commit, specifying commitUserData Map (String -> String).

69

long ramSizeInBytes()

Expert: Returns the total size of all index files currently cached in memory.

70

void rollback()

Closes the IndexWriter without committing any changes that have occurred since the last commit (or since it was opened, if commit hasn't been called).

71

String segString()

72

String segString(Iterable<SegmentInfo> infos)

73

String segString(SegmentInfo info)

74

static void setDefaultInfoStream(PrintStream infoStream)

If non-null, this will be the default infoStream used by a newly instantiated IndexWriter.

75

static void setDefaultWriteLockTimeout(long writeLockTimeout)

Deprecated. Use IndexWriterConfig.setDefaultWriteLockTimeout(long) instead.

76

void setInfoStream(PrintStream infoStream)

If non-null, information about merges, deletes and a message when maxFieldLength is reached will be printed to this.

77

void setMaxBufferedDeleteTerms(int maxBufferedDeleteTerms)

Deprecated. Use IndexWriterConfig.setMaxBufferedDeleteTerms(int) instead.

78

void setMaxBufferedDocs(int maxBufferedDocs)

Deprecated. Use IndexWriterConfig.setMaxBufferedDocs(int) instead.

79

void setMaxFieldLength(int maxFieldLength)

Deprecated. Use LimitTokenCountAnalyzer instead. Note the change in behavior: the analyzer limits the number of tokens per token stream created, while this setting limits the total number of tokens to index. This matters only if you index many multi-valued fields.

80

void setMaxMergeDocs(int maxMergeDocs)

Deprecated. Use LogMergePolicy.setMaxMergeDocs(int) directly.

81

void setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer warmer)

Deprecated. Use IndexWriterConfig.setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer) instead.

82

void setMergeFactor(int mergeFactor)

Deprecated. Use LogMergePolicy.setMergeFactor(int) directly.

83

void setMergePolicy(MergePolicy mp)

Deprecated. Use IndexWriterConfig.setMergePolicy(MergePolicy) instead.

84

void setMergeScheduler(MergeScheduler mergeScheduler)

Deprecated. Use IndexWriterConfig.setMergeScheduler(MergeScheduler) instead.

85

void setPayloadProcessorProvider(PayloadProcessorProvider pcp)

Sets the PayloadProcessorProvider to use when merging payloads.

86

void setRAMBufferSizeMB(double mb)

Deprecated. Use IndexWriterConfig.setRAMBufferSizeMB(double) instead.

87

void setReaderTermsIndexDivisor(int divisor)

Deprecated. Use IndexWriterConfig.setReaderTermsIndexDivisor(int) instead.

88

void setSimilarity(Similarity similarity)

Deprecated. Use IndexWriterConfig.setSimilarity(Similarity) instead.

89

void setTermIndexInterval(int interval)

Deprecated. Use IndexWriterConfig.setTermIndexInterval(int).

90

void setUseCompoundFile(boolean value)

Deprecated. Use LogMergePolicy.setUseCompoundFile(boolean).

91

void setWriteLockTimeout(long writeLockTimeout)

Deprecated. Use IndexWriterConfig.setWriteLockTimeout(long) instead.

92

static void unlock(Directory directory)

Forcibly unlocks the index in the named directory.

93

void updateDocument(Term term, Document doc)

Updates a document by first deleting the document(s) containing term and then adding the new document.

94

void updateDocument(Term term, Document doc, Analyzer analyzer)

Updates a document by first deleting the document(s) containing term and then adding the new document.

95

void updateDocuments(Term delTerm, Collection<Document> docs)

Atomically deletes documents matching the provided delTerm and adds a block of documents with sequentially assigned document IDs, such that an external reader will see all or none of the documents.

96

void updateDocuments(Term delTerm, Collection<Document> docs, Analyzer analyzer)

Atomically deletes documents matching the provided delTerm and adds a block of documents, analyzed using the provided analyzer, with sequentially assigned document IDs, such that an external reader will see all or none of the documents.

97

boolean verbose()

Returns true if verbose output is enabled (i.e., infoStream != null).

98

void waitForMerges()

Waits for any currently outstanding merges to finish.
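
To show how the most commonly used methods from the table fit together, here is a minimal sketch that adds, updates and deletes documents, commits, and then compares numDocs() with maxDoc(). It assumes Lucene 3.6; the class name, the doc helper, and the field names id and body are hypothetical and exist only for this illustration.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class IndexWriterUsageSketch {

   // Hypothetical helper: an untokenized "id" field used as a key,
   // plus an analyzed "body" field for full-text search.
   static Document doc(String id, String body) {
      Document d = new Document();
      d.add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
      d.add(new Field("body", body, Field.Store.NO, Field.Index.ANALYZED));
      return d;
   }

   // Returns { numDocs, maxDoc } as reported by the writer after the commit.
   static int[] run() throws IOException {
      IndexWriter writer = new IndexWriter(new RAMDirectory(),
         new IndexWriterConfig(Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36)));
      try {
         writer.addDocument(doc("1", "first document"));
         writer.addDocument(doc("2", "second document"));

         // Delete-then-add: replaces whatever currently carries id=1.
         writer.updateDocument(new Term("id", "1"), doc("1", "first document, revised"));

         // Removes every document whose id field contains the term "2".
         writer.deleteDocuments(new Term("id", "2"));

         // Flushes buffered adds and deletes and makes them durable.
         writer.commit();

         return new int[] { writer.numDocs(), writer.maxDoc() };
      } finally {
         writer.close();
      }
   }

   public static void main(String[] args) throws IOException {
      int[] counts = run();
      System.out.println("numDocs (live documents): " + counts[0]);
      System.out.println("maxDoc (ignores deletions): " + counts[1]);
   }
}
```

Because updateDocument() is a delete followed by an add, maxDoc() can be larger than numDocs() until merges reclaim the deleted entries.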

Methods Inherited

This class inherits methods from the following classes −

  • java.lang.Object
