org.getopt.luke
Class DocReconstructor

java.lang.Object
  extended by java.util.Observable
      extended by org.getopt.luke.DocReconstructor

public class DocReconstructor
extends java.util.Observable

This class attempts to reconstruct all fields of a document stored in a Lucene index. The operation may be (and usually is) lossy: unstored fields are rebuilt from the terms present in the index, and those terms may have been changed (e.g. lowercased or stemmed) or dropped altogether by the Analyzer when the fields were originally added to the index.
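
The loss described above can be illustrated without Lucene. The following is a minimal, self-contained sketch, not the code used by this class: the class name LossyReconstructionSketch, the stop-word list, and the "_" gap marker are all assumptions made for illustration. It "indexes" a field by lowercasing tokens and dropping stop words, then rebuilds the text from the recorded term positions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: none of these names belong to Luke or Lucene.
class LossyReconstructionSketch {
    // Hypothetical stop-word list standing in for an Analyzer's stop filter.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "over"));

    // "Index" one field: lowercase each token, drop stop words,
    // and record term -> positions at which it occurred.
    static Map<String, List<Integer>> index(String text) {
        Map<String, List<Integer>> postings = new TreeMap<>();
        String[] tokens = text.trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            String term = tokens[pos].toLowerCase(Locale.ROOT);
            if (STOP_WORDS.contains(term)) {
                continue; // the token is skipped; its position stays empty
            }
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(pos);
        }
        return postings;
    }

    // Rebuild the field text by placing every term back at its positions.
    // Positions with no surviving term are rendered as "_".
    static String reconstruct(Map<String, List<Integer>> postings) {
        TreeMap<Integer, String> byPosition = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
            for (int pos : e.getValue()) {
                byPosition.put(pos, e.getKey());
            }
        }
        if (byPosition.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (int pos = 0; pos <= byPosition.lastKey(); pos++) {
            if (pos > 0) {
                sb.append(' ');
            }
            String term = byPosition.get(pos);
            sb.append(term != null ? term : "_");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "The Quick brown fox jumps over the lazy Dog";
        // prints: _ quick brown fox jumps _ _ lazy dog
        System.out.println(reconstruct(index(original)));
    }
}
```

The original casing and the skipped tokens cannot be recovered from the postings alone, which is the same kind of loss that affects unstored fields here.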

Author:
ab

Nested Class Summary
static class DocReconstructor.Reconstructed
          This class represents a reconstructed document.
 
Constructor Summary
DocReconstructor(org.apache.lucene.index.IndexReader reader)
          Prepare a document reconstructor.
DocReconstructor(org.apache.lucene.index.IndexReader reader, java.lang.String[] fieldNames, int numTerms)
          Prepare a document reconstructor.
 
Method Summary
 DocReconstructor.Reconstructed reconstruct(int docNum)
          Reconstruct document fields.
 
Methods inherited from class java.util.Observable
addObserver, clearChanged, countObservers, deleteObserver, deleteObservers, hasChanged, notifyObservers, notifyObservers, setChanged
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocReconstructor

public DocReconstructor(org.apache.lucene.index.IndexReader reader)
                 throws java.lang.Exception
Prepare a document reconstructor.

Parameters:
reader - IndexReader to read from.
Throws:
java.lang.Exception

DocReconstructor

public DocReconstructor(org.apache.lucene.index.IndexReader reader,
                        java.lang.String[] fieldNames,
                        int numTerms)
                 throws java.lang.Exception
Prepare a document reconstructor.

Parameters:
reader - IndexReader to read from.
fieldNames - if non-null and non-empty, data will be collected only from these fields; otherwise data will be collected from all fields
numTerms - total number of terms in the index, or -1 if unknown (in which case it will be calculated)
Throws:
java.lang.Exception
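
The defaulting rules for fieldNames and numTerms can be sketched as below. This is an assumption-laden illustration of the documented behaviour, not the constructor's actual code: ParameterDefaultsSketch, effectiveFields and effectiveNumTerms are hypothetical names, and whether the real class validates requested names against the index is not stated here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical helper mirroring the documented parameter semantics.
class ParameterDefaultsSketch {
    // A null or empty fieldNames array means "collect from all fields";
    // otherwise only the requested fields are used.
    static List<String> effectiveFields(String[] fieldNames,
                                        Collection<String> allFields) {
        if (fieldNames == null || fieldNames.length == 0) {
            return new ArrayList<>(allFields);
        }
        return new ArrayList<>(Arrays.asList(fieldNames));
    }

    // A numTerms of -1 means "unknown": the count is computed on demand
    // by the supplied counter; any other value is trusted as given.
    static int effectiveNumTerms(int numTerms, IntSupplier termCounter) {
        return numTerms == -1 ? termCounter.getAsInt() : numTerms;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("title", "body", "author");
        System.out.println(effectiveFields(null, all));
        System.out.println(effectiveFields(new String[] {"body"}, all));
        System.out.println(effectiveNumTerms(-1, () -> 1234));
    }
}
```

Passing a known term count presumably avoids a full pass over the index's terms, which is why a caller that already has the number may want to use the three-argument constructor.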

Method Detail

reconstruct

public DocReconstructor.Reconstructed reconstruct(int docNum)
                                           throws java.lang.Exception
Reconstruct document fields.

Parameters:
docNum - document number. If the document has been deleted but the index has not yet been optimized, reconstruction may still return field content for it.
Returns:
reconstructed document
Throws:
java.lang.Exception
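
The note on deleted documents can be pictured with a toy inverted index. This is a simplified model under stated assumptions, not Lucene's implementation: DeletedDocSketch and its methods are hypothetical, and unlike a real optimize this sketch does not renumber the remaining documents. Deleting only marks a document; its postings stay readable until they are purged.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: illustrates why a deleted document can still be rebuilt.
class DeletedDocSketch {
    // term -> document numbers containing that term
    final Map<String, Set<Integer>> postings = new TreeMap<>();
    // documents marked as deleted but not yet purged
    final Set<Integer> deleted = new HashSet<>();

    void add(int docNum, String... terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docNum);
        }
    }

    // Deletion only sets a marker; postings are left untouched.
    void delete(int docNum) {
        deleted.add(docNum);
    }

    // Purge postings of deleted documents (document numbers are kept
    // as-is here for simplicity).
    void optimize() {
        for (Set<Integer> docs : postings.values()) {
            docs.removeAll(deleted);
        }
        postings.values().removeIf(Set::isEmpty);
        deleted.clear();
    }

    // Collect every term whose postings mention docNum, ignoring the
    // deletion marker, in sorted term order.
    List<String> termsOf(int docNum) {
        List<String> terms = new ArrayList<>();
        for (Map.Entry<String, Set<Integer>> e : postings.entrySet()) {
            if (e.getValue().contains(docNum)) {
                terms.add(e.getKey());
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        DeletedDocSketch idx = new DeletedDocSketch();
        idx.add(0, "lucene", "index");
        idx.delete(0);
        System.out.println(idx.termsOf(0)); // terms are still present
        idx.optimize();
        System.out.println(idx.termsOf(0)); // nothing left to rebuild
    }
}
```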