- #1
shivajikobardan
- Homework Statement
- Lucene indexing
- Relevant Equations
- none
These are the steps of indexing in Lucene given in our syllabus:
The first step says that it creates an index, whereas the last step says that it adds a document to the index.
What's the difference between these two? Can I get an example?
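To make the question concrete, here is how I picture the two steps, as a toy Python sketch. The class `ToyIndex` and its method `add_document` are made-up names for illustration and are not Lucene's API; the assumption is only that "creating" sets up an empty structure and "adding" fills it one document at a time.

```python
class ToyIndex:
    """Hypothetical stand-in for an index; not Lucene's API."""

    def __init__(self):
        # "Creating the index": set up an empty structure, no documents yet.
        self.postings = {}   # term -> set of document ids
        self.doc_count = 0

    def add_document(self, doc_id, text):
        # "Adding a document": fold one document's terms into the structure.
        for term in text.lower().split():
            self.postings.setdefault(term, set()).add(doc_id)
        self.doc_count += 1


index = ToyIndex()                      # the index exists but is empty
index.add_document("doc1", "Lucene builds an index")
index.add_document("doc2", "An index maps terms to documents")

print(index.doc_count)                  # 2
print(sorted(index.postings["index"]))  # ['doc1', 'doc2']
```

In this picture, creation happens once, while adding is repeated for every document.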
Here's what I think should happen:
1) Collect all the words from each document and list them like this:
doc1=>word1,word2,WORD3….wordn
doc2=>word1,WORD2,word3….wordn
And so on.
2) Run the words through the analyzer, which removes some kinds of words and processes the rest according to its rules.
Say what remains now is:
doc1=>word1,word3,...word(n-1)
doc2=>word2,...word(n-3)
3) Done. Now you can also build the inverted index by inverting this mapping, so that each word points to the documents that contain it.
But Lucene does it a bit differently, which I'm not 100% clear about.
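The three steps above can be sketched in plain Python. This is only my mental model, not how Lucene actually implements it: the stop-word list, the regular-expression tokenizer, and the function names `tokenize`, `analyze`, and `build_inverted_index` are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumed, deliberately tiny stop-word list; a real analyzer has its own rules.
STOP_WORDS = {"a", "an", "the", "is", "of", "to"}


def tokenize(text):
    # Step 1: collect the words of one document.
    return re.findall(r"[A-Za-z0-9]+", text)


def analyze(tokens):
    # Step 2: lowercase each word and drop the stop words.
    return [t.lower() for t in tokens if t.lower() not in STOP_WORDS]


def build_inverted_index(docs):
    # Step 3: invert doc -> words into word -> docs.
    inverted = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(tokenize(text)):
            inverted[term].add(doc_id)
    return {term: sorted(ids) for term, ids in inverted.items()}


docs = {
    "doc1": "The index is a map",
    "doc2": "A map of WORDS to documents",
}
inverted = build_inverted_index(docs)

print(analyze(tokenize(docs["doc1"])))  # ['index', 'map']
print(inverted["map"])                  # ['doc1', 'doc2']
```

Here the uppercase `WORDS` ends up as the term `words`, which mirrors the WORD2/WORD3 normalization in my lists above.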