Creating index vs adding document to index differences

  • Comp Sci
  • Thread starter shivajikobardan
  • Tags
    Index
In summary, indexing in Lucene involves creating an index and adding documents to it. Conceptually, the words of each document are collected and processed by the analyzer, which yields a list of terms for each document, and the inverted index is built from those lists. Lucene does this analysis behind the scenes when a document is added to the index, so you do not need to implement these steps yourself; you only need to learn how to use the API.
  • #1
shivajikobardan
Homework Statement
Lucene indexing
Relevant Equations
none
These are the steps of indexing in Lucene given in our syllabus:
[Two images of the syllabus steps, not reproduced here]

The first step says that it creates an index, whereas the last step says that it adds a document to the index.
What's the difference between these two? Can I get an example?

Here's what I think should happen:
1) Collect all the words from each document and list them like this:

doc1=>word1,word2,WORD3….wordn
doc2=>word1,WORD2,word3….wordn
And so on.

2) Analyze the words: remove some kinds of words and process the rest, as the analyzer dictates.

Say what remains now is:
doc1=>word1,word3,...word(n-1)
doc2=>word2,...word(n-3)

3) Done. Now you can build the inverted index by inverting this document-to-words mapping.
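
As a rough illustration of the three steps above, here is a toy sketch in plain Python. It is not Lucene's API or its actual implementation: the function names (`tokenize`, `analyze`, `build_inverted_index`), the stop-word list, and the sample documents are all made up for the example, and the "analyzer" here only lowercases words and drops stop words.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; real analyzers are configurable.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def tokenize(text):
    """Step 1: collect the words of one document, in order."""
    return re.findall(r"[A-Za-z0-9]+", text)

def analyze(words):
    """Step 2: toy analyzer that lowercases and drops stop words."""
    lowered = (w.lower() for w in words)
    return [w for w in lowered if w not in STOP_WORDS]

def build_inverted_index(docs):
    """Step 3: map each term to the sorted ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(tokenize(text)):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

# Made-up sample documents.
docs = {
    "doc1": "The quick brown fox",
    "doc2": "A quick introduction to Lucene",
}
index = build_inverted_index(docs)
print(index["quick"])   # ['doc1', 'doc2']
print(index["lucene"])  # ['doc2']
```

The per-document word lists from steps 1 and 2 go from document to words; the inverted index turns that around and goes from word to documents, which is what makes lookup by word fast.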

But it's done a bit differently, which I'm not 100% clear about.
 
  • #2
shivajikobardan said:
whereas the last step says that it's adding document to index.
No, it doesn't. What you are calling the "last" step simply creates a document; adding it to the index is a separate step.

shivajikobardan said:
What's the difference between these two? Can I get an example.
Can you get an example of the difference between creating a thing and adding something to that thing? Are you serious?

shivajikobardan said:
Here's what I think it should happen-:
...
This is all done behind the scenes; you don't have to worry about any of it to use Lucene. You just need to learn how to use the API. A good place to learn that is the API documentation itself: https://lucene.apache.org/core/9_2_0/core/index.html
 

FAQ: Creating index vs adding document to index differences

What is the difference between creating an index and adding a document to an index?

Creating an index refers to setting up a data structure that allows for efficient retrieval of information from a large collection of documents. Adding a document to an index, on the other hand, refers to the process of actually inserting a document into the index so that it can be searched and retrieved.
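
The distinction can be sketched with a small in-memory model in Python. This is only an illustration under simplifying assumptions: `ToyIndex`, `add_document`, and `search` are hypothetical names, not Lucene classes or methods, and the analysis is reduced to lowercasing and splitting on whitespace.

```python
from collections import defaultdict

class ToyIndex:
    """Hypothetical in-memory index; not a Lucene class."""

    def __init__(self):
        # "Creating the index": set up empty structures, no documents yet.
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.documents = {}               # doc id -> original text

    def add_document(self, text):
        # "Adding a document": analyze the text and update the postings.
        doc_id = len(self.documents)
        self.documents[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        return doc_id

    def search(self, term):
        # Look up a single term; unknown terms match nothing.
        return sorted(self.postings.get(term.lower(), set()))

index = ToyIndex()              # create: the index exists but is empty
print(index.search("lucene"))   # []

index.add_document("Lucene in Action")
index.add_document("Learning Lucene indexing")
print(index.search("lucene"))   # [0, 1]
```

Creation happens once and produces an empty structure; adding happens once per document and is what actually fills the structure with searchable terms.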

Why is it important to create an index?

Creating an index is important because it allows for faster and more efficient retrieval of information from a large collection of documents. Without an index, searching through a large amount of data can be time-consuming and resource-intensive.

Can I add a document to an existing index?

Yes, you can add a document to an existing index. This is part of indexing: the document is parsed and analyzed, and its terms are added to the index so that it can be searched and retrieved.

What is the difference between indexing and adding a document to an index?

Indexing is the whole process: parsing and analyzing a document and adding the resulting terms to an index so that the document can be searched and retrieved. Adding a document to an index is the single step within that process where the document is actually inserted into an existing index.

How do I decide whether to create an index or add a document to an existing index?

The two are not alternatives, because an index has to exist before a document can be added to it. Create an index when none exists yet for your collection of documents. Once an index exists, add new documents to it rather than creating a new one.
