Text-Search-Algorithm

This code was written as part of the coursework for an Information Retrieval course.

Search Algorithm in easySearch.java

The search algorithm in easySearch.java ranks documents against a query with the following formula:

$$\mathrm{score}(q, doc) = \sum_{t \in q} \frac{c(t, doc)}{\mathrm{length}(doc)} \cdot \log\left(1 + \frac{N}{k(t)}\right)$$

Here q is the user query, doc is the candidate document in the AP89 collection, t is a query term, c(t, doc) is the count of term t in document doc, N is the total number of documents in AP89, and k(t) is the number of documents that contain the term t. These statistics are meant to be read through the Lucene API. From a retrieval viewpoint, c(t, doc) / length(doc) is the normalized TF (term frequency), while log(1 + N / k(t)) is the IDF (inverse document frequency).
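As a rough illustration of the scoring described above, the sketch below sums normalized TF times IDF over the query terms for a single document. It is not the code in easySearch.java and does not use Lucene: the class name `TfIdfSketch`, the method `score`, and the in-memory maps standing in for index statistics are all hypothetical, and the natural logarithm is an assumption, since the text does not state the base (a different base would rescale scores without changing the ranking).

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the ranking function; plain maps replace the Lucene index.
final class TfIdfSketch {

    /**
     * Scores one document against a query.
     *
     * @param queryTerms terms t of the query q
     * @param termCounts c(t, doc) for the document being scored
     * @param docLength  length(doc)
     * @param totalDocs  N, the number of documents in the collection
     * @param docFreq    k(t), the number of documents containing each term
     */
    static double score(List<String> queryTerms,
                        Map<String, Integer> termCounts,
                        long docLength,
                        long totalDocs,
                        Map<String, Integer> docFreq) {
        double total = 0.0;
        for (String t : queryTerms) {
            int count = termCounts.getOrDefault(t, 0);
            int k = docFreq.getOrDefault(t, 0);
            // A term absent from the document or the collection contributes nothing.
            if (count == 0 || k == 0 || docLength == 0) {
                continue;
            }
            double tf = (double) count / docLength;               // normalized TF
            double idf = Math.log(1.0 + (double) totalDocs / k);  // IDF, natural log assumed
            total += tf * idf;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up numbers purely for illustration.
        double s = score(List.of("stock", "market"),
                         Map.of("stock", 2, "market", 1),
                         100, 1000,
                         Map.of("stock", 50, "market", 200));
        System.out.println("score = " + s);
    }
}
```

In a Lucene-backed version, the maps would be replaced by statistics read from the index; how easySearch.java obtains them is not shown here.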
