Information retrieval with PyLucene, with search performance improved by the Rocchio algorithm for relevance feedback and by query expansion based on pseudo-relevance feedback.

adham-elsabbagh/IR_Assignment

Team

-Veronica Orsanigo
-Adham Mohamed

University of Antwerp

Information Retrieval project

-This module cleans the data, indexes the files, and searches the index with a query, using PyLucene.
-It implements the Rocchio algorithm for relevance feedback.
-It performs query expansion based on pseudo-relevance feedback.
-The official PyLucene installation docs are at https://lucene.apache.org/pylucene/install.html .

Usage

1-Install PyLucene for Python 3.

2-python3 main.py.py <path/to/directory>

-This command cleans the files, removing tags, XML entities, and extra white space.
-It then adds all files to the index.
-You are then asked to enter a search query; the result contains the name and the path of the document that answers it.
-Finally, it performs query expansion.
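
The cleaning step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in main.py.py: the function name `clean_text` and both regular expressions are hypothetical, and the real project may clean its files differently.

```python
import html
import re

# Hypothetical patterns: anything between "<" and ">" is treated as a tag.
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Strip tags, decode XML/HTML entities, and collapse white space."""
    # Remove tags first, so that an escaped "&lt;b&gt;" survives as literal text.
    no_tags = TAG_RE.sub(" ", raw)
    # Decode entities such as &amp; and &lt; into the characters they stand for.
    decoded = html.unescape(no_tags)
    # Collapse runs of spaces, tabs, and newlines into a single space.
    return WHITESPACE_RE.sub(" ", decoded).strip()


if __name__ == "__main__":
    print(clean_text("<p>Install  &amp; run\n<b>Java</b></p>"))
```

The cleaned string is what would then be handed to the indexer, so that tags and entities do not end up as index terms.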

3-Put the query you want to run through the Rocchio algorithm in the file query.txt, in this format:
<100 how to install python or java?>
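
A line in this format could be split into a query id and the query text roughly as follows. This is a sketch under assumptions: the README does not say whether the angle brackets are typed literally, so the hypothetical helper `parse_query_line` accepts the line with or without them, and it assumes the leading number is a query id.

```python
import re

# Assumed layout: optional "<", a numeric id, the query text, optional ">".
QUERY_LINE_RE = re.compile(r"^\s*<?\s*(\d+)\s+(.*?)\s*>?\s*$")


def parse_query_line(line: str) -> tuple[int, str]:
    """Return (query_id, query_text) for one line of query.txt."""
    match = QUERY_LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised query line: {line!r}")
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_query_line("<100 how to install python or java?>"))
```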

4-python3 rocchio_algorithm_new.py

-This runs the Rocchio algorithm on the query in query.txt.
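
For readers new to the technique, here is a minimal sketch of the Rocchio update combined with pseudo-relevance feedback. It is not the code in rocchio_algorithm_new.py. The function names, the use of raw term counts as weights (a real system would more likely use TF-IDF), and the alpha, beta, and gamma defaults are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def rocchio(query_vec, relevant_docs, non_relevant_docs=(),
            alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update: alpha*q + beta*mean(relevant) - gamma*mean(non-relevant)."""
    updated = defaultdict(float)
    for term, weight in query_vec.items():
        updated[term] += alpha * weight
    for docs, coeff in ((relevant_docs, beta), (non_relevant_docs, -gamma)):
        if not docs:
            continue  # an empty set contributes nothing
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)
    # Terms whose weight is zero or negative are dropped from the new query.
    return {t: w for t, w in updated.items() if w > 0}


def expand_query(query_terms, ranked_docs, k=2, n_new_terms=2):
    """Pseudo-relevance feedback: assume the top-k ranked documents are relevant."""
    query_vec = dict(Counter(query_terms))
    top_docs = [dict(Counter(doc)) for doc in ranked_docs[:k]]
    weights = rocchio(query_vec, top_docs)
    # Pick the heaviest terms that are not already in the query.
    candidates = sorted((t for t in weights if t not in query_vec),
                        key=lambda t: (-weights[t], t))
    return list(query_terms) + candidates[:n_new_terms]


if __name__ == "__main__":
    ranked = [["install", "pip", "pip"],
              ["python", "pip", "pip", "venv"],
              ["java"]]
    print(expand_query(["install", "python"], ranked, k=2, n_new_terms=1))
```

With pseudo-relevance feedback no user judgments are needed: the top-ranked documents from the first search stand in for the relevant set, and the non-relevant set is usually left empty.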
