An Inverted Indexer written in Python
This project builds an inverted index for a given corpus. An inverted index maps content (words, numbers, etc.) to its positions in the documents where it appears. Queries over the whole corpus become faster because a term can be looked up directly instead of scanning every document.
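To make the idea concrete, here is a minimal sketch of a positional inverted index. It is not the code in indexer.py: the function name `build_inverted_index`, the regex tokenizer, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def build_inverted_index(docs):
    """Map each token to {doc_id: [positions]} (hypothetical helper)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        # Simplistic tokenizer (an assumption): lowercase runs of word characters.
        tokens = re.findall(r"\w+", text.lower())
        for position, token in enumerate(tokens):
            index[token][doc_id].append(position)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {token: dict(postings) for token, postings in index.items()}

# Illustrative two-document corpus, keyed by document ID.
docs = {1: "the quick brown fox", 2: "the lazy dog and the fox"}
index = build_inverted_index(docs)
print(index["fox"])  # documents and positions where "fox" occurs
```

Looking up a term is then a single dictionary access, which is where the query speed-up comes from.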
To get a local copy up and running, follow these steps.
- Clone the repo
git clone https://github.com/WasiqMalik/Inverted-Indexing.git
- Install Requirements
pip3 install nltk
- Open a command line or terminal and navigate to the cloned repo's directory
cd "PATH-TO-DIRECTORY"
- Place the blocks of your corpus in numbered sub-directories.
e.g. "PATH-TO-DIRECTORY/1"
- Run the indexer.py file (use python instead of python3 if python is aliased to python3 on your system)
python3 indexer.py
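If you want to check the corpus layout before running the indexer, the following sketch lists the numbered sub-directories in numeric order. It only illustrates the layout described in the steps above and is not necessarily how indexer.py discovers its input; the function name `list_corpus_blocks` is hypothetical.

```python
from pathlib import Path

def list_corpus_blocks(root):
    """Return the numbered sub-directories of root, sorted numerically."""
    blocks = [
        p for p in Path(root).iterdir()
        if p.is_dir() and p.name.isdigit()  # keep only directories named like "1", "2", ...
    ]
    # Sort by integer value so that "10" comes after "2".
    return sorted(blocks, key=lambda p: int(p.name))
```

For example, `list_corpus_blocks(".")` run from the repo directory would return the block directories and skip anything whose name is not a number.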
Distributed under the MIT License.