Semantic search app built with Streamlit and txtai.
To use the app, upload a CSV file that you would like to search. txtai embeddings index the rows of the CSV. Enter a plain-text query to get the best-matching rows!
Pandas reads the CSV file, and each row is converted to a string, producing a one-dimensional array of strings. Each string then receives a high-dimensional embedding (a 1D vector of floats) from txtai embeddings. To save compute, the data is hashed and a pickle file is created, so the same data doesn't need to be reindexed. Lastly, the user enters a query to search the data, which uses approximate nearest neighbor search in the backend (implemented within the txtai search function).
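The hash-and-pickle step can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from app.py: the names `content_hash`, `load_or_build`, and `build_index` are hypothetical, and `build_index` stands in for the expensive embedding step that txtai performs in the app. It also assumes the pickled object is the built index, which the description above does not spell out.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def content_hash(rows):
    """Return a SHA-256 hex digest that changes whenever any row changes."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(row.encode("utf-8"))
        digest.update(b"\n")  # separator so ["ab", "c"] and ["a", "bc"] differ
    return digest.hexdigest()


def load_or_build(rows, build_index, cache_dir):
    """Load a pickled index for these rows, or build and pickle it.

    build_index is a hypothetical callable standing in for the embedding
    step. Returns (index, cache_hit).
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{content_hash(rows)}.pkl"

    if cache_file.exists():
        # Same data seen before: skip the expensive build.
        with cache_file.open("rb") as fh:
            return pickle.load(fh), True

    index = build_index(rows)
    with cache_file.open("wb") as fh:
        pickle.dump(index, fh)
    return index, False


if __name__ == "__main__":
    # Each CSV row flattened to one string, as described above.
    rows = ["id: 1 title: red shoes", "id: 2 title: blue hat"]

    def fake_build(data):
        # Placeholder for the real indexing work.
        return {"size": len(data)}

    with tempfile.TemporaryDirectory() as tmp:
        print(load_or_build(rows, fake_build, tmp))  # builds and caches
        print(load_or_build(rows, fake_build, tmp))  # loads from the pickle
```

Keying the cache file on a hash of the row contents means that uploading the same CSV again reuses the pickle, while any change to the data produces a new key and triggers a rebuild.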
Run it yourself to save me some resources :)
Note: Nvidia drivers must be configured properly for GPU access.
$ git clone https://github.com/Trevato/csv_semantic_search.git
$ pip install -r requirements.txt
$ streamlit run app.py
$ docker-compose up --build