Document Clustering and Visualization

GitHub repository for the CSE 573 project.

Contributors and Team Members: Kunal Suthar, Jay Shah, Leroy Vargis, Abhay Mathur, Vatsal Sodha.

Selenium setup instructions:

  1. Install the Selenium Python package:
    pip install selenium
  2. Install the Selenium browser driver. This project uses the Firefox driver (geckodriver); follow its installation instructions.
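
To confirm the two setup steps above took effect, a quick check like the following may help. It is a minimal sketch, not part of this repository: the function names `have_cmd` and `check_selenium_setup` are hypothetical, and it assumes the Firefox driver binary is named `geckodriver` and that the interpreter is reachable as `python` (override with the `PYTHON` variable if yours is `python3`).

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report whether geckodriver and the selenium
# package can be found. Returns non-zero if either is missing.
check_selenium_setup() {
    local py="${PYTHON:-python}"   # assumption: interpreter name
    local status=0

    if have_cmd geckodriver; then
        echo "geckodriver: found"
    else
        echo "geckodriver: missing"
        status=1
    fi

    if have_cmd "$py" && "$py" -c "import selenium" >/dev/null 2>&1; then
        echo "selenium: found"
    else
        echo "selenium: missing"
        status=1
    fi

    return "$status"
}
```

After sourcing the snippet, run `check_selenium_setup`; it prints one line per component and returns a non-zero status if anything is missing.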

Dependencies setup instructions:

  1. Install all dependencies:
    bash demo.sh install-dep

Project run instructions:

To run the project, use the demo.sh script with one of the following arguments:

  1. Scrape data and save it to a pickle file:
    bash demo.sh scrape
  2. Create the LDA model:
    bash demo.sh create-lda
  3. Apply t-SNE and generate the dependencies:
    bash demo.sh apply-tsne
  4. Apply PCA and generate the dependencies:
    bash demo.sh apply-pca
  5. Generate the 3D visualization:
    bash demo.sh visualize-3d
  6. Run the above steps with t-SNE sequentially from scratch:
    bash demo.sh run-project-tsne
  7. Run the above steps with PCA sequentially from scratch:
    bash demo.sh run-project-pca
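
The two run-project commands chain the individual steps. The sketch below shows one way such a chain could look when driven by hand. It is an assumption about the step order (scrape, create-lda, apply-tsne or apply-pca, visualize-3d), not a copy of what demo.sh does internally, and the function names `pipeline_steps` and `run_pipeline` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the demo.sh steps for a reduction method
# ("tsne" or "pca"), one per line. The order is assumed from the list above.
pipeline_steps() {
    local method="$1"
    case "$method" in
        tsne|pca) ;;
        *) echo "unknown method: $method" >&2; return 1 ;;
    esac
    printf '%s\n' scrape create-lda "apply-$method" visualize-3d
}

# Hypothetical helper: run each step through demo.sh in order and
# stop at the first step that fails.
run_pipeline() {
    local steps step
    steps="$(pipeline_steps "$1")" || return 1
    for step in $steps; do
        echo "==> bash demo.sh $step"
        bash demo.sh "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}
```

Run from the repository root, `run_pipeline tsne` would invoke the four steps in sequence. Stopping at the first failure avoids building a visualization from a stale or missing pickle file.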