con-text-ver-2

This is version 2 of con-text, a book ranking system, that reads book reviews from the internet and extracts a qualitative maesure for the books based on a set of categories.

How to run

Webcrawling

Firstly the database needs to be set up. For this project we used Firebase. The database is built using the webcrawler stored in the subdirectory Honey. It was written using scrapy. --The webcrawler is currently under construction---

Datapipeline

After the database is built up run the main to extract info from the database:

python main.py 0

Then the word biases are calculated:

python main.py 1

Lastly preparations for the GloVe algorithm are made (the c code in Pre_Post_Processing/src must be compiled and the executable files stored in Pre_Post_Processing/build) :

python main.py 2
gcc -o cooccur.c cooccur
gcc -o glove.c glove
gcc -o shuffle.c shuffle
gcc -o vocab_count.c vocab_count

Then the GloVe algorithm is executed (note the comments at the top of demo.sh if you are running on Mac OS):

./demo.sh

The final results are obtained with the following shell script (this script employs the Stanford Parser and it must be downloaded from the website):

./batch_score_calc.sh

Your results can now be viewed in the database. To touch up the results by performing adaptive histogram equalization and to get .csv plots of your results and performance run these commands:

python adaptive_hist_eq.py
python testing.py

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
Pre_Post_Processing/src		Pre_Post_Processing/src
.gitattributes		.gitattributes
README.md		README.md
adaptive_hist_eq.py		adaptive_hist_eq.py
batch_score_calc.sh		batch_score_calc.sh
client.py		client.py
corenlp.py		corenlp.py
demo.sh		demo.sh
jsonrpc.py		jsonrpc.py
main.py		main.py
post_glove.py		post_glove.py
progressbar.py		progressbar.py
score_calculator.py		score_calculator.py
testing.py		testing.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

con-text-ver-2

How to run

Webcrawling

Datapipeline

About

Releases

Packages

Languages

justinmacp/con-text-ver-2

Folders and files

Latest commit

History

Repository files navigation

con-text-ver-2

How to run

Webcrawling

Datapipeline

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages