Dear John,

I uploaded 12 files for this project. The files "0309homo70_with_labels.txt" and "0309homology70.fasta" are the raw datasets I used; they were generated by homology reduction with a cutoff of 0.7.

The files "Accurancy_with_cross_validation_on_raw_database", "Accurancy_with_cross_validation_with_evolutionary_information", "random_forest", and "decision_tree" are the Python scripts for, respectively, an SVM on the raw dataset, an SVM on the PSSM files generated by PSI-BLAST, a random forest on the same PSSM files, and a decision tree on the same PSSM files. Running any of these four scripts reports the accuracy obtained from 5-fold cross-validation.
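
For readers unfamiliar with the evaluation, here is a minimal sketch of how a 5-fold cross-validation accuracy can be computed. It is not taken from the scripts above: the function names, the `fit`/`predict` callables, and the majority-class placeholder classifier are all hypothetical, and the actual scripts may rely on a machine-learning library instead.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives fold sizes that differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Return the mean accuracy over k folds.

    `fit(train_X, train_y)` must return a model and `predict(model, X)` must
    return one label per sample; both are hypothetical callables supplied by
    the caller. Assumes there are at least k samples.
    """
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(labels)) if i not in held_out_set]
        # Train on the other k-1 folds, then score on the held-out fold.
        model = fit([features[i] for i in train], [labels[i] for i in train])
        predicted = predict(model, [features[i] for i in held_out])
        correct = sum(p == labels[i] for p, i in zip(predicted, held_out))
        scores.append(correct / len(held_out))
    return sum(scores) / k


# Placeholder classifier (an assumption for illustration only): it always
# predicts the most frequent training label.
def fit_majority(train_X, train_y):
    return max(set(train_y), key=train_y.count)


def predict_majority(model, X):
    return [model for _ in X]
```

Calling `cross_val_accuracy(features, labels, fit_majority, predict_majority)` on any list of features and labels returns a single number between 0 and 1, which corresponds to the kind of accuracy figure the four scripts print.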

The compressed archives "Model_1_svm.tar.gz", "Model_2_randomforest.tar.gz", and "Model_3_randomforest-INPUT-flies.tar.gz" contain the model scripts and the resources needed to run them. The instructions are inside each archive.

The archive "pssm_files.tar.gz" contains all the PSSM files I generated. The files "Model_3state3line_stride_Xueqing_Wang(random_forest).py" and "Model_3state3line_stride_Xueqing_Wang(svm).py" are the Python scripts of my predictor. These three files are already included in the three model archives and can therefore be ignored.

Thank you for your attention!

Xueqing