WangXueqing007/KB8024-Bioinformatics-Project

Dear John,

I uploaded 12 files for this project. The files "0309homo70_with_labels.txt" and "0309homology70.fasta" are the raw datasets I used for my project; they were generated by homology reduction with a cutoff of 0.7.

The files "Accurancy_with_cross_validation_on_raw_database", "Accurancy_with_cross_validation_with_evolutionary_information", "random_forest" and "decision_tree" are Python scripts that run, respectively, an SVM on the raw dataset, an SVM on the PSSM files generated by PSI-BLAST, a random forest on the same PSSM files, and a decision tree on the same PSSM files. Running any of these four scripts reports the accuracy obtained from 5-fold cross-validation.
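For readers who want a feel for what such a script does, here is a minimal sketch of 5-fold cross-validated accuracy for the three classifier types named above. It is not taken from the scripts in this repository: it assumes scikit-learn, uses synthetic data in place of the real PSSM features, assumes three output classes, and uses hypothetical names (`classifiers`, `mean_cv_accuracy`) and hyperparameters chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real feature matrix (assumption: the actual
# scripts build their features from the sequences or the PSSM files).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=3, random_state=0
)

# Hypothetical hyperparameters; the repository's scripts may use others.
classifiers = {
    "svm": SVC(kernel="rbf"),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}


def mean_cv_accuracy(clf, X, y, folds=5):
    """Return the mean accuracy over `folds` cross-validation splits."""
    scores = cross_val_score(clf, X, y, cv=folds, scoring="accuracy")
    return scores.mean()


results = {name: mean_cv_accuracy(clf, X, y) for name, clf in classifiers.items()}
for name, acc in results.items():
    print(f"{name}: mean 5-fold accuracy = {acc:.3f}")
```

With the real data, `X` and `y` would be replaced by the features and labels derived from the dataset files; the numbers printed for the synthetic data say nothing about the accuracy of the actual predictors.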

The compressed archives "Model_1_svm.tar.gz", "Model_2_randomforest.tar.gz" and "Model_3_randomforest-INPUT-flies.tar.gz" contain the model scripts and the resources needed to run them. Instructions are included inside each archive.

The archive "pssm_files.tar.gz" contains all the PSSM files I generated. The files "Model_3state3line_stride_Xueqing_Wang(random_forest).py" and "Model_3state3line_stride_Xueqing_Wang(svm).py" are the Python scripts of my predictor. These three items are also included in the three model archives, so they can be ignored here.

Thank you for your attention!

Xueqing