BertFold is a 30-layer BERT model that predicts the distance map of a protein's 3D structure. It uses ProtBert as the pre-trained model.
More details are in my article.
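For readers new to the term, a distance map is the matrix of pairwise distances between the residues of a protein. The following is a minimal sketch in plain Python of how such a map follows from 3D coordinates. The function name `distance_map` and the coordinates are made up for illustration, and this is not the code used in this repository.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


# Hypothetical coordinates (one 3D point per residue), for illustration only.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dmap = distance_map(coords)

for row in dmap:
    print(" ".join(f"{d:6.2f}" for d in row))
```

The diagonal is zero and the matrix is symmetric, so a model only needs to predict one triangle of it.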
Download the preprocessed dataset and move the .pqt files into place.
mkdir -p data/ProteinNet/casp12
mv *.pqt data/ProteinNet/casp12/
Install the requirements. You may have to install apex and torch-scatter manually.
Run the training script.
cd src
python run_train.py params/001.yaml
Model | Val | Test |
---|---|---|
ProtBert (seq only) | 4.855 | 7.027 |
ProtBert-BFD (seq and evolutionary) | 4.480 | 6.127 |
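The gap between the two rows can be expressed as a relative difference. The sketch below only does arithmetic on the numbers in the table. It assumes, without confirmation from the source, that the scores are error-like values where lower is better. The variable and function names are illustrative.

```python
# Scores copied from the table above: (validation, test).
scores = {
    "ProtBert": (4.855, 7.027),
    "ProtBert-BFD": (4.480, 6.127),
}


def relative_reduction(baseline, other):
    """Percentage by which `other` is lower than `baseline`."""
    return 100.0 * (baseline - other) / baseline


val_reduction = relative_reduction(scores["ProtBert"][0], scores["ProtBert-BFD"][0])
test_reduction = relative_reduction(scores["ProtBert"][1], scores["ProtBert-BFD"][1])

print(f"Val:  {val_reduction:.1f}% lower")
print(f"Test: {test_reduction:.1f}% lower")
```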
Once training has finished, the predicted distance map can be viewed with the visualization script.