Welcome to GTE, a graph learning framework for predicting the binding specificity between T-cell receptors (TCRs) and epitopes.
The project's folder structure is as follows:
-
models folder:
The 'models' folder contains the saved models generated by GTE. It covers four datasets, each under the RandomTCR and StrictTCR splits, with one model per fold, for 40 models in total (4 datasets × 2 splits × 5 folds).
The naming convention for model files is as follows:
XXXXX_0123_4
where XXXXX represents the dataset name, 0123 represents the folds used for training, and 4 indicates the fold used for testing.
You can download our 40 models for inference here.
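The naming convention can be checked programmatically. The following is a minimal sketch, not part of GTE: the helper names `parse_model_name` and `expected_names` are hypothetical, and it assumes that the training folds are written as single digits in ascending order and that the name is given without any file extension.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    dataset: str
    train_folds: tuple[int, ...]
    test_fold: int


def parse_model_name(name: str) -> ModelName:
    """Split a name such as 'pMTnet_0123_4' into its three parts."""
    # Split from the right so a dataset name containing "_" still parses.
    dataset, train, test = name.rsplit("_", 2)
    return ModelName(dataset, tuple(int(c) for c in train), int(test))


def expected_names(dataset: str, n_folds: int = 5) -> list[str]:
    """List one model name per held-out fold (assumed digit ordering)."""
    names = []
    for test in range(n_folds):
        train = "".join(str(f) for f in range(n_folds) if f != test)
        names.append(f"{dataset}_{train}_{test}")
    return names


if __name__ == "__main__":
    print(parse_model_name("pMTnet_0123_4"))
    print(expected_names("pMTnet"))
```

Under these assumptions, each dataset yields five names per split, which matches the total of 40 models across four datasets and two splits.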
-
processed_data folder:
This folder contains the raw data for each dataset and the pre-processed 5-fold data. These data are used for training and testing the models.
-
results folder:
In this folder, we store the model's predictions on the datasets. These results can help us analyze model performance and generate further visualizations and reports.
-
Create a Conda Environment:
Start by creating a Conda environment with Python 3.11. If you haven't already installed Conda, you can get it from Anaconda.
conda create -n GTE python=3.11
Activate the environment:
conda activate GTE
-
Install Dependencies:
Use pip to install the required packages listed in the requirements.txt file.
pip install -r requirements.txt
-
How to Run:
To quickly run the program, use the following command:
python inference.py --split RandomTCR --dataset pMTnet
Available options:
-
--split:
- Default: "RandomTCR"
- Choices: ["RandomTCR", "StrictTCR"]
-
--dataset:
- Default: "pMTnet"
- Choices: ["McPAS", "pMTnet", "VDJdb", "TEINet"]
-
--device:
- Default: "cpu"
- Choices: ["cpu", "gpu"]
-
--gpu_id:
- Default: 0
- Description: When using a GPU, this specifies which GPU to use by its ID. The default is the first GPU (ID 0).
- Example:
python inference.py --split RandomTCR --dataset pMTnet --device gpu --gpu_id 0
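For reference, the options listed above could be declared with `argparse` roughly as follows. This is only a sketch of the documented defaults and choices, not the actual contents of inference.py; the helper `device_string` and its mapping of "gpu" to a `cuda:<id>` string are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the four documented options with their defaults and choices."""
    parser = argparse.ArgumentParser(description="Sketch of the documented inference options")
    parser.add_argument("--split", default="RandomTCR", choices=["RandomTCR", "StrictTCR"])
    parser.add_argument("--dataset", default="pMTnet",
                        choices=["McPAS", "pMTnet", "VDJdb", "TEINet"])
    parser.add_argument("--device", default="cpu", choices=["cpu", "gpu"])
    parser.add_argument("--gpu_id", type=int, default=0)
    return parser


def device_string(args: argparse.Namespace) -> str:
    """Assumed mapping: 'gpu' selects the CUDA device with the given ID."""
    return f"cuda:{args.gpu_id}" if args.device == "gpu" else "cpu"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.split, args.dataset, device_string(args))
```

With this declaration, a value outside the listed choices is rejected by `argparse` before any model is loaded, and `--gpu_id` has no effect unless `--device gpu` is given.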
-
Example Output:
You chose the dataset: pMTnet
The split method is: RandomTCR
Fold: 0, AUC: 0.9113, AUPR: 0.6501
Fold: 1, AUC: 0.9098, AUPR: 0.6438
Fold: 2, AUC: 0.9079, AUPR: 0.6438
Fold: 3, AUC: 0.9077, AUPR: 0.6404
Fold: 4, AUC: 0.9111, AUPR: 0.6512
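To summarize the per-fold metrics, the printed output can be parsed and averaged. The sketch below is not part of GTE: the function name `parse_fold_metrics` is hypothetical, and it assumes the output keeps the "Fold: k, AUC: x, AUPR: y" pattern shown above.

```python
import re
from statistics import fmean

# The example output shown above, copied verbatim.
SAMPLE_OUTPUT = """\
You chose the dataset: pMTnet
The split method is: RandomTCR
Fold: 0, AUC: 0.9113, AUPR: 0.6501
Fold: 1, AUC: 0.9098, AUPR: 0.6438
Fold: 2, AUC: 0.9079, AUPR: 0.6438
Fold: 3, AUC: 0.9077, AUPR: 0.6404
Fold: 4, AUC: 0.9111, AUPR: 0.6512
"""

FOLD_PATTERN = re.compile(r"Fold:\s*(\d+),\s*AUC:\s*([\d.]+),\s*AUPR:\s*([\d.]+)")


def parse_fold_metrics(text: str) -> list[tuple[int, float, float]]:
    """Return (fold, AUC, AUPR) triples found in the printed output."""
    return [(int(f), float(auc), float(aupr)) for f, auc, aupr in FOLD_PATTERN.findall(text)]


if __name__ == "__main__":
    metrics = parse_fold_metrics(SAMPLE_OUTPUT)
    mean_auc = fmean(auc for _, auc, _ in metrics)
    mean_aupr = fmean(aupr for _, _, aupr in metrics)
    print(f"Mean AUC: {mean_auc:.4f}, mean AUPR: {mean_aupr:.4f}")
```

Because the pattern does not depend on line breaks, it also works if the output is captured as a single line.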
-
Additional Information:
For more details and customization options, please refer to our paper. Have fun exploring the GTE framework!
The downloaded test model contains embeddings generated by TCRpeg. If you need embeddings from ESM-2, please refer to ESM-2's GitHub.
Next, to train a model, run the following command:
python train.py --gpu 0 --configs_path configs/pMTnet.yml --droup_out 0.1 --split StrictTCR
Please ensure that the paths in configs/XXXXX.yml are correct, including the paths to the training and testing files and the embeddings.
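A quick check of the configured paths before training can save a failed run. The following is a minimal sketch, not part of GTE. It takes an already loaded configuration dictionary and reports string values that look like paths but do not exist on disk. The key names in the example are hypothetical, since the actual keys in configs/XXXXX.yml are not documented here, and the "looks like a path" test (contains "/" or has a file suffix) is a heuristic.

```python
from pathlib import Path
from typing import Any, Iterator


def iter_path_like(node: Any, prefix: str = "") -> Iterator[tuple[str, str]]:
    """Yield (key, value) for every string value that looks like a path."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_path_like(value, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(node, (list, tuple)):
        for i, value in enumerate(node):
            yield from iter_path_like(value, f"{prefix}[{i}]")
    elif isinstance(node, str) and ("/" in node or Path(node).suffix):
        # Heuristic: a separator or a file suffix suggests a path.
        yield prefix, node


def missing_paths(config: dict) -> list[tuple[str, str]]:
    """Return the path-like entries whose target does not exist."""
    return [(key, value) for key, value in iter_path_like(config)
            if not Path(value).exists()]


if __name__ == "__main__":
    # Hypothetical keys and values, for illustration only.
    example = {"train_file": "data/train.csv", "optimizer": "adam", "lr": 0.001}
    for key, value in missing_paths(example):
        print(f"Missing: {key} -> {value}")
```

If PyYAML is installed, the dictionary can be obtained with `yaml.safe_load` on the opened config file and passed to `missing_paths` directly.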