Updated README and Conda environment configuration file
ribesstefano committed Jun 7, 2024
1 parent 542ca5f commit 2ec8d8c
Showing 3 changed files with 427 additions and 16 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -165,5 +165,6 @@ cython_debug/
data/uniprot2embedding.h5
data/PROTAC-DB.csv
data/PROTAC-Pedia.csv
data/cellosaurus.txt
logs/
notebooks/per-protein*
43 changes: 27 additions & 16 deletions README.md
@@ -1,22 +1,19 @@
<!-- ![Maturity level-0](https://img.shields.io/badge/Maturity%20Level-ML--0-red)
![Maturity level-0](https://img.shields.io/badge/Maturity%20Level-ML--0-red)
<a href="https://colab.research.google.com/github/ribesstefano/PROTAC-Degradation-Predictor/blob/main/notebooks/protac_degradation_predictor_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# PROTAC-Degradation-Predictor -->

<p align="center">
<img src="https://img.shields.io/badge/Maturity%20Level-ML--0-red" alt="Maturity level-0">
</p>
# PROTAC-Degradation-Predictor

<h1 align="center">PROTAC-Degradation-Predictor</h1>

<p align="center">
A machine learning-based tool for predicting PROTAC protein degradation activity.
</p>
A machine learning-based tool for predicting PROTAC protein degradation activity.

## 📚 Table of Contents

- [Data Curation](#-data-curation)
- [Installation](#-installation)
- [Usage](#-usage)
- [Training](#-training)
- [Citation](#-citation)
- [License](#-license)

## 📝 Data Curation

@@ -36,12 +33,14 @@ The package has been developed on a Linux machine with Python 3.10.8. It is reco

## 🎯 Usage

For a thorough explanation of how to use the package, please refer to the tutorial notebook [`protac_degradation_tutorial.ipynb`](notebooks/protac_degradation_tutorial.ipynb).

After installing the package, you can use it as follows:

```python
import protac_degradation_predictor as pdp

protac_smiles = 'CC(C)(C)OC(=O)N1CCN(CC1)C2=CC(=C(C=C2)C(=O)NC3=CC(=C(C=C3)F)Cl)C(=O)NC4=CC=C(C=C4)F'
protac_smiles = 'Cc1ncsc1-c1ccc(CNC(=O)[C@@H]2C[C@@H](O)CN2C(=O)[C@@H](NC(=O)COCCCCCCCCCOCC(=O)Nc2ccc(C(=O)Nc3ccc(F)cc3N)cc2)C(C)(C)C)cc1'
e3_ligase = 'VHL'
target_uniprot = 'P04637'
cell_line = 'HeLa'
@@ -51,8 +50,6 @@ active_protac = pdp.is_protac_active(
e3_ligase,
target_uniprot,
cell_line,
device='cuda', # Default to 'cpu'
proba_threshold=0.5, # Default value
)

print(f'The given PROTAC is: {"active" if active_protac else "inactive"}')
@@ -62,12 +59,26 @@ This example demonstrates how to predict the activity of a PROTAC molecule. The

The function supports batch computation by passing lists of SMILES strings, E3 ligases, UniProt IDs, and cell lines. In this case, it returns a list of booleans indicating the activity of each PROTAC.
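
As a rough illustration of the batch mode described above, the sketch below pairs each input SMILES string with a readable label. The helper `label_batch` is hypothetical and not part of the package. It takes the predictor as an argument and assumes only what the README states: four equally long lists go in, and one list of booleans comes back.

```python
from typing import Callable, Sequence


def label_batch(
    predict_fn: Callable[..., Sequence[bool]],
    smiles: Sequence[str],
    e3_ligases: Sequence[str],
    uniprot_ids: Sequence[str],
    cell_lines: Sequence[str],
) -> dict[str, str]:
    """Map each SMILES string to 'active' or 'inactive' (hypothetical helper)."""
    # The four inputs must line up one-to-one.
    lengths = {len(smiles), len(e3_ligases), len(uniprot_ids), len(cell_lines)}
    if len(lengths) != 1:
        raise ValueError('All input lists must have the same length')
    # Assumption: predict_fn behaves like pdp.is_protac_active in batch mode,
    # i.e. it accepts four lists and returns one boolean per PROTAC.
    flags = predict_fn(
        list(smiles), list(e3_ligases), list(uniprot_ids), list(cell_lines)
    )
    return {s: ('active' if flag else 'inactive') for s, flag in zip(smiles, flags)}
```

With the package installed, passing `pdp.is_protac_active` as `predict_fn` should work under the assumption stated above. Taking the predictor as a parameter keeps the helper usable with a stand-in function when no trained model is available.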

## 📈 Training


Before running the experiments, complete the following steps from the repository root:
1. Download the data from the [Cellosaurus database](https://web.expasy.org/cellosaurus/) and save it in the `data` directory:
```bash
wget -P data/ https://ftp.expasy.org/databases/cellosaurus/cellosaurus.txt
```
2. Make a copy of the UniProt embeddings to be placed in the `data` directory:
```bash
cp protac_degradation_predictor/data/uniprot2embedding.h5 data/
```
3. Create a virtual environment and install the required packages by running the following commands:
```bash
conda env create -f environment.yaml
conda activate protac-degradation-predictor
```
4. The code for training the model can be found in the file [`run_experiments.py`](src/run_experiments.py).

## 📈 Training

The code for training the model can be found in the file [`run_experiments.py`](src/run_experiments.py).
(Don't forget to adjust the `PYTHONPATH` environment variable to include the repository directory: `export PYTHONPATH=$PYTHONPATH:/path/to/PROTAC-Degradation-Predictor`)
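
The data preparation steps listed under Training can be verified with a small pre-flight check before launching `run_experiments.py`. This is a minimal sketch: the function `missing_data_files` is hypothetical, and the file names are taken from the download and copy commands in those steps.

```python
from pathlib import Path

# File names come from the preparation steps; the helper itself is hypothetical.
REQUIRED_FILES = ('cellosaurus.txt', 'uniprot2embedding.h5')


def missing_data_files(data_dir='data', required=REQUIRED_FILES):
    """Return the required file names that are absent from data_dir."""
    root = Path(data_dir)
    # Keep the input order so the report matches the order of the steps.
    return [name for name in required if not (root / name).is_file()]
```

Calling `missing_data_files()` from the repository root returns an empty list once both files are in place. Any names it returns point to the step that still needs to be run.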

## 📄 Citation
