A collection of atom-atom-mapping utility functions.
The easiest way to use AAMUtils is to install the PyPI package aamutils:
pip install aamutils
The input is a list of partial atom-atom-maps (AAMs). Data is read line-by-line from a text file. Each line should contain one reaction SMILES.
Here is a simple example extending the partial AAM to a complete AAM. First generate the input data:
echo "CCC[Cl:1].[N:2]>>CCC[N:2].[Cl:1]" > testinput.txt
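When the input comes from a script rather than the shell, the same file can be written from Python. The following is a minimal sketch using only the standard library; the helper name `write_reaction_file` is hypothetical and not part of AAMUtils. It only checks that each entry looks like a reaction SMILES (reactants>agents>products, so exactly two `>` characters) and writes one entry per line.

```python
from pathlib import Path


def write_reaction_file(path, reactions):
    """Write one reaction SMILES per line; return the number of lines written.

    Hypothetical helper for illustration, not part of the aamutils package.
    """
    cleaned = []
    for rsmi in reactions:
        rsmi = rsmi.strip()
        if not rsmi:
            continue  # skip empty entries
        # A reaction SMILES has the form reactants>agents>products,
        # so it contains exactly two '>' characters.
        if rsmi.count(">") != 2:
            raise ValueError(f"not a reaction SMILES: {rsmi!r}")
        cleaned.append(rsmi)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


if __name__ == "__main__":
    n = write_reaction_file(
        "testinput.txt", ["CCC[Cl:1].[N:2]>>CCC[N:2].[Cl:1]"]
    )
    print(f"wrote {n} reaction(s)")
```

The resulting file has the same content as the one produced by the echo command.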
Next, run AAMUtils to expand the partial AAM.
python3 -m aamutils expand testinput.txt
The output is written to 'testinput_extended.json'.
cat testinput_extended.json
[
{
"input": "CCC[Cl:1].[N:2]>>CCC[N:2].[Cl:1]",
"expanded_aam": "[Cl:1][CH2:5][CH2:4][CH3:3].[NH3:2]>>[ClH:1].[NH2:2][CH2:3][CH2:4][CH3:5]",
"ilp_status": "Optimal",
"optimization_result": 4.0,
"invalid_reaction_center": false,
"reaction_edges": 4
}
]
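The JSON output can be post-processed with the standard `json` module. The sketch below relies only on the field names visible in the example output above; the helper name `usable_mappings` and the filtering rule (keep entries with `ilp_status` equal to "Optimal" and `invalid_reaction_center` false) are assumptions made for illustration, not behavior defined by AAMUtils.

```python
import json


def usable_mappings(entries):
    """Return (input, expanded_aam) pairs for entries that were solved to
    optimality and whose reaction center was not flagged as invalid.

    Hypothetical helper; the filter criteria are an assumption.
    """
    return [
        (e["input"], e["expanded_aam"])
        for e in entries
        if e.get("ilp_status") == "Optimal"
        and not e.get("invalid_reaction_center", False)
    ]


# Sample copied from the example output above. In practice, load the file:
#   with open("testinput_extended.json") as f:
#       entries = json.load(f)
sample = json.loads("""
[
  {
    "input": "CCC[Cl:1].[N:2]>>CCC[N:2].[Cl:1]",
    "expanded_aam": "[Cl:1][CH2:5][CH2:4][CH3:3].[NH3:2]>>[ClH:1].[NH2:2][CH2:3][CH2:4][CH3:5]",
    "ilp_status": "Optimal",
    "optimization_result": 4.0,
    "invalid_reaction_center": false,
    "reaction_edges": 4
  }
]
""")

for rsmi, expanded in usable_mappings(sample):
    print(rsmi, "->", expanded)
```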
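AAMUtils can also be called directly from a Python script:
from aamutils.aam_expand import extend_aam_from_rsmi

# Partial AAM: only the reaction center atoms carry map numbers
rsmi = "CC[CH2:3][Cl:1].[N:2]>>CC[CH2:3][N:2].[Cl:1]"
result_smiles = extend_aam_from_rsmi(rsmi)
print(result_smiles)
# Output: [Cl:1][CH2:3][CH2:5][CH3:4].[NH3:2]>>[ClH:1].[NH2:2][CH2:3][CH2:5][CH3:4]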
To rerun the benchmarks from the paper, use the benchmark.py script. The reported results can be reproduced by running the following commands:
python3 benchmark.py --remove-mode rc --remove-ratio 0.5 --seed 42
python3 benchmark.py --remove-mode rc --remove-ratio 0.75 --seed 42
python3 benchmark.py --remove-mode rc --remove-ratio 1 --seed 42
python3 benchmark.py --remove-mode keep_rc --remove-ratio 1 --seed 42
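To run all four configurations in one go, the commands can be generated from a small table. This is a sketch under the assumption that benchmark.py is in the current working directory; the helper name `benchmark_commands` is hypothetical, and only the flags shown in the commands above are used. The actual `subprocess.run` call is left commented out so that the snippet only prints the commands by default.

```python
import shlex
import sys

# (remove-mode, remove-ratio) pairs taken from the commands listed above
CONFIGS = [("rc", "0.5"), ("rc", "0.75"), ("rc", "1"), ("keep_rc", "1")]


def benchmark_commands(seed=42, python=sys.executable):
    """Build one argument list per benchmark configuration."""
    return [
        [python, "benchmark.py",
         "--remove-mode", mode,
         "--remove-ratio", ratio,
         "--seed", str(seed)]
        for mode, ratio in CONFIGS
    ]


if __name__ == "__main__":
    for cmd in benchmark_commands():
        print(shlex.join(cmd))
        # To actually run the benchmark, import subprocess and uncomment:
        # subprocess.run(cmd, check=True)
```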
Here is an overview of implemented functionality:
- SMILES to graph and graph to SMILES parsing
- Reaction center validity checks
- ITS graph generation
- Expand partial AAM to complete AAM on balanced reactions
- AAMing based on minimal chemical distance (MCD) for balanced reactions
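To illustrate what an ITS graph and a chemical distance are, here is a small pure-Python sketch. It does not use or reproduce the AAMUtils implementation: the function names are hypothetical, bonds are given by hand as dictionaries keyed by atom-map numbers, and hydrogens are ignored, so the numbers need not match what the tool reports. The ITS graph overlays reactant and product bonds via the atom map and labels each edge with its bond order on both sides; the reaction center consists of the edges whose two orders differ.

```python
def its_graph(reactant_bonds, product_bonds):
    """Overlay two bond dicts keyed by pairs of atom-map numbers.

    Returns {(a, b): (order_in_reactants, order_in_products)} with a < b.
    A missing bond is represented by order 0.
    """
    def normalize(bonds):
        return {tuple(sorted(pair)): order for pair, order in bonds.items()}

    r, p = normalize(reactant_bonds), normalize(product_bonds)
    return {e: (r.get(e, 0), p.get(e, 0)) for e in sorted(set(r) | set(p))}


def reaction_center(its):
    """Edges whose bond order changes between reactants and products."""
    return [e for e, (r_order, p_order) in its.items() if r_order != p_order]


def chemical_distance(its):
    """Sum of absolute bond-order changes over all ITS edges."""
    return sum(abs(r_order - p_order) for r_order, p_order in its.values())


# Heavy-atom bonds of
# [Cl:1][CH2:3][CH2:5][CH3:4].[NH3:2]>>[ClH:1].[NH2:2][CH2:3][CH2:5][CH3:4]
reactant = {(1, 3): 1, (3, 5): 1, (5, 4): 1}
product = {(2, 3): 1, (3, 5): 1, (5, 4): 1}

its = its_graph(reactant, product)
print(reaction_center(its))    # [(1, 3), (2, 3)]
print(chemical_distance(its))  # 2
```

In this toy example the C-Cl bond (1, 3) is broken and the C-N bond (2, 3) is formed, while the carbon chain is unchanged.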
This project is licensed under the MIT License - see the License file for details.
This project has received funding from the European Union's Horizon Europe Doctoral Network programme under the Marie Skłodowska-Curie grant agreement No 101072930 (TACsy -- Training Alliance for Computational).