Yn update (#294)
* add continuing training from a specific dir [load_model_dir]

* Update model loading to handle different output dimensions in retrain (transfer learning)

* update docs

* update unimol format Uni-Mol

* update url dptech-core to deepmodeling

* update version setup

* update unimol v2 docs
emotionor authored Nov 29, 2024
1 parent 3cd979f commit 0fa3d9a
Showing 18 changed files with 182 additions and 40 deletions.
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -6,10 +6,10 @@
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'Uni-Mol tools'
project = 'Uni-Mol'
copyright = '2023, cuiyaning'
author = 'cuiyaning'
release = '0.1.0'
release = '0.1.1'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
10 changes: 5 additions & 5 deletions docs/source/data.rst
@@ -1,36 +1,36 @@
Data
====

`unimol_tools.data <https://github.com/dptech-corp/Uni-Mol/tree/docs/unimol_tools/unimol_tools/data>`_ contains functions and classes for loading, storing, and scaling data and features.
`unimol_tools.data <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/data>`_ contains functions and classes for loading, storing, and scaling data and features.

DataHub
-------

Classes and functions from `unimol_tools.data.datahub.py <https://github.com/dptech-corp/Uni-Mol/tree/docs/unimol_tools/unimol_tools/data/datahub.py>`_.
Classes and functions from `unimol_tools.data.datahub.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/data/datahub.py>`_.

.. automodule:: unimol_tools.data.datahub
:members:

Datareader
----------

Classes and functions from `unimol_tools.data.datareader.py <https://github.com/dptech-corp/Uni-Mol/tree/docs/unimol_tools/unimol_tools/data/datareader.py>`_.
Classes and functions from `unimol_tools.data.datareader.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/data/datareader.py>`_.

.. automodule:: unimol_tools.data.datareader
:members:

Datascaler
-----------

Classes and functions from `unimol_tools.data.datascaler.py <https://github.com/dptech-corp/Uni-Mol/tree/docs/unimol_tools/unimol_tools/data/datascaler.py>`_.
Classes and functions from `unimol_tools.data.datascaler.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/data/datascaler.py>`_.

.. automodule:: unimol_tools.data.datascaler
:members:

Conformer
---------

Classes and functions from `unimol_tools.data.conformer.py <https://github.com/dptech-corp/Uni-Mol/tree/docs/unimol_tools/unimol_tools/data/conformer.py>`_.
Classes and functions from `unimol_tools.data.conformer.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/data/conformer.py>`_.

.. automodule:: unimol_tools.data.conformer
:members:
22 changes: 22 additions & 0 deletions docs/source/examples.md
@@ -0,0 +1,22 @@
# Examples

Welcome to the examples section! On our platform Bohrium, we offer a variety of notebook cases for studying Uni-Mol. These notebooks provide practical examples and applications of Uni-Mol in different scientific fields. You can explore these notebooks to gain hands-on experience and deepen your understanding of Uni-Mol.

## Uni-Mol Notebooks on Bohrium
Explore our collection of Uni-Mol notebooks on Bohrium: [Uni-Mol Notebooks](https://bohrium.dp.tech/search?searchKey=UniMol&activeTab=notebook)

### Uni-Mol for QSAR (Quantitative Structure-Activity Relationship)
Uni-Mol can be used to predict the biological activity of compounds based on their chemical structure. These notebooks demonstrate how to apply Uni-Mol for QSAR tasks:
- [QSAR Example 1](https://bohrium.dp.tech/notebooks/7141701322)
- [QSAR Example 2](https://bohrium.dp.tech/notebooks/9919429887)

### Uni-Mol for OLED Properties Predictions
Organic Light Emitting Diodes (OLEDs) are used in various display technologies. Uni-Mol can predict the properties of OLED molecules, aiding in the design of more efficient materials. Check out these notebooks for detailed examples:
- [OLED Properties Prediction Example 1](https://bohrium.dp.tech/notebooks/2412844127)
- [OLED Properties Prediction Example 2](https://bohrium.dp.tech/notebooks/7637046852)

### Uni-Mol Predicts Liquid Flow Battery Solubility
Liquid flow batteries are a promising technology for energy storage. Uni-Mol can predict the solubility of compounds used in these batteries, helping to optimize their performance. Explore this notebook to see how Uni-Mol is applied in this context:
- [Liquid Flow Battery Solubility Prediction](https://bohrium.dp.tech/notebooks/7941779831)

These examples provide a glimpse into the powerful capabilities of Uni-Mol in various scientific applications. We encourage you to explore these notebooks and experiment with Uni-Mol to discover its full potential.
3 changes: 3 additions & 0 deletions docs/source/features.md
@@ -1,5 +1,8 @@
# New Features

## 2024-11-22
Unimol V2 has been added to Unimol_tools!

## 2024-06-25

Unimol_tools has been published to PyPI! Hugging Face is used to manage the pretrained models.
34 changes: 31 additions & 3 deletions docs/source/index.rst
@@ -3,9 +3,24 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to Uni-Mol tools' documentation!
Welcome to Uni-Mol's documentation!
==========================================

Uni-Mol is the first universal large-scale three-dimensional Molecular Representation Learning (MRL) framework, developed by DP Technology. It expands the application scope and representation capabilities of MRL.

This framework consists of two models, one trained on billions of molecular three-dimensional conformations and the other on millions of protein pocket data.

It has shown excellent performance in various molecular property prediction tasks, especially 3D-related ones. In addition to drug design, Uni-Mol can also predict the properties of materials, such as the gas adsorption performance of MOF materials and the optical properties of OLED molecules.

.. Important::

The project Uni-Mol is licensed under `MIT LICENSE <https://github.com/deepmodeling/Uni-Mol/blob/main/LICENSE>`_.
If you use Uni-Mol in your research, please kindly cite the following works:

- Gengmo Zhou, Zhifeng Gao, Qiankun Ding, Hang Zheng, Hongteng Xu, Zhewei Wei, Linfeng Zhang, Guolin Ke. "Uni-Mol: A Universal 3D Molecular Representation Learning Framework." The Eleventh International Conference on Learning Representations, 2023. `https://openreview.net/forum?id=6K2RM6wVqKu <https://openreview.net/forum?id=6K2RM6wVqKu>`_.
- Shuqi Lu, Zhifeng Gao, Di He, Linfeng Zhang, Guolin Ke. "Data-driven quantum chemical property prediction leveraging 3D conformations with Uni-Mol+." Nature Communications, 2024. `https://www.nature.com/articles/s41467-024-51321-w <https://www.nature.com/articles/s41467-024-51321-w>`_.


Uni-Mol tools is a set of easy-to-use wrappers for property prediction, representation, and downstream tasks with Uni-Mol. It includes the following tools:

* molecular property prediction with Uni-Mol.
@@ -14,11 +29,23 @@ Uni-Mol tools is a easy-use wrappers for property prediction,representation and

.. toctree::
:maxdepth: 2
:caption: Contents:
:caption: Getting Started:

requirements
installation
tutorial

.. toctree::
:maxdepth: 2
:caption: Tutorials:

quickstart
school
examples

.. toctree::
:maxdepth: 2
:caption: Uni-Mol tools:

train
data
models
@@ -27,6 +54,7 @@ Uni-Mol tools is a easy-use wrappers for property prediction,representation and
weight
features


Indices and tables
==================

6 changes: 3 additions & 3 deletions docs/source/installation.md
@@ -16,7 +16,7 @@ We recommend installing ```huggingface_hub``` so that the required unimol models
pip install huggingface_hub
```

`huggingface_hub` allows you to easily download and manage models from the Hugging Face Hub, which is key for using UniMol models.
`huggingface_hub` allows you to easily download and manage models from the Hugging Face Hub, which is key for using Uni-Mol models.

### Option 2: Installing from source

@@ -25,7 +25,7 @@ pip install huggingface_hub
pip install -r requirements.txt

## Clone repository
git clone https://github.com/dptech-corp/Uni-Mol.git
git clone https://github.com/deepmodeling/Uni-Mol.git
cd Uni-Mol/unimol_tools

## Install
@@ -34,7 +34,7 @@ python setup.py install

### Models in Huggingface

The UniMol pretrained models can be found at [dptech/Uni-Mol-Models](https://huggingface.co/dptech/Uni-Mol-Models/tree/main).
The Uni-Mol pretrained models can be found at [dptech/Uni-Mol-Models](https://huggingface.co/dptech/Uni-Mol-Models/tree/main).

If the download is slow, you can use other mirrors, such as:

10 changes: 5 additions & 5 deletions docs/source/models.rst
@@ -3,37 +3,37 @@
Models
======

`unimol_tools.models <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/models>`_ contains the models of Uni-Mol.
`unimol_tools.models <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/models>`_ contains the models of Uni-Mol.


Uni-Mol
-------

`unimol_tools.models.unimol.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/models/unimol.py>`_ contains the :class:`~unimol_tools.models.UniMolModel`, which is the backbone of Uni-Mol model.
`unimol_tools.models.unimol.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/models/unimol.py>`_ contains the :class:`~unimol_tools.models.UniMolModel`, which is the backbone of Uni-Mol model.

.. automodule:: unimol_tools.models.unimol
:members:

Model
-----

`unimol_tools.models.nnmodel.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/models/nnmodel.py>`_ contains the :class:`~unimol_tools.models.NNModel`, which is responsible for initializing the model.
`unimol_tools.models.nnmodel.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/models/nnmodel.py>`_ contains the :class:`~unimol_tools.models.NNModel`, which is responsible for initializing the model.

.. automodule:: unimol_tools.models.nnmodel
:members:

Loss
-----

`unimol_tools.models.loss.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/models/loss.py>`_ contains different loss functions.
`unimol_tools.models.loss.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/models/loss.py>`_ contains different loss functions.

.. automodule:: unimol_tools.models.loss
:members:

Transformers
------------

`unimol_tools.models.transformers.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/models/transformers.py>`_ contains a custom Transformer Encoder module that extends `PyTorch's nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_.
`unimol_tools.models.transformers.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/models/transformers.py>`_ contains a custom Transformer Encoder module that extends `PyTorch's nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_.

.. automodule:: unimol_tools.models.transformers
:members:
45 changes: 40 additions & 5 deletions docs/source/tutorial.md → docs/source/quickstart.md
@@ -1,4 +1,6 @@
# Tutorial
# Quick start

Quick start for Uni-Mol tools.

## Molecule property prediction

@@ -20,15 +22,16 @@ custom dict can also as the input. The dict format should be like
```python
{'atoms':[['C','C'],['C','H','O']], 'coordinates':[coordinates_1,coordinates_2]}
```
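
A minimal sketch of how such a dict could be assembled and sanity-checked before it is passed as `data`. The helper name `build_custom_input` and its validation rules are illustrative assumptions, not part of unimol_tools. Coordinates are plain nested lists here, whereas arrays of shape `(n_atoms, 3)` would be more typical in practice.

```python
from typing import Dict, List, Sequence

Coordinates = Sequence[Sequence[float]]  # one [x, y, z] triple per atom


def build_custom_input(atoms: List[List[str]],
                       coordinates: List[Coordinates]) -> Dict[str, list]:
    """Assemble {'atoms': ..., 'coordinates': ...} after basic consistency checks.

    Hypothetical helper for illustration only; it is not a unimol_tools API.
    """
    if len(atoms) != len(coordinates):
        raise ValueError("need one coordinate set per molecule")
    for i, (symbols, coords) in enumerate(zip(atoms, coordinates)):
        if len(symbols) != len(coords):
            raise ValueError(
                f"molecule {i}: {len(symbols)} atoms but {len(coords)} coordinates")
        if any(len(xyz) != 3 for xyz in coords):
            raise ValueError(f"molecule {i}: each coordinate must be [x, y, z]")
    return {'atoms': atoms, 'coordinates': coordinates}


# Two toy molecules with placeholder geometries (not optimized structures).
coordinates_1 = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
coordinates_2 = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [-0.7, 1.2, 0.0]]
custom_data = build_custom_input([['C', 'C'], ['C', 'H', 'O']],
                                 [coordinates_1, coordinates_2])
```

Checking that each molecule has exactly one coordinate triple per atom catches the most likely mistake, a mismatch between the two lists, before any featurization starts.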
Here is an example to train a model and make a prediction.

Here is an example to train a model and make a prediction. When using Unimol V2, set `model_name='unimolv2'`.
```python
from unimol_tools import MolTrain, MolPredict
clf = MolTrain(task='classification',
data_type='molecule',
epochs=10,
batch_size=16,
metrics='auc',
model_name='unimolv1', # available: unimolv1, unimolv2
model_size='84m', # only takes effect when model_name is unimolv2. available: 84m, 164m, 310m, 570m, 1.1B.
)
pred = clf.fit(data = train_data)
# currently support data with smiles based csv/txt file
@@ -41,13 +44,45 @@ res = clf.predict(data = test_data)
A Uni-Mol representation can be obtained as follows.

```python
import numpy as np
from unimol_tools import UniMolRepr
# single smiles unimol representation
clf = UniMolRepr(data_type='molecule', remove_hs=False)
clf = UniMolRepr(data_type='molecule',
remove_hs=False,
model_name='unimolv1', # available: unimolv1, unimolv2
model_size='84m', # only takes effect when model_name is unimolv2. available: 84m, 164m, 310m, 570m, 1.1B.
)
smiles = 'c1ccc(cc1)C2=NCC(=O)Nc3c2cc(cc3)[N+](=O)[O-]'
smiles_list = [smiles]
unimol_repr = clf.get_repr(smiles_list, return_atomic_reprs=True)
# CLS token repr
print(np.array(unimol_repr['cls_repr']).shape)
# atomic level repr, align with rdkit mol.GetAtoms()
print(np.array(unimol_repr['atomic_reprs']).shape)
print(np.array(unimol_repr['atomic_reprs']).shape)
```
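
The CLS-token representation is one fixed-length vector per molecule, so a common downstream use is comparing molecules by cosine similarity. The sketch below assumes `cls_repr` is a list of equal-length float vectors, as the shape print above suggests. The short vectors are made-up placeholders standing in for real `unimol_repr['cls_repr']` output, and `cosine_similarity` is an illustrative helper, not a unimol_tools function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors; with real output use unimol_repr['cls_repr'] instead.
cls_repr = [
    [0.2, -0.1, 0.7, 0.4],
    [0.1, -0.2, 0.6, 0.5],
    [-0.5, 0.9, -0.3, 0.0],
]
sim_01 = cosine_similarity(cls_repr[0], cls_repr[1])
sim_02 = cosine_similarity(cls_repr[0], cls_repr[2])
print(f"similarity(0, 1) = {sim_01:.3f}, similarity(0, 2) = {sim_02:.3f}")
```
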
## Continue training (Re-train)

```python
from unimol_tools import MolTrain

clf = MolTrain(task='regression',
data_type='molecule',
epochs=10,
batch_size=16,
save_path='./model_dir',
remove_hs=False,
target_cols='TARGET',
)
pred = clf.fit(data = train_data)
# After training a model, set load_model_dir='./model_dir' to continue training

clf2 = MolTrain(task='regression',
data_type='molecule',
epochs=10,
batch_size=16,
save_path='./retrain_model_dir',
remove_hs=False,
target_cols='TARGET',
load_model_dir='./model_dir',
)

pred2 = clf2.fit(data = retrain_data)
```
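
The commit message for this change mentions handling different output dimensions when retraining (transfer learning). One common way to do that is to copy only the parameters whose names and shapes match and leave the rest, typically the output head, freshly initialized. The sketch below is library-agnostic and is not the actual unimol_tools implementation; the parameter names and shapes are invented for illustration.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def select_loadable(checkpoint: Dict[str, Shape],
                    model: Dict[str, Shape]) -> Tuple[List[str], List[str]]:
    """Split the new model's parameter names into (loadable, skipped).

    A parameter is loadable when the checkpoint has the same name with the
    same shape; everything else keeps its fresh initialization.
    """
    loadable, skipped = [], []
    for name, shape in model.items():
        if checkpoint.get(name) == shape:
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped


# Hypothetical shapes: the old model predicted 1 target, the new one 3.
old_shapes = {'encoder.weight': (512, 512), 'head.weight': (1, 512), 'head.bias': (1,)}
new_shapes = {'encoder.weight': (512, 512), 'head.weight': (3, 512), 'head.bias': (3,)}
loadable, skipped = select_loadable(old_shapes, new_shapes)
print('loaded:', loadable)
print('re-initialized:', skipped)
```

With PyTorch, the same filter would be applied to the checkpoint's `state_dict` entries before calling `load_state_dict(..., strict=False)`.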
26 changes: 26 additions & 0 deletions docs/source/school.md
@@ -0,0 +1,26 @@
# Uni-Mol School

Welcome to Uni-Mol School! This course is designed to provide comprehensive training on Uni-Mol, a powerful tool for molecular modeling and simulations.

## Course Introduction
The properties of drugs are determined by their three-dimensional structures, which are crucial for their efficacy and absorption. Drug design requires consideration of molecular diversity. Current Molecular Representation Learning (MRL) models mainly utilize one-dimensional or two-dimensional data, with limited capability to integrate 3D information.

Uni-Mol, developed by the DP Technology team, is the first general large-scale 3D MRL framework in the field of drug design, expanding the application scope and representation capabilities of MRL. This framework consists of two models trained on billions of molecular 3D conformations and millions of protein pocket data, respectively. It has shown excellent performance in various molecular property prediction tasks, especially in 3D-related tasks. Besides drug design, Uni-Mol can also predict the properties of materials, such as gas adsorption performance of MOF materials and optical properties of OLED molecules.

## Course Content
| Topic | Course Content | Instructor |
|-------|----------------|------------|
| Introduction to Uni-Mol | Uni-Mol molecular 3D representation learning framework and pre-trained models | Chen Letian |
| Uni-Mol for Materials Science | Case study of Uni-Mol in predicting the properties of battery materials | Chen Letian |
| | 3D Representation Learning Framework and Pre-trained Models for Nanoporous Materials | Chen Letian |
| | Efficient Screening of Ir(III) Complex Emitters: A Study Combining Machine Learning and Computational Analysis | Chen Letian |
| | Application of 3D Molecular Pre-trained Model Uni-Mol in Flow Batteries | Xie Qiming |
| | Materials Science Uni-Mol Notebook Case Study | |
| Uni-Mol for Biomedical Science | Application of Uni-Mol in Molecular Docking | Zhou Gengmo |
| | Application of Uni-Mol in Molecular Generation | Song Ke |
| | Biomedical Science Uni-Mol Notebook Case Study | |

## How to Enroll
Enroll now and start your journey with Uni-Mol! [Click here to enroll](https://bohrium.dp.tech/courses/6134196349?tab=courses)

Don't miss this opportunity to advance your knowledge and skills in molecular modeling with Uni-Mol!
6 changes: 3 additions & 3 deletions docs/source/task.rst
@@ -3,21 +3,21 @@
Task
======

`unimol_tools.tasks <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/tasks>`_ oversees the tasks related to the model, such as training and prediction.
`unimol_tools.tasks <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/tasks>`_ oversees the tasks related to the model, such as training and prediction.


Trainer
-------

`unimol_tools.tasks.trainer.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/tasks/trainer.py>`_ contains the :class:`~unimol_tools.tasks.Trainer`, which manages the training, validation, and testing phases.
`unimol_tools.tasks.trainer.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/tasks/trainer.py>`_ contains the :class:`~unimol_tools.tasks.Trainer`, which manages the training, validation, and testing phases.

.. automodule:: unimol_tools.tasks.trainer
:members:

Split
-------

`unimol_tools.tasks.split.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/tasks/split.py>`_ manages the split methods in the dataset.
`unimol_tools.tasks.split.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/tasks/split.py>`_ manages the split methods in the dataset.

.. automodule:: unimol_tools.tasks.split
:members:
6 changes: 3 additions & 3 deletions docs/source/train.rst
@@ -7,7 +7,7 @@ Interface
Train
-----

`unimol_tools.train.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/train.py>`_ trains a Uni-Mol model.
`unimol_tools.train.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/train.py>`_ trains a Uni-Mol model.

.. automodule:: unimol_tools.train
:members:
@@ -16,7 +16,7 @@ Train
Predict
------------

`unimol_tools.predictor.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/predictor.py>`_ predicts with a Uni-Mol model.
`unimol_tools.predictor.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/predictor.py>`_ predicts with a Uni-Mol model.

.. automodule:: unimol_tools.predict
:members:
@@ -25,7 +25,7 @@ Predict
Uni-Mol representation
------------------------

`unimol_tools.predictor.py <https://github.com/dptech-corp/Uni-Mol/blob/docs/unimol_tools/unimol_tools/predictor.py>`_ gets the Uni-Mol representation.
`unimol_tools.predictor.py <https://github.com/deepmodeling/Uni-Mol/tree/main/unimol_tools/unimol_tools/predictor.py>`_ gets the Uni-Mol representation.

.. automodule:: unimol_tools.predictor
:members: