- ChemIcal DatasEt comparatoR (CIDER) is a Python package and ready-to-use Jupyter Notebook workflow that primarily uses RDKit to compare two or more chemical structure datasets (SD files) with respect to aspects such as size, overlap, molecular descriptor distributions, and chemical space clustering. Most of the results can be inspected visually in the notebook.
- To use CIDER, clone the repository to your local disk and install all the necessary requirements.
We recommend using CIDER inside a Conda environment, which simplifies the installation of the dependencies.
- Conda can be downloaded as part of the Anaconda or Miniconda platforms (Python 3.10). We recommend installing Miniconda3. On Linux, you can get it with:
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
$ git clone https://github.com/Steinbeck-Lab/ChemIcal_DatasEt_compaRator.git
$ cd ChemIcal_DatasEt_compaRator
$ conda create --name cider_chem python=3.10
$ conda activate cider_chem
$ conda install pip
$ python -m pip install -U pip #Upgrade pip
$ pip install .
- Note: Verify that everything is installed correctly by running the tests. To do so, run the pytest command in the repository root folder.
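Before running the full test suite, a quick check that the key modules are importable can catch a broken environment early. The following is a minimal sketch that uses only the Python standard library. The helper name `missing_modules` is hypothetical and not part of CIDER; the import names `rdkit` and `CIDER` are taken from RDKit and from the usage example in this README. It complements the pytest run and does not replace it.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment.

    Hypothetical helper for illustration; it is not part of the CIDER package.
    """
    return [name for name in module_names if find_spec(name) is None]


# Import names assumed from this README: RDKit is imported as "rdkit",
# and the usage example imports from "CIDER".
REQUIRED = ["rdkit", "CIDER"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

If a module is reported as missing, make sure the cider_chem environment is active and repeat the installation steps above.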
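- Alternatively, CIDER can be installed with pip without cloning the repository, either directly from GitHub or from PyPI (choose one of the two install commands):
$ python -m pip install -U pip #Upgrade pip
$ pip install git+https://github.com/Steinbeck-Lab/ChemIcal_DatasEt_compaRator.git
$ pip install cider-chem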
from CIDER import ChemicalDatasetComparator
cider = ChemicalDatasetComparator()
data_dir = './data/'  # directory with the SD files containing the molecules
testdict = cider.import_as_data_dict(data_dir)  # import the SD files as a data dictionary
cider.get_number_of_molecules(testdict)  # get the number of molecules in the imported datasets
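To cross-check the molecule counts without RDKit, one can count the records in each SD file directly, since every record in an SD file ends with a line containing only `$$$$`. The sketch below is not CIDER's implementation; the function names `count_sdf_records` and `count_molecules_per_file` and the `*.sdf` file pattern are assumptions made for illustration.

```python
from pathlib import Path


def count_sdf_records(path):
    """Count the records in one SD file by counting '$$$$' delimiter lines."""
    count = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip() == "$$$$":
                count += 1
    return count


def count_molecules_per_file(data_dir, pattern="*.sdf"):
    """Map each SD file name in data_dir to its record count.

    The '*.sdf' pattern is an assumption; adjust it to your file extension.
    """
    return {
        sdf_path.name: count_sdf_records(sdf_path)
        for sdf_path in sorted(Path(data_dir).glob(pattern))
    }


# Example call (assumes ./data/ contains SD files):
# counts = count_molecules_per_file("./data/")
```

These raw record counts may differ from the numbers a cheminformatics toolkit reports, because a toolkit may skip records it cannot parse.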
- The documentation for the CIDER package can be found here.
- Busch, H., Schaub, J., Brinkhaus, H. O., Rajan, K., & Steinbeck, C. (2022). ChemIcal DatasEt comparatoR CIDER (Version 0.0.1-dev) [Computer software]. https://doi.org/10.5281/zenodo.6630494
ChemIcal DatasEt comparatoR is developed and maintained by the Steinbeck group at the Friedrich Schiller University Jena, Germany. The code for this web application is released under the MIT license. Copyright © CC-BY-SA 2024