Commit

Move data to tests/test_library
JanCBrammer committed Sep 25, 2024
1 parent 755e330 commit 73b52e6
Showing 17 changed files with 31 additions and 107 deletions.
3 changes: 1 addition & 2 deletions .dockerignore
@@ -1,6 +1,5 @@
INCHI-1-TEST/tests/test_library/config/
-INCHI-1-TEST/data/
-INCHI-1-TEST/docs/
+INCHI-1-TEST/tests/test_library/data/
INCHI-1-TEST/libs/
INCHI-1-TEST/exes/
**/__pycache__/
4 changes: 2 additions & 2 deletions .github/actions/invariance_tests/action.yml
@@ -34,5 +34,5 @@ runs:
with:
name: ${{ inputs.artifact-name }}
path: |
-./INCHI-1-TEST/data/ci/*_invariance_ci.log
-./INCHI-1-TEST/data/ci/*_invariance_ci_*.html
+./INCHI-1-TEST/tests/test_library/data/ci/*_invariance_ci.log
+./INCHI-1-TEST/tests/test_library/data/ci/*_invariance_ci_*.html
4 changes: 2 additions & 2 deletions .github/actions/regression_tests/action.yml
@@ -34,5 +34,5 @@ runs:
with:
name: ${{ inputs.artifact-name }}
path: |
-./INCHI-1-TEST/data/ci/*_regression_ci.log
-./INCHI-1-TEST/data/ci/*_regression_ci_*.html
+./INCHI-1-TEST/tests/test_library/data/ci/*_regression_ci.log
+./INCHI-1-TEST/tests/test_library/data/ci/*_regression_ci_*.html
12 changes: 4 additions & 8 deletions .gitignore
@@ -1,11 +1,8 @@
INCHI-1-TEST/**/*.so*
INCHI-1-TEST/exes
INCHI-1-TEST/data/**/*.html
INCHI-1-TEST/data/**/*.log
INCHI-1-TEST/data/pubchem/compound
INCHI-1-TEST/data/pubchem/compound3d
INCHI-1-TEST/data/pubchem/substance
INCHI-1-TEST/legacy.bak/
INCHI-1-TEST/libs
INCHI-1-TEST/tests/test_library/data/**/*.html
INCHI-1-TEST/tests/test_library/data/**/*.log
INCHI-1-TEST/tests/test_library/data/pubchem/**
.listing
*.o
__pycache__
@@ -14,4 +11,3 @@ __pycache__
*.egg-info/
# Ignore core dump files
core.*[0-9]
gcc_crash_report.txt
16 changes: 0 additions & 16 deletions INCHI-1-TEST/data/pubchem/download.py

This file was deleted.

26 changes: 0 additions & 26 deletions INCHI-1-TEST/data/pubchem/utils.py

This file was deleted.

25 changes: 0 additions & 25 deletions INCHI-1-TEST/data/pubchem/validate.py

This file was deleted.

2 changes: 1 addition & 1 deletion INCHI-1-TEST/docker-compose.yml
@@ -7,7 +7,7 @@ services:
# `source` paths are relative to the `docker-compose.yml` file, not the build context.
# `target` paths are absolute paths in the container. The `/inchi/INCHI-1-TEST` directory already exists in the container.
- type: bind
-source: data
+source: tests/test_library/data
target: /inchi/INCHI-1-TEST/data
- type: bind
source: tests/test_library/config
22 changes: 11 additions & 11 deletions INCHI-1-TEST/tests/test_library/README.md
@@ -32,19 +32,19 @@ The test pipeline expects the library under `INCHI-1-TEST/libs`, see `INCHI-1-TE
In this README, `<dataset>` refers to either `ci`
(i.e, continuous integration, aka the tests running on GitHub), or a `<subset>` of PubChem.
`<subset>` can be either `compound`, `compound3d`, or `substance`.
-The `ci` data already lives in the repository (i.e., `mcule.sdf.gz` and `inchi.sdf.gz` under `INCHI-1-TEST/data/ci`).
+The `ci` data already lives in the repository (i.e., `mcule.sdf.gz` and `inchi.sdf.gz` under `INCHI-1-TEST/tests/test_library/data/ci`).
The PubChem `<subset>` data doesn't live in the repository since it's too large.
You can download the `<subset>` data from <https://ftp.ncbi.nlm.nih.gov/pubchem/> by running

```Shell
-python -m INCHI-1-TEST.data.pubchem.download <subset>
+python -m INCHI-1-TEST.tests.test_library.data.pubchem.download <subset>
```

-On completion of the download you'll find the data in `INCHI-1-TEST/data/pubchem/<subset>`.
+On completion of the download you'll find the data in `INCHI-1-TEST/tests/test_library/data/pubchem/<subset>`.
Validate the integrity of `<subset>` (i.e., make sure the downloads aren't corrupted) by running

```Shell
-python -m INCHI-1-TEST.data.pubchem.validate <subset>
+python -m INCHI-1-TEST.tests.test_library.data.pubchem.validate <subset>
```

Note that validation isn't available for `compound3d` (PubChem doesn't provide file hashes).
@@ -84,7 +84,7 @@ run-tests --test-config=INCHI-1-TEST/tests/test_library/config/config.regression
```

uses `libinchi.so.<version>`, the shared library specified with `--test-config`,
-and generates an `<SDF>.regression_reference.sqlite` file for each SDF under `INCHI-1-TEST/data/<dataset>`.
+and generates an `<SDF>.regression_reference.sqlite` file for each SDF under `INCHI-1-TEST/tests/test_library/data/<dataset>`.
The `sqlite` file contains a table with the results for each molfile.

### Run tests against the references
@@ -94,7 +94,7 @@ run-tests --test-config=INCHI-1-TEST/tests/test_library/config/config.regression
```

uses `libinchi.so.main`, a shared library compiled from the `main` branch,
-to compute the results (e.g., InChI strings and keys) for each molfile in each SDF under `INCHI-1-TEST/data/<dataset>`.
+to compute the results (e.g., InChI strings and keys) for each molfile in each SDF under `INCHI-1-TEST/tests/test_library/data/<dataset>`.
Those results are compared with the corresponding reference.
Failed comparisons are logged to `<datetime>.regression_<dataset>.log` (where `<datetime>` reflects the start of the test run).

@@ -111,7 +111,7 @@ parse-log --test-config=INCHI-1-TEST/tests/test_library/config/config.<test>.py
```

where `<test>` can be `regression` or `invariance`.
-The command generates an HTML report for each SDF under `INCHI-1-TEST/data/<dataset>` that contains structures which failed the test.
+The command generates an HTML report for each SDF under `INCHI-1-TEST/tests/test_library/data/<dataset>` that contains structures which failed the test.
You can view the HTML report in your browser.

## Inspect `.sqlite` files
@@ -130,16 +130,16 @@ via [volumes](https://docs.docker.com/compose/compose-file/05-services/#volumes)
```yml
volumes:
- type: bind
-source: data
+source: tests/test_library/data
target: /inchi/INCHI-1-TEST/data
- type: bind
-source: config
+source: tests/test_library/config
target: /inchi/INCHI-1-TEST/config
```
Note that the `source` paths are relative to the location of the `docker-compose.yml` file.
-We're mapping the `data` directory on the host machine to the `/inchi/INCHI-1-TEST/data` directory inside the container.
-Similarly we're mapping `config`, a directory containing our [configuration files](#configuration-files), into `/inchi/INCHI-1-TEST/config`.
+We're mapping the `tests/test_library/data` directory on the host machine to the `/inchi/INCHI-1-TEST/data` directory inside the container.
+Similarly we're mapping `tests/test_library/config`, a directory containing our [configuration files](#configuration-files), into `/inchi/INCHI-1-TEST/config`.

To customize the tests, start by adding your own `docker-compose.custom.yml` file:

2 changes: 1 addition & 1 deletion INCHI-1-TEST/tests/test_library/config/config.ci.py
@@ -18,7 +18,7 @@ def get_molfile_id_ci(molfile: str) -> str:
return molfile_id


-BASEPATH = "INCHI-1-TEST/data/ci/"
+BASEPATH = "INCHI-1-TEST/tests/test_library/data/ci/"

config = DataConfig(
name="ci",
@@ -2,11 +2,11 @@
from inchi_tests.config_models import DataConfig
from inchi_tests.utils import get_molfile_id_pubchem

-BASEPATH = "INCHI-1-TEST"
+BASEPATH = "INCHI-1-TEST/tests/test_library/data/pubchem/compound"

config = DataConfig(
name="pubchem-compound",
-path=Path(BASEPATH).joinpath("data/pubchem/compound"),
-sdf_paths=sorted(Path(BASEPATH).joinpath("data/pubchem/compound").glob("*.sdf.gz")),
+path=Path(BASEPATH),
+sdf_paths=sorted(Path(BASEPATH).glob("*.sdf.gz")),
molfile_id_getter=get_molfile_id_pubchem,
)
@@ -2,13 +2,11 @@
from inchi_tests.config_models import DataConfig
from inchi_tests.utils import get_molfile_id_pubchem

-BASEPATH = "INCHI-1-TEST"
+BASEPATH = "INCHI-1-TEST/tests/test_library/data/pubchem/compound3d"

config = DataConfig(
name="pubchem-compound3d",
-path=Path(BASEPATH).joinpath("data/pubchem/compound3d"),
-sdf_paths=sorted(
-Path(BASEPATH).joinpath("data/pubchem/compound3d").glob("*.sdf.gz")
-),
+path=Path(BASEPATH),
+sdf_paths=sorted(Path(BASEPATH).glob("*.sdf.gz")),
molfile_id_getter=get_molfile_id_pubchem,
)
@@ -2,13 +2,11 @@
from inchi_tests.config_models import DataConfig
from inchi_tests.utils import get_molfile_id_pubchem

-BASEPATH = "INCHI-1-TEST"
+BASEPATH = "INCHI-1-TEST/tests/test_library/data/pubchem/substance"

config = DataConfig(
name="pubchem-substance",
-path=Path(BASEPATH).joinpath("data/pubchem/substance"),
-sdf_paths=sorted(
-Path(BASEPATH).joinpath("data/pubchem/substance").glob("*.sdf.gz")
-),
+path=Path(BASEPATH),
+sdf_paths=sorted(Path(BASEPATH).glob("*.sdf.gz")),
molfile_id_getter=get_molfile_id_pubchem,
)
File renamed without changes.
File renamed without changes.
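
Review note: the three PubChem configs in this commit now point `BASEPATH` directly at the dataset directory and build `sdf_paths` with `sorted(Path(BASEPATH).glob("*.sdf.gz"))`. The sketch below illustrates how that expression behaves. It is not code from the repository: `collect_sdf_paths` is a hypothetical helper, and a temporary directory stands in for `INCHI-1-TEST/tests/test_library/data/pubchem/<subset>`.

```python
import tempfile
from pathlib import Path


def collect_sdf_paths(basepath: str) -> list[Path]:
    """Hypothetical helper mirroring `sorted(Path(BASEPATH).glob("*.sdf.gz"))`."""
    return sorted(Path(basepath).glob("*.sdf.gz"))


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for INCHI-1-TEST/tests/test_library/data/pubchem/compound.
    data_dir = Path(tmp) / "tests" / "test_library" / "data" / "pubchem" / "compound"
    data_dir.mkdir(parents=True)

    # Two SDF archives created out of order, plus one unrelated file.
    for name in ("b.sdf.gz", "a.sdf.gz", "notes.txt"):
        (data_dir / name).touch()

    # Only the *.sdf.gz files are picked up, in sorted order.
    found = [p.name for p in collect_sdf_paths(str(data_dir))]

    # A directory that does not exist yields an empty list rather than an error.
    missing = collect_sdf_paths(str(Path(tmp) / "does-not-exist"))

print(found)
print(missing)
```

One consequence worth keeping in mind when reviewing path moves like this one: a `BASEPATH` that no longer matches the directory layout produces an empty `sdf_paths` instead of raising, so a stale path could go unnoticed unless the test run checks that at least one SDF was found.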