Testing SPAdes on TeamCity

Andrey Prjibelski edited this page Feb 1, 2022 · 3 revisions

The SPAdes project, like many other software tools created at the Center for Bioinformatics and Algorithmic Biotechnology, is tested on a TeamCity continuous integration (CI) server, provided courtesy of JetBrains.

To learn more about TeamCity you may want to take a look at the official documentation.

TeamCity projects structure

The TeamCity server contains over 100 different SPAdes tests. For convenient navigation, they are split into the following projects:

  • SPAdes basic tests: small and quick tests covering essential functionality, typically launched on each commit;
  • SPAdes use-cases: contains a single test for each SPAdes mode, launched several times per week;
  • SPAdes workflow tests: tests on various user interface options;
  • Main SPAdes tests: genome assembly pipeline tests with various types of data;
  • Projects for deeper testing of various SPAdes modes:
    • BiosyntheticSPAdes;
    • cloudSPAdes;
    • coronaSPAdes;
    • metaSPAdes;
    • plasmidSPAdes;
    • rnaSPAdes;
  • Related projects:
    • K-mer tools;
    • SPAligner.

When creating a new test or subproject, place it in the appropriate project.

Types of tests

Most of the tests fall into three categories, depending on what is checked:

  • Simple tests check only the exit code: unit tests, compilation tests, Python version tests, etc.
  • Checkpoint tests compare intermediate checkpoint files with etalon (reference) files.
  • Assembly tests check whether selected QUAST metrics fall within a specified range.
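
As an illustration of the second category, a checkpoint test boils down to a byte-wise comparison of two folders. The following is a minimal sketch, not the code used by the SPAdes test suite; the function name `differing_checkpoints` and the flat folder layout are assumptions made for this example.

```python
import filecmp
from pathlib import Path


def differing_checkpoints(output_dir, etalon_dir):
    """Return names of etalon files that are missing or differ in output_dir.

    Hypothetical helper: assumes checkpoints are plain files stored
    directly inside both folders (no nested subfolders).
    """
    output_dir, etalon_dir = Path(output_dir), Path(etalon_dir)
    # Every regular file in the etalon folder must have an identical twin.
    names = sorted(p.name for p in etalon_dir.iterdir() if p.is_file())
    # shallow=False forces a content comparison instead of a stat() check.
    _match, mismatch, errors = filecmp.cmpfiles(
        etalon_dir, output_dir, names, shallow=False
    )
    # "errors" holds files that could not be compared, e.g. missing ones.
    return sorted(mismatch + errors)
```

An empty result means the test passes; any returned name points at a checkpoint worth inspecting.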

Test configurations

Setting up TeamCity

All tests are configured through the TeamCity web interface and are launched via the "Command Line" runner type. Simple tests are set up by entering the appropriate command in the web interface.

Checkpoints and assembly tests are launched via a special utility: src/test/teamcity/teamcity.py

This utility takes a .info configuration file as input, which contains all settings for a specific test. Note that teamcity.py and the configuration files can also be used directly on the server to reproduce a test in any environment, independently of the TeamCity server.

teamcity.py performs the following actions:

  • Compiles SPAdes;
  • Runs SPAdes using the options set in the config file;
  • Runs QUAST/metaQUAST/rnaQUAST on the resulting contigs and scaffolds;
  • Checks whether QUAST results satisfy thresholds provided in the configuration file;
  • Checks whether checkpoint files are identical to the etalon checkpoint files;
  • Saves SPAdes output to a contig storage: /Nancy/teamcity/contig_storage.

A failure at any step halts execution and returns a non-zero exit code.
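
The "stop at the first failure" behaviour can be sketched as follows. This is only an illustration under stated assumptions: `run_steps` is a hypothetical helper, and returning the index of the failed step as the exit code is an invented convention, not necessarily what teamcity.py does.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, argv) steps in order and stop at the first failure.

    Returns 0 if every step succeeded, otherwise the 1-based index of
    the failed step (a hypothetical convention for this sketch).
    """
    for index, (name, argv) in enumerate(steps, start=1):
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(
                f"step '{name}' failed with code {result.returncode}",
                file=sys.stderr,
            )
            # Later steps are skipped: their input would be unreliable.
            return index
    return 0
```

A caller would pass the value to `sys.exit()`, so that TeamCity marks the build as failed whenever any step returns a non-zero code.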

Configuration files

The configuration file contains all the information needed to run SPAdes and supplementary tools (e.g. QUAST). Each parameter name and its value are given on the same line, separated by a tab or spaces. Blank lines are ignored, and comments start with a semicolon (;). An example .info configuration file is given below.

;output folder, do not change
output_dir /Nancy/teamcity/output/
;dataset name, must be identical to the name of the test in TeamCity 
name TEST_NAME
; location to save resulting contigs
contig_storage /Nancy/teamcity/contig_storage/pipeline_tests/TEST_NAME/
; set false to use pre-compiled SPAdes
spades_compile true
;spades options, excluding the output folder (-o) and the option disabling gzip output,
;all files must be specified either by an absolute path or relative to this config file
spades_params " --sc -m 25 -t 16 -1 1.fastq -2 2.fastq"

;quast options, excluding output (optional)
;will not assess contigs if not given
quast_params " -R ref.fasta "
;QUAST folder, do not change
quast_dir  /home/teamcity/quast_latest/
; set false to ignore contigs.fasta
assess true
; the following options set thresholds
; max_ - upper threshold
; min_ - lower threshold
; same options with sc_ prefix - for scaffolds
; e.g.
min_n50 1000
max_n50 1200
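
To make the format concrete, here is a minimal sketch of how such a file could be parsed and how its min_/max_ thresholds could be checked. It is not the parser used by teamcity.py: the function names are hypothetical, only whole-line comments are handled, scaffold thresholds with the sc_ prefix are ignored, and the metric names in `metrics` (e.g. `n50`) are assumed to match the threshold suffixes.

```python
def parse_info(text):
    """Parse .info-style text into a dict of parameter name -> string value."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and lines that are entirely a comment.
        if not line or line.startswith(";"):
            continue
        # Name and value are separated by the first run of tabs or spaces.
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        # Drop surrounding double quotes, keeping the inner text as-is.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        params[name] = value
    return params


def violated_thresholds(params, metrics):
    """Return a message for every min_/max_ threshold that metrics violate."""
    problems = []
    for name, value in params.items():
        prefix, _, metric = name.partition("_")
        # Only contig thresholds (min_x / max_x) are handled in this sketch.
        if prefix not in ("min", "max") or metric not in metrics:
            continue
        limit, actual = float(value), metrics[metric]
        if prefix == "min" and actual < limit:
            problems.append(f"{metric}={actual} is below {name}={value}")
        if prefix == "max" and actual > limit:
            problems.append(f"{metric}={actual} is above {name}={value}")
    return problems
```

With the example above, `violated_thresholds(parse_info(text), {"n50": 1100})` would return an empty list, since 1100 lies between min_n50 and max_n50.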

Keeping tests up to date

If approved code changes alter the results of a test, that test needs to be updated. All of the following commands must be run only on the CabCity server as the teamcity user.

Updating checkpoint tests

  • Go to the respective folder containing etalon checkpoints (e.g. /Nancy/teamcity/etalon_saves/SAVES_ECOLI_UCSD_L1);
  • Run ./copy_etalon.sh /Nancy/teamcity/output/SAVES_ECOLI_UCSD_L1/ etalon_saves_DD.MM.YY/;
  • Update the symlink: ./make_etalon.sh etalon_saves_DD.MM.YY/.

Updating assembly tests

Thresholds in configuration files can be updated automatically using the last reliable contigs in the contig archive. Reliable contigs/scaffolds have a latest_contigs/latest_scaffolds symlink pointing to them. To update a symlink:

  • Make sure the test is investigated and the last contigs/scaffolds are now considered reliable;
  • Run src/test/teamcity/force_update_symlinks.py <archive dir>, where <archive dir> is a folder containing contigs for this specific test;
  • Note that this script runs recursively on all subfolders, so multiple symlinks can be updated in a single run.

Once the symlinks are updated, you can run src/test/teamcity/update_threshold.py to update all thresholds. This script runs QUAST using the latest_contigs/latest_scaffolds symlinks and sets new threshold values within a 10% range.

You may provide either a single .info configuration file with the -c option, or a folder containing multiple configuration files with the -d option. In the latter case the script runs recursively on all subfolders. It is recommended to set the --overwrite_if_satisfies option to update all thresholds, even the ones that are not violated by the new SPAdes version.
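
The threshold arithmetic can be illustrated with a short sketch. It assumes that "within a 10% range" means 10% below and 10% above the freshly measured value; the exact rule applied by update_threshold.py may differ, and `thresholds_around` is a hypothetical name.

```python
def thresholds_around(metric, value, margin=0.10):
    """Return min_/max_ thresholds placed `margin` below and above `value`."""
    low, high = value * (1 - margin), value * (1 + margin)
    # For negative values the two bounds come out swapped, so reorder them.
    if low > high:
        low, high = high, low
    return {f"min_{metric}": low, f"max_{metric}": high}
```

For a measured N50 of 1000, this yields a lower bound of about 900 and an upper bound of about 1100 (up to floating-point rounding).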

Cleaning up contig archive

As the contig archive grows, it needs to be cleaned from time to time using

src/test/teamcity/clean_old_contigs.py <archive folder> <number of days>

The second argument sets how old a file must be (in days) in order to be removed. This script is also recursive.
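
The age-based selection can be sketched as below. This is an illustrative approximation, not the logic of clean_old_contigs.py: the function names and the dry-run default are assumptions, file age is taken from the modification time, and symlinks themselves are skipped while their targets are treated as ordinary files.

```python
import time
from pathlib import Path


def old_files(archive_dir, max_age_days, now=None):
    """Yield regular files under archive_dir modified over max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    # rglob("*") walks all subfolders, mirroring the recursive behaviour.
    for path in Path(archive_dir).rglob("*"):
        # Skip symlinks themselves; only real files are candidates.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            yield path


def clean(archive_dir, max_age_days, dry_run=True):
    """Delete old files; with dry_run=True only report what would be removed."""
    # Materialise the list first so deletion never races with the traversal.
    candidates = sorted(old_files(archive_dir, max_age_days))
    for path in candidates:
        if not dry_run:
            path.unlink()
    return candidates
```

Running with `dry_run=True` first gives a chance to review the list before anything is deleted.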

Creating new tests

All new tests must be created using the following guidelines:

  • Create build configurations within the appropriate TeamCity project;
  • Use test templates if possible (see section below);
  • Use informative names that include the tested feature/mode and the dataset used (if any);
  • Place the .info configuration file in the appropriate folder in /Nancy/teamcity/build_configurations/;
  • Run the test after creating it and set appropriate thresholds manually or automatically (see previous section).

It is recommended to create new TeamCity build configurations by copying and modifying the existing ones.

TeamCity configuration templates

TeamCity allows you to create new build configurations from templates, which contain the main options, such as version control settings, agent requirements and even command line options. To use a template, set the "Based on template" option when creating a new build configuration. Template settings can still be overridden by the options of an individual test. See the TeamCity documentation on build configuration templates for details.

For example, the main template for creating SPAdes result tests is SPAdes_Launch_From_Info. This template runs a test using the .info file named identically to the build configuration (i.e. /Nancy/teamcity/build_configurations/spades_tests/%system.teamcity.buildConfName%.info).

Investigating failed tests

To investigate a failed test it can be useful to:

  • Look at the Build Log tab and check for error messages;
  • Check Contigs report and Scaffolds report tabs to investigate QUAST reports;
  • Reproduce the test in your repository, either by manually running the same command or by running the teamcity.py script exactly as TeamCity launches it (see below).

Running teamcity.py

To run teamcity.py and reproduce a test, provide the respective .info configuration file. When running this script manually, you must set a custom output folder with the -o option and pass --no_contig_archive. Also make sure that all paths specified in the configuration file are available.

Other useful options:

  • --run_name: custom suffix for the output directory;
  • --spades_path / -p: custom path to spades.py;
  • --no_cfg_and_compilation: do not copy configs or compile SPAdes, even if the configuration file states otherwise.