Releases: compomics/ms2pip

v3.10.0

01 Feb 12:38
d9bc351

Added

  • Added support for mzML spectrum files (both for evaluating models and for extracting feature vectors).
  • New argument spectrum_id_pattern: Regular expression pattern to apply to spectrum titles before matching to peptide file entries.
  • When using MS²PIP as a class instance, the resulting pred_and_emp dataframe can also be returned (instead of being written to a file) by setting return_results to True.
  • If requested, retention time prediction with DeepLC is now also enabled when a spectrum file is given. This feature was previously only enabled if only a peptide file was given.
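
The effect of spectrum_id_pattern can be pictured with a minimal sketch. It uses only Python's re module and is not MS²PIP's own code: it assumes that the first capture group of the pattern becomes the spectrum ID, and the helper name extract_spectrum_id and the example titles are hypothetical.

```python
import re


def extract_spectrum_id(title, pattern):
    """Return the first capture group of `pattern` matched against `title`."""
    match = re.match(pattern, title)
    if match is None:
        raise ValueError(f"Pattern {pattern!r} does not match title {title!r}")
    return match.group(1)


# Hypothetical spectrum titles as they might appear in a spectrum file
titles = [
    'run1.1234.1234.2 File:"run1.raw", NativeID:"scan=1234"',
    'run1.1240.1240.3 File:"run1.raw", NativeID:"scan=1240"',
]

# Keep only the first whitespace-delimited token of each title
spectrum_ids = [extract_spectrum_id(t, r"^(\S+)") for t in titles]
print(spectrum_ids)
```

The shortened IDs can then be compared with the spectrum IDs listed in the peptide file.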

Changed

  • Improved logging: Use Rich library for logging, show time stamps and message log levels.
  • MS²PIP now shows a progress bar instead of a wall of text to display prediction progress.
  • fasta2speclib: Improved algorithm for variable modification assignment. Combinatorial explosion from variable modifications is now reduced by setting a maximum number of modified residues per peptide, instead of arbitrarily selecting a maximum number of potentially modified sites per peptide.
  • Update README.md (Switch from BadGen to Shields.io).
  • Switch to Pyteomics MGF reader.
  • Avoid SciPy dependency.
  • More efficient use of NumPy in calc_correlations.
  • Remove poetry.lock (not used, avoid unneeded Dependabot PRs).
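
The idea behind the fasta2speclib change to variable modification assignment can be sketched as follows. This is an illustration only, not the fasta2speclib implementation: the function name modification_site_combinations and its parameters are hypothetical, and it handles a single modifiable residue type.

```python
from itertools import combinations


def modification_site_combinations(peptide, residue, max_modified):
    """List tuples of 0-based positions of `residue` to modify.

    Every candidate site is considered, but no returned combination
    contains more than `max_modified` positions.
    """
    sites = [i for i, aa in enumerate(peptide) if aa == residue]
    largest = min(max_modified, len(sites))
    return [
        combo
        for k in range(largest + 1)
        for combo in combinations(sites, k)
    ]


# Three candidate methionines, at most two of them modified at once
forms = modification_site_combinations("MAMMK", "M", max_modified=2)
print(forms)
```

With three candidate sites, a cap of two only removes the fully modified form (7 instead of 8 forms). With ten candidate sites, the same cap reduces 1024 forms to 56, while every site can still be modified in at least one form.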

Fixed

  • Vastly improved computational speed and reduced memory usage when using XGBoost model files for prediction in combination with providing a spectrum file (XGB prediction step is now moved out of multiprocessing).
  • For optimal performance, feature vector calculation for predictions from XGBoost model files now also uses the traditional ms2pipC.py multiprocessing system.
  • fasta2speclib: Fixed issue where modified versions of peptide were duplicated.
  • spectrum_output: Various fixes in MSP spectral library file writing for DIA-NN compatibility: Write m/z error of 0.0 for each predicted peak in peak annotation string, ensure modifications in MSP Mods field are sorted by position, use RetentionTime instead of RTINSECONDS in comments field.
  • Fixed double spectrum_utils entry in requirements.
  • Updated python_requires to a minimum of 3.7, following the previously updated test grid.
  • Fix spectrum_utils modification off-by-one bug (had no consequences except for plot annotations).
  • Fix typo in write_amino_acid_masses function name.
  • Fix missing comma in setup.py.

Removed

  • Removed unsupported Tableau output file option

v3.9.0

12 Mar 20:00

New and improved 🚀

  • New prediction model for CID-TMT: TMT-labelled peptide spectra acquired on ion trap (trap-type CID), often used for "MultiNotch MS3" (https://dx.doi.org/10.1021/ac502040v) (PR #157)
  • Support for Python 3.9 and 3.10; dropped support for end-of-life Python 3.6 (PR #156, fixes #126)
  • Support for alternative cleavage rules (digestion enzymes) in fasta2speclib (PR #166, fixes #96)
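
To illustrate what an alternative cleavage rule means in practice, here is a minimal sketch of in-silico digestion with rules written as zero-width regular expressions. It is not how fasta2speclib implements digestion: the CLEAVAGE_RULES dictionary, the digest function and the example sequence are hypothetical, missed cleavages are ignored, and splitting on zero-width matches requires Python 3.7 or later.

```python
import re

# Hypothetical cleavage rules, each matching the position between two residues
CLEAVAGE_RULES = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K or R, unless followed by P
    "lys-c": r"(?<=K)",            # after K
}


def digest(sequence, rule):
    """Split a protein sequence at every position matched by the rule."""
    return [pep for pep in re.split(CLEAVAGE_RULES[rule], sequence) if pep]


tryptic = digest("AKRPGRLKM", "trypsin")
lysc = digest("AKRPGRLKM", "lys-c")
print(tryptic)
print(lysc)
```

The same sequence yields different peptide sets depending on the rule, which is why the choice of enzyme matters for the resulting spectral library.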

Bugfixes 🐛

  • Fixed missing support for XGBoost models in single-prediction mode (PR #157, fixes #155)
  • Use oldest-supported-numpy for build in CI testing (PR #157)

Refactoring and minor changes 🔧

  • Replaced C model files with their XGBoost counterparts (except for HCD2019 and TMT): Faster compilation, smaller Python package (PR #157)
  • Add model_dir option to set custom directory for model downloads (CLI, single-prediction CLI, Python API) (PR #169, fixes #165)
  • Add docstring to MS2PIP class and add example to README.md (PR #167, fixes #131)
  • Relaxed click version requirements (PR #157, fixes #158)
  • Removed XGBoost warnings from the CLI output (PR #157)
  • Various fasta2speclib improvements (PR #166)
    • Add deeplc option to default config
    • Suppress tensorflow warnings
    • Replace deprecated pandas append with concat
  • Add missing sptm and gptm to example config.toml (#167)

New prediction models

| Model | Current version | Train-test dataset (unique peptides) | Evaluation dataset (unique peptides) | Median Pearson correlation on evaluation dataset |
|---|---|---|---|---|
| CID-TMT | v20220104 | [in-house dataset] (72 138) | PXD005890 (69 768) | 0.851085 |

v3.8.0

14 Nov 22:42
a383bc2

New and improved 🚀

  • New models for non-tryptic peptides and immunopeptides! (PR #137)
    Check out our preprint for more info: https://doi.org/10.1101/2021.11.02.466886
  • Support for Windows! Just run pip install ms2pip in your Windows terminal, and start predicting. (PR #151)

Bugfixes 🐛

  • In DLIB output, a value is now written to the isDecoy column. Fixes downstream readout of protein information. (#140, PR #152)

Refactoring and minor changes 🔧

  • .xgboost model files can now be used directly; dumping to C and compiling is no longer required. (PR #137)

New prediction models

| Model | Current version | Train-test dataset (unique peptides) | Evaluation dataset (unique peptides) | Median Pearson correlation on evaluation dataset |
|---|---|---|---|---|
| HCD2021 | v20210416 | [Combined dataset] (520 579) | PXD008034 (35 269) | 0.932361 |
| Immuno-HCD | v20210316 | [Combined dataset] (460 191) | PXD005231 (HLA-I) (46 753); PXD020011 (HLA-II) (23 941) | 0.963736 (HLA-I); 0.942383 (HLA-II) |

v3.7.1

13 Sep 11:36
bdf15d8

Fixed:

  • Pin NumPy version used during build to fix compatibility with older NumPy versions (PR #148)

v3.7.0

09 Sep 09:33
2234d24

New:

  • Command to predict and plot a single spectrum (PR #136)

Improved:

  • fasta2speclib improvements (#135)
    • Pass through options from config file to DeepLC (fixes #138)
    • Pass num_cpu to DeepLC, either from the deeplc section in the configuration, or from the num_cpu option in the fasta2speclib configuration

Fixed:

  • Parse modifications on L (#144, PR #145)

3.6.3

25 Jan 12:51
b7df0fc

New:

  • Python 3.9 support (PR #122)

New (also published for 3.6.2):

Fixed:

  • MS²PIP now exits on incorrectly configured or unknown modifications, instead of only showing a warning. (#100, PR #101)
  • Parsing of C-terminal modifications from a txt config file was broken in v3.6.2. This is now fixed (PR #109)
  • The example fasta2speclib configuration file erroneously contained average mass shifts, which has now been updated to the respective monoisotopic mass shifts. (PR #121)
  • If a critical error occurs, MS²PIP now exits with status code 1. (#102, PR #123)
  • Supported config file extensions are now described in help message and error message (#125, PR #129)
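
The pattern of turning a configuration error into a non-zero exit status can be sketched as follows. This is a generic illustration, not MS²PIP's code: KNOWN_MODIFICATIONS, UnknownModificationError, check_modifications and main are all hypothetical names.

```python
import sys

# Hypothetical set of modification names that the configuration defines
KNOWN_MODIFICATIONS = {"Oxidation", "Carbamidomethyl"}


class UnknownModificationError(Exception):
    """Raised when a peptide uses a modification that is not configured."""


def check_modifications(names):
    """Raise instead of warning when any modification name is unknown."""
    unknown = sorted(set(names) - KNOWN_MODIFICATIONS)
    if unknown:
        raise UnknownModificationError(
            "Unknown modification(s): " + ", ".join(unknown)
        )


def main(argv):
    """Return 0 on success and 1 on a critical error."""
    try:
        check_modifications(argv)
    except UnknownModificationError as err:
        print(f"Error: {err}", file=sys.stderr)
        return 1
    return 0
```

A script entry point would then call sys.exit(main(sys.argv[1:])), so that shell scripts and workflow managers can detect the failure from the exit status.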

v3.6.2

08 May 08:40
9c3eb93

Fixed in this release:

  • Fixes in logging formatting (#64, #65)
  • Use float formatting in CSV output
  • Retention time predictions can also be added without writing output to file
  • When MS²PIP is running in a daemon process, it will not attempt to use multiprocessing
  • Various improvements in match_spectra functionality (e.g. sqldb-backend, output, ...)
  • General cleanup of repository (e.g. unused models)
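
The daemon-process behaviour matters because Python does not allow daemonic processes to start child processes. A minimal sketch of such a fallback, assuming a hypothetical run_tasks helper rather than MS²PIP's actual code, could look like this:

```python
import multiprocessing


def square(x):
    """Stand-in for a per-item task such as predicting one batch."""
    return x * x


def run_tasks(func, items, num_cpu=2):
    """Run `func` over `items`, in parallel only when that is allowed."""
    in_daemon = multiprocessing.current_process().daemon
    if in_daemon or num_cpu <= 1:
        # Daemonic processes cannot have children, so run serially
        return [func(item) for item in items]
    with multiprocessing.Pool(num_cpu) as pool:
        return pool.map(func, items)
```

Called from a regular script, run_tasks(square, [1, 2, 3]) uses a worker pool; called from inside a daemonic worker (for example one started by a task queue), the same call silently falls back to a plain loop.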

v3.6.1

01 Apr 12:38
fa44847

New in this release:

  • Small fix in fasta2speclib parameter handling
  • New option save_peprec in fasta2speclib to save peprec files (including DeepLC predictions, if present)

v3.6.0

30 Mar 16:33
65fc0ba

New since previous release:

  • DeepLC integration! Predict spectral libraries with accurate LC retention time prediction, even for modified peptides. Enable DeepLC with the -r flag in MS²PIP or by adding "add_retention_time":true to the fasta2speclib configuration.
  • Additional support for TOML-based configuration files: see config.toml example
  • New Skyline .blib to PEPREC and MGF converter script in conversion_tools
  • Various under-the-hood improvements

Includes the following models:

| Model | Current version | Train-test dataset (unique peptides) | Evaluation dataset (unique peptides) | Median Pearson correlation on evaluation dataset |
|---|---|---|---|---|
| HCD | v20190107 | MassIVE-KB (1 623 712) | PXD008034 (35 269) | 0.903786 |
| CID | v20190107 | NIST CID Human (340 356) | NIST CID Yeast (92 609) | 0.904947 |
| iTRAQ | v20190107 | NIST iTRAQ (704 041) | PXD001189 (41 502) | 0.905870 |
| iTRAQphospho | v20190107 | NIST iTRAQ phospho (183 383) | PXD001189 (9 088) | 0.843898 |
| TMT | v20190107 | Peng Lab TMT Spectral Library (1 185 547) | PXD009495 (36 137) | 0.950460 |
| TTOF5600 | v20190107 | PXD000954 (215 713) | PXD001587 (15 111) | 0.746823 |
| HCDch2 | v20190107 | MassIVE-KB (1 623 712) | PXD008034 (35 269) | 0.903786 (+) and 0.644162 (++) |
| CIDch2 | v20190107 | NIST CID Human (340 356) | NIST CID Yeast (92 609) | 0.904947 (+) and 0.813342 (++) |
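
For readers unfamiliar with the metric in the last column, the following sketch shows how a median Pearson correlation over a set of spectra can be computed with the Python standard library. It is illustrative only: the function names and the toy intensities are hypothetical, and any intensity transformation applied before correlating is not specified in these notes and is left out.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_pearson(spectra):
    """Median of per-spectrum correlations of (predicted, observed) pairs."""
    return median(pearson(pred, obs) for pred, obs in spectra)


# Toy (predicted, observed) intensity pairs for three spectra
spectra = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),
    ([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]),
]
result = median_pearson(spectra)
print(result)
```

Reporting the median rather than the mean keeps a few poorly predicted spectra from dominating the summary value.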

v3.5.1

04 Mar 16:46
954e5fa

New since previous release:

  • Hotfix: add header files to manifest