Releases: jmschrei/tangermeme

v0.4.0 release

10 Oct 10:48
bfb5212

v0.4.0 greatly expands the functionality for distilling what trained models have learned about cis-regulation. This release includes a new seqlet caller optimized for speed and simplicity, functionality for mapping seqlets (from either the new caller or the tfmodisco caller) to a database of motifs using TOMTOM, functionality for calculating statistics about these annotations, and more. Please see the additional tutorials and vignette for concrete examples.

v0.3.0

06 Sep 15:19

This release primarily adds implementations of the FIMO and TOMTOM algorithms. FIMO scans PWMs across long sequences to identify matches, and TOMTOM compares PWMs against each other to identify similar motifs. Both are invaluable for annotating results, whether you are getting a sense of which motifs are in sequences while dissecting them with machine learning models or identifying which motifs are present at seqlets. Accordingly, significant time has gone into optimizing their speed and memory efficiency.
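To make the idea of scanning a PWM across a sequence concrete, here is a minimal sketch in plain Python. It is a naive forward-strand log-odds scan and is not tangermeme's implementation; the motif, the uniform background of 0.25, and the helper names `log_odds_pwm` and `scan` are all assumptions made up for illustration.

```python
# Illustrative only: a naive log-odds PWM scan, not tangermeme's optimized FIMO.
import math

ALPHABET = "ACGT"


def log_odds_pwm(probs, background=0.25):
    """Convert per-position base probabilities into log-odds scores."""
    return [[math.log2(p / background) for p in column] for column in probs]


def scan(pwm, sequence):
    """Score every window of the sequence against the PWM (forward strand only)."""
    width = len(pwm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][ALPHABET.index(base)] for i, base in enumerate(window))
        scores.append((start, score))
    return scores


# Hypothetical 3-bp motif that prefers "TGA".
# Each inner list is one motif position; values are ordered A, C, G, T.
probs = [
    [0.1, 0.1, 0.1, 0.7],
    [0.1, 0.1, 0.7, 0.1],
    [0.7, 0.1, 0.1, 0.1],
]
pwm = log_odds_pwm(probs)
hits = scan(pwm, "ACTGACG")

# The highest-scoring window is the one that spells out the preferred bases.
best_start, best_score = max(hits, key=lambda hit: hit[1])
```

A real FIMO implementation additionally converts scores into p-values and scans the reverse complement; this sketch only shows the sliding-window scoring step.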

These algorithms can be imported via

from tangermeme.tools.fimo import fimo

and

from tangermeme.tools.tomtom import tomtom

respectively. Each is a function that takes a set of motifs, a second input (the sequences to scan for FIMO, or the motifs to compare against for TOMTOM), and some optional parameters, and returns the results.

Additionally, these functions make up the core of two command-line utilities that implement the algorithms and yield near-identical results to the MEME suite commands in only a fraction of the time. These can be used on the command-line via

tangermeme fimo -m <motif file> -s <sequence file> ...

and

tangermeme tomtom -q <motif file 1> -t <motif file 2> ...

However, sometimes you only have a literal sequence and want to match it to a motif (for instance, when you are manually inspecting attributions and want to get a sense of what a contiguous span of high-attribution characters is). In that case, you can do something like

tangermeme tomtom -q ACGTTG -t <motif file 2>

or even

tangermeme tomtom -q ACGTTG

In the second case, the JASPAR 2024 motif set will be automatically downloaded, used as the target database, and kept for future use.
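Conceptually, comparing a literal sequence such as ACGTTG against a motif database requires treating the sequence as a motif-shaped matrix. The sketch below shows one common way to do that, a one-hot encoding; whether tangermeme does exactly this internally is an assumption, and the `one_hot` helper is hypothetical.

```python
# Illustrative only: one way to turn a literal sequence into a motif-shaped matrix.
ALPHABET = "ACGT"


def one_hot(sequence):
    """Return a 4 x len(sequence) matrix with rows ordered A, C, G, T."""
    return [[1.0 if base == letter else 0.0 for base in sequence] for letter in ALPHABET]


query = one_hot("ACGTTG")

# Print one row per nucleotide so the encoding is easy to read.
for letter, row in zip(ALPHABET, query):
    print(letter, row)
```

Each column contains a single 1.0, so the matrix behaves like a PWM with no uncertainty at any position.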

v0.2.0

08 May 14:55

This release focuses on attribution methods and on utilities that make better use of these attributions.

  • Stand-alone implementation of DeepLIFT/SHAP that fixes several small issues with the Captum implementation, including bugs, and improves batching and throughput for small models, etc.
  • Seqlet calling for identifying spans of high-attribution signal
  • Plotting and annotations of those plots
  • Altered API for saturation_mutagenesis to be more consistent with the results from deep_lift_shap.
  • Altered API for several functions to allow the passing in of a custom function to apply, e.g., for marginalize to allow passing in either predict or deep_lift_shap (or any other function) to apply before and after substituting in a motif of interest
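As a rough illustration of what "identifying spans of high-attribution signal" means, the sketch below calls spans by simple thresholding. It is not the seqlet caller shipped in tangermeme; the `call_spans` helper, the threshold, the minimum length, and the attribution values are all assumptions chosen for the example.

```python
# Illustrative only: naive threshold-based span calling, not tangermeme's seqlet caller.
def call_spans(attributions, threshold, min_length=2):
    """Return (start, end) half-open spans where every value exceeds the threshold."""
    spans = []
    start = None
    for i, value in enumerate(attributions):
        if value > threshold and start is None:
            # A new high-attribution run begins here.
            start = i
        elif value <= threshold and start is not None:
            # The run just ended; keep it only if it is long enough.
            if i - start >= min_length:
                spans.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(attributions) - start >= min_length:
        spans.append((start, len(attributions)))
    return spans


# Hypothetical per-position attribution scores.
attributions = [0.0, 0.1, 0.9, 1.2, 0.8, 0.05, 0.7, 0.0, 0.6, 0.9]
spans = call_spans(attributions, threshold=0.5)
```

Here the single high value at position 6 is discarded because it is shorter than `min_length`, while the runs at positions 2 to 4 and 8 to 9 are kept.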