MiXCR v4.5.0

@github-actions github-actions released this 22 Sep 12:58

🚀 New features

Multi-chain clone assembly for single-cell data

MiXCR now calculates combined multi-chain clones for single-cell data: heavy-light for antibodies, and alpha-beta and gamma-delta for TCRs. Two new commands enable this functionality:

  • groupClones: calculates multi-chain clones from assembled clonotypes and writes the result in a binary format;
  • exportCloneGroups: exports information about combined clonotypes.

All single-cell presets now automatically produce combined multi-chain output in both binary and textual formats; see the files with names matching the *.clone.groups.tsv pattern in the output folder.
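
For downstream inspection of the textual output, the following is a minimal sketch (not part of MiXCR) that collects every file matching the *.clone.groups.tsv pattern from an output folder. The function name load_clone_groups is hypothetical, and no column names are assumed: the columns are whatever each file's header row declares.

```python
import csv
from pathlib import Path


def load_clone_groups(output_dir):
    """Read every *.clone.groups.tsv file in output_dir.

    Returns a dict mapping file name -> list of rows, where each row is a
    dict keyed by the column names found in that file's header line.
    """
    tables = {}
    for path in sorted(Path(output_dir).glob("*.clone.groups.tsv")):
        with path.open(newline="") as handle:
            # Tab-separated; the header row supplies the keys, so this
            # sketch makes no assumption about which columns are exported.
            tables[path.name] = list(csv.DictReader(handle, delimiter="\t"))
    return tables
```

Calling load_clone_groups on the analysis output folder yields one table per matching file, which can then be filtered or handed to a data-frame library.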

New characteristics in clonotype export

  • Export biochemical properties of gene regions with -biochemicalProperty <geneFeature> <property> or -baseBiochemicalProperties <geneFeature> export options. Available in export for alignments, clones and SHM tree nodes. Available properties: Hydrophobicity, Charge, Polarity, Volume, Strength, MjEnergy, Kf1, Kf2, Kf3, Kf4, Kf5, Kf6, Kf7, Kf8, Kf9, Kf10, Rim, Surface, Turn, Alpha, Beta, Core, Disorder, N2Strength, N2Hydrophobicity, N2Volume, N2Surface.
  • Export isotype with -isotype [<(primary|subclass|auto)>]
  • Export -mutationRate [<gene_feature>] in the exportShmTreesWithNodes, exportClones and exportCloneGroups commands: the number of mutations relative to the corresponding germline divided by the target sequence size. For exportClones and exportCloneGroups, CDR3 is not included in the calculation.
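
The mutation rate described above is a simple ratio, which can be illustrated with a short sketch. This is not MiXCR's implementation: the function name, the per-region input layout and the region labels used as keys are assumptions made for illustration, and how individual mutations are counted is left to the caller.

```python
def mutation_rate(regions, exclude=()):
    """Mutations per nucleotide over the regions that are not excluded.

    regions: dict mapping a region label (hypothetical, e.g. "FR3") to a
             (mutation_count, length_in_nucleotides) pair.
    exclude: region labels to leave out, e.g. ("CDR3",) to mirror the rule
             stated for exportClones and exportCloneGroups.
    """
    kept = [(m, n) for name, (m, n) in regions.items() if name not in exclude]
    total_length = sum(n for _, n in kept)
    if total_length == 0:
        raise ValueError("no sequence left to compute a rate over")
    return sum(m for m, _ in kept) / total_length
```

For example, with regions = {"FR3": (3, 100), "CDR3": (5, 50)}, excluding CDR3 divides 3 mutations by 100 nucleotides, while including it divides 8 by 150.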

Support for wider set of input formats

  • Support for CRAM files as input for the analyze and align commands. Optionally, a reference to the genome can be specified with --reference-for-cram.
  • Fixed usage of BAM input for analyze and align when the file contains both paired and single reads.

Algorithm enhancements

  • The global consensus assembly algorithm, applied in assemble to collapse UMI/Cell groups into contigs, now has a much better empirical seed selection step for multi-consensus assembly scenarios. This significantly increases sensitivity when assembling secondary consensuses from the same group of sequences.
  • New constraint in the low-quality reads mapping procedure that prevents cross-cell read mapping.

📚 Preset updates

  • Additional improvement of clone filters in 10x-sc-xcr-vdj preset.
  • Tag pattern upgrade for cellecta-human-rna-xcr-umi-drivermap-air. Now UMI includes a part of the C-gene primer to increase diversity, and R2 is also used for payload.
  • Assembling feature fix for the irepertoire-human-rna-xcr-repseq-plus preset: it is now {CDR2Begin:FR4End}.
  • New preset for BD full-length protocol with enhanced beads V2 featuring B384 whitelists: bd-sc-xcr-rhapsody-full-length-enhanced-bead-v2.
  • New preset for Takara Bio SMART-Seq Mouse TCR (with UMIs): takara-mouse-rna-tcr-umi-smarseq.
  • Presets for new Cellecta kits: cellecta-human-dna-xcr-umi-drivermap-air, cellecta-human-rna-xcr-full-length-umi-drivermap-air, cellecta-mouse-rna-xcr-umi-drivermap-air.
  • Presets for iRepertoire RepSeq+ kits with UMI: irepertoire-mouse-rna-xcr-repseq-plus-umi-pe, irepertoire-human-rna-xcr-repseq-plus-umi-se, irepertoire-human-rna-xcr-repseq-plus-umi-pe.
  • isotype field added to exportClones for presets supporting isotype identification.
  • Split by C-gene enabled in thermofisher-human-rna-igh-oncomine-lr and cellecta-human-rna-xcr-umi-drivermap-air presets to facilitate isotype separation.
  • Default consensus assembly parameters maxNormalizedAlignmentPenalty and altSeedPenaltyTolerance are adjusted to increase sensitivity.
  • The --split-by-sample option is now set to true by default for all align presets, as well as all presets that inherit from them. This new default applies unless it is directly overridden in the preset or with the --dont-split-by-sample mix-in.
  • exportAlignments now reports UMI and/or Cell barcodes by default for presets with barcodes.

🛠️ Minor improvements & fixes

  • Fixed possible crash with --dry-run option in analyze
  • More informative help message when a deprecated preset is used; the previous message incorrectly suggested using --assemble-contigs-by instead of --assemble-clonotypes-by.
  • When split-by-tags is enabled, exportClones and exportShmTreesWithNodes now output the read count as the sum of reads for the given tag selection. Previous versions used a more complicated formula.
  • exportAlignments now includes the topChains column by default. exportClones reports topChains for single-cell presets.
  • Fixed calculation of geneFamilyName for genes like IGHA*00 (without the number before * symbol)
  • Better formatting in listPresets command. Added grouping by vendor, labels and optional filtering
  • Input types in align and analyze are now validated against the given tag pattern.
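
As an illustration of the read-count change for split-by-tags export noted in the list above, the sketch below sums reads over a tag selection. The data layout (a dict from tag value to read count), the tag values and the function name are hypothetical and do not reflect MiXCR's internal structures.

```python
def read_count_for_selection(reads_by_tag, selected_tags):
    """Sum the read counts of the tag values that belong to the selection.

    reads_by_tag:  dict mapping a tag value (e.g. a cell barcode) to the
                   number of reads carrying it; an illustrative layout only.
    selected_tags: collection of tag values in the current selection.
    """
    selected = set(selected_tags)
    return sum(count for tag, count in reads_by_tag.items() if tag in selected)
```

With reads_by_tag = {"cellA": 10, "cellB": 5, "cellC": 1} and a selection of cellA and cellC, the reported read count would be 10 + 1 = 11.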