Anuradha Wickramarachchi edited this page Aug 27, 2024 · 11 revisions
$$\   $$\                                   $$$$$$$$\                     $$\           
$$ | $$  |                                  \__$$  __|                    $$ |          
$$ |$$  / $$$$$$\$$$$\   $$$$$$\   $$$$$$\     $$ |    $$$$$$\   $$$$$$\  $$ | $$$$$$$\ 
$$$$$  /  $$  _$$  _$$\ $$  __$$\ $$  __$$\    $$ |   $$  __$$\ $$  __$$\ $$ |$$  _____|
$$  $$<   $$ / $$ / $$ |$$$$$$$$ |$$ |  \__|   $$ |   $$ /  $$ |$$ /  $$ |$$ |\$$$$$$\  
$$ |\$$\  $$ | $$ | $$ |$$   ____|$$ |         $$ |   $$ |  $$ |$$ |  $$ |$$ | \____$$\ 
$$ | \$$\ $$ | $$ | $$ |\$$$$$$$\ $$ |         $$ |   \$$$$$$  |\$$$$$$  |$$ |$$$$$$$  |
\__|  \__|\__| \__| \__| \_______|\__|         \__|    \______/  \______/ \__|\_______/ 

KmerTools provides several commands to help you build your computational pipeline. To get started, run the help command.

kmertools --help

You should see the following output:

kmertools: DNA vectorisation

k-mer based vectorisation for DNA sequences for
metagenomics and AI/ML applications

Usage: kmertools <COMMAND>

Commands:
  comp  Generate sequence composition based features
  cov   Generates coverage histogram based on the reads
  min   Bin reads using minimisers
  ctr   Count k-mers
  help  Print this message or the help of the given subcommand(s)

Options:
  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version
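
The `help` entry in the output above indicates that each subcommand has its own help text. As a minimal sketch, the loop below prints the help for every listed subcommand. It assumes `kmertools` is on your `PATH`; the `subcommands` variable is only an illustrative name, and the fallback message is there so the snippet still runs where the tool is not installed.

```shell
# Subcommands taken from the help output above.
subcommands="comp cov min ctr"

for cmd in $subcommands; do
  if command -v kmertools >/dev/null 2>&1; then
    # 'help <subcommand>' is listed in the top-level help output.
    kmertools help "$cmd"
  else
    # Fallback when kmertools is not installed on this machine.
    echo "kmertools not found on PATH; would run: kmertools help $cmd"
  fi
done
```

The options shown for each subcommand depend on the installed version, so check them against `kmertools --version`.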

Modes

  1. Composition computations (comp)
  2. Coverage computations (cov)
  3. Minimiser computations (min)
  4. K-mer counting (ctr)
  5. Python bindings (pykmertools)

Citations

If you use this tool, please cite it using the CFF citation in the main GitHub repository or the following BibTeX entry.

@software{Wickramarachchi_kmertools_DNA_Vectorisation,
  author = {Wickramarachchi, Anuradha and Mallawaarachchi, Vijini},
  title = {{kmertools: DNA Vectorisation Tool}},
  url = {https://github.com/anuradhawick/kmertools},
  version = {0.1.0}
}

The following is the citation for the coverage histogram algorithm and its original publication.

@article{wickramarachchi2020metabcc,
  title={{MetaBCC-LR}: metagenomics binning by coverage and composition for long reads},
  author={Wickramarachchi, Anuradha and Mallawaarachchi, Vijini and Rajan, Vaibhav and Lin, Yu},
  journal={Bioinformatics},
  volume={36},
  number={Supplement\_1},
  pages={i3--i11},
  year={2020},
  publisher={Oxford University Press}
}

The following is the citation for minimisers.

@article{10.1093/bioinformatics/bth408,
  author = {Roberts, Michael and Hayes, Wayne and Hunt, Brian R. and Mount, Stephen M. and Yorke, James A.},
  title = "{Reducing storage requirements for biological sequence comparison}",
  journal = {Bioinformatics},
  volume = {20},
  number = {18},
  pages = {3363-3369},
  year = {2004},
  month = {07},
  issn = {1367-4803},
  doi = {10.1093/bioinformatics/bth408},
  url = {https://doi.org/10.1093/bioinformatics/bth408},
  eprint = {https://academic.oup.com/bioinformatics/article-pdf/20/18/3363/48906547/bioinformatics\_20\_18\_3363.pdf},
}

The following is the citation for the Chaos Game Representation.

@article{10.1093/nar/18.8.2163,
  author = {Jeffrey, H. Joel},
  title = "{Chaos game representation of gene structure}",
  journal = {Nucleic Acids Research},
  volume = {18},
  number = {8},
  pages = {2163-2170},
  year = {1990},
  month = {04},
  issn = {0305-1048},
  doi = {10.1093/nar/18.8.2163},
  url = {https://doi.org/10.1093/nar/18.8.2163},
  eprint = {https://academic.oup.com/nar/article-pdf/18/8/2163/7059915/18-8-2163.pdf},
}

Authors

Support and contributions

Please get in touch via the authors' websites or GitHub issues. Thanks!