Update README.md
cjain7 authored Jul 22, 2020
1 parent b8ac2ec commit 137e485
Showing 1 changed file with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions README.md
@@ -13,28 +13,28 @@ Winnowmap requires C++ compiler with c++11 and openmp, which are available by de
cd Winnowmap
make -j8
```
Expect `winnowmap` and `meryl` executables in `bin` folder. The `recursive` option used above is necessary to download all submodules .
Expect `winnowmap` and `meryl` executables in `bin` folder.

## Usage

For either mapping long reads or computing whole-genome alignments, Winnowmap requires pre-computing high frequency k-mers (e.g., top 0.02% most frequent) in a reference. Winnowmap uses [meryl](https://github.com/marbl/meryl) k-mer counting tool for this purpose.
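The "top 0.02% most frequent" figure and the `distinct=0.9998` value used in the commands in this section appear to be two views of the same cutoff: keeping the top P percent of distinct k-mers corresponds to a distinct fraction of 1 - P/100. The sketch below shows only that arithmetic. The helper name `top_percent_to_distinct` is hypothetical and is not part of meryl or Winnowmap.

```sh
# Hypothetical helper (not part of meryl or Winnowmap):
# convert "top P percent most frequent k-mers" into the fraction
# passed as distinct=... to `meryl print greater-than`.
top_percent_to_distinct() {
  # awk handles the floating-point arithmetic portably
  awk -v p="$1" 'BEGIN { printf "%.4f\n", 1 - p / 100 }'
}

# The README's example cutoff: top 0.02% of distinct k-mers
top_percent_to_distinct 0.02
```

If a different cutoff is wanted, the printed value would replace `0.9998` in the `meryl print` step. Whether another cutoff is advisable is not something the README states.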

* Mapping ONT or PacBio WGS reads
* Mapping ONT or PacBio-hifi WGS reads
```sh
meryl count k=15 output merylK15 ref.fa
meryl print greater-than distinct=0.9998 merylK15 > repetitiveK15.txt
meryl count k=15 output merylDB ref.fa
meryl print greater-than distinct=0.9998 merylDB > repetitive_k15.txt

winnowmap -W repetitiveK15.txt -t 36 -ax map-ont ref.fa ont.fq.gz > output.sam [OR]
winnowmap -W repetitiveK15.txt -t 36 -ax map-pb ref.fa hifi.fq.gz > output.sam
winnowmap -W repetitive_k15.txt -t 36 -ax map-ont ref.fa ont.fq.gz > output.sam [OR]
winnowmap -W repetitive_k15.txt -t 36 -ax map-pb ref.fa hifi.fq.gz > output.sam
```

* Mapping genome assemblies

```sh
meryl count k=19 output merylK19 asm1.fa
meryl print greater-than distinct=0.9998 merylK19 > repetitiveK19.txt
meryl count k=19 output merylDB asm1.fa
meryl print greater-than distinct=0.9998 merylDB > repetitive_k19.txt

winnowmap -W repetitiveK19.txt -t 36 -ax asm20 asm1.fa asm2.fa > output.sam
winnowmap -W repetitive_k19.txt -t 36 -ax asm20 asm1.fa asm2.fa > output.sam
```
Adjust the thread count `-t` based on your CPU. For the genome-to-genome use case, it may be useful to visualize the dot plot. This [perl script](https://github.com/marbl/MashMap/blob/master/scripts) can be used to generate a dot plot from [paf](https://github.com/lh3/miniasm/blob/master/PAF.md)-formatted output. In both usage cases, pre-computing repetitive k-mers using [meryl](https://github.com/marbl/meryl) is quite fast, e.g., it typically takes 2-3 minutes for the human genome reference.
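One way to pick the `-t` value is to query the number of online CPUs instead of hard-coding 36. The following is a minimal sketch, assuming `getconf _NPROCESSORS_ONLN` is available (it is on common Linux and macOS systems) and falling back to a single thread otherwise. It only prints the mapping command as a dry run, so neither `winnowmap` nor the input files need to exist.

```sh
# Detect the number of online CPUs; fall back to 1 if getconf cannot report it.
threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Dry run: print the ONT mapping command from this README with the detected thread count.
echo "winnowmap -W repetitive_k15.txt -t ${threads} -ax map-ont ref.fa ont.fq.gz > output.sam"
```

On a shared machine it may be preferable to use fewer threads than the detected count.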
