Step-by-step instructions to benchmark baseline minimap2 and OpenOmics minimap2 (mm2-fast) on AWS c5.12xlarge, c6i.16xlarge, and m6i.16xlarge instances
Download the reference genome:
wget https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz
Link to download HG002 ONT Guppy 3.6.0 dataset:
https://precision.fda.gov/challenges/10/view
File name: HG002_GM24385_1_2_3_Guppy_3.6.0_prom.fastq.gz
Link to download HG002 HiFi 14kb-15kb dataset:
https://precision.fda.gov/challenges/10/view
File name: HG002_35x_PacBio_14kb-15kb.fastq.gz
Download HG002 CLR dataset from s3://giab/data_indexes/AshkenazimTrio/sequence.index.AJtrio_PacBio_MtSinai_NIST_subreads_fasta_10082018
Download the hap2 assembly dataset:
wget https://zenodo.org/record/4393631/files/NA24385.HiFi.hifiasm-0.12.hap2.fa.gz
Clone and build baseline minimap2 (v2.22):
git clone https://github.com/lh3/minimap2.git -b v2.22
cd minimap2 && make
Run baseline minimap2:
./minimap2 -ax [preset] [ref-seq] [read-seq] -t 48 > minimap2output
Example command for ONT HG002 dataset:
./minimap2 -ax map-ont GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz HG002_ONT.fastq -t 48 > minimap2output
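Since the goal is to compare run times, it can help to record the elapsed time of each run in the same way for both tools. The following is a minimal sketch, not part of the original instructions: timed_run is a hypothetical helper name, and it reports whole seconds only, which is assumed to be precise enough for runs of this length.

```shell
# Hypothetical helper: run any command and report its wall-clock time.
# The elapsed time goes to stderr so that redirecting stdout to an
# output file (as in the commands above) still captures only the alignments.
timed_run() {
    timed_run_start=$(date +%s)
    timed_run_status=0
    "$@" || timed_run_status=$?
    timed_run_end=$(date +%s)
    echo "elapsed_seconds=$((timed_run_end - timed_run_start))" >&2
    return "$timed_run_status"
}
```

For example, prefixing the ONT command with the helper, as in "timed_run ./minimap2 -ax map-ont [ref-seq] [read-seq] -t 48 > minimap2output", would write the alignments to the output file and print the elapsed seconds to the terminal.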
Clone and build OpenOmics minimap2 (mm2-fast):
git clone --recursive https://github.com/lh3/minimap2.git -b fast-contrib-v2.22 mm2-fast-contrib
cd mm2-fast-contrib && make multi
Create the OpenOmics minimap2 index:
./build_rmi.sh path-to-ref-seq <preset flags>
<preset flags> are as follows:
ONT: map-ont
HiFi: map-hifi
CLR: map-pb
Assembly: asm5
Example: create the OpenOmics minimap2 index for ONT datasets using GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz:
./build_rmi.sh GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz map-ont
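When scripting runs over several datasets, the preset table above can be captured in a small lookup function. This is only a sketch: preset_for is a hypothetical name, and the dataset labels (ONT, HiFi, CLR, Assembly) are taken directly from the list above.

```shell
# Hypothetical helper: map a dataset type to its preset flag,
# following the list of preset flags given above.
preset_for() {
    case "$1" in
        ONT)      echo "map-ont" ;;
        HiFi)     echo "map-hifi" ;;
        CLR)      echo "map-pb" ;;
        Assembly) echo "asm5" ;;
        *)        echo "unknown dataset type: $1" >&2; return 1 ;;
    esac
}
```

The printed value can then be passed as the preset argument when building the index or running the aligner, for example with "$(preset_for ONT)" in place of map-ont.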
Run OpenOmics minimap2:
./mm2-fast -ax [preset] [ref-seq] [read-seq] -t [num_threads] > mm2-fastoutput
Example command to run the HG002 ONT dataset on c5.12xlarge:
./mm2-fast -ax map-ont GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz HG002_ONT.fastq -t 48 > mm2-fastoutput
Example command to run the HG002 ONT dataset on c6i.16xlarge or m6i.16xlarge:
./mm2-fast -ax map-ont GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz HG002_ONT.fastq -t 64 > mm2-fastoutput
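The thread counts in the two examples above (48 and 64) match the vCPU counts of the respective instance types, so one script can serve all three instances by reading the count at run time. The sketch below assumes a Linux instance where the nproc utility is available. mm2_fast_cmd is a hypothetical helper that only prints the command, so it can be inspected before being run.

```shell
# Hypothetical helper: print the mm2-fast command for a preset, reference,
# and read file. The fourth argument (thread count) is optional and
# defaults to the number of vCPUs reported by nproc.
mm2_fast_cmd() {
    mm2_preset=$1
    mm2_ref=$2
    mm2_reads=$3
    mm2_threads=${4:-$(nproc)}
    echo "./mm2-fast -ax $mm2_preset $mm2_ref $mm2_reads -t $mm2_threads"
}
```

Calling "mm2_fast_cmd map-ont [ref-seq] [read-seq]" without a thread count would be expected to print a command with -t 48 on c5.12xlarge and -t 64 on c6i.16xlarge or m6i.16xlarge, assuming all vCPUs are online.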