Arrow

This repository archives the pipelines and source code used in the LongBow manuscript.

Contents

The contents are organized into six main folders. Please feel free to click on any title to view the detailed README.md.

  1. SRA_status

    Assessment of the availability and metadata labeling of Oxford Nanopore sequencing data in the SRA database

  2. ONTsoftware_misuse_configs

    Benchmark of Clair3, Shasta, and Medaka with correct or incorrect configs/models

  3. LongBow_training

    Instructions for training the LongBow model using Nanopore raw data from various model organisms. Includes details on the training data and on how to perform a leave-one-out test to determine the best lag for autocorrelation analysis.

  4. LongBow_testing

    Instructions for testing LongBow on 66 independent groups of ONT data and human ONT SRA data.

  5. COGUK

    Report and reanalysis of COGUK SARS-CoV-2 data

  6. Others

    Other pipelines in the manuscript
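The LongBow_training entry above mentions choosing a lag for autocorrelation analysis. As a reminder of what a lag means in that context, here is a minimal sketch of the standard sample autocorrelation at a given lag. It is a generic textbook estimator and not the LongBow implementation; the function name `autocorrelation` and the choice of input sequence are assumptions made for illustration, since this README does not state which signal the manuscript analyzes.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of a numeric sequence at the given lag.

    Generic estimator for illustration only (hypothetical helper, not
    taken from the LongBow code base).
    """
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(values)")

    mean = sum(values) / n
    # Sum of squared deviations (the lag-0 term used for normalization).
    variance = sum((v - mean) ** 2 for v in values)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant sequence")

    # Sum of products of deviations that are `lag` positions apart.
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance
```

A leave-one-out comparison would evaluate a statistic like this for several candidate lags and keep the lag that performs best on the held-out data; the detailed procedure is described in the LongBow_training README.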

Requirements

OS requirements

The code was tested on Linux. The following releases were tested:

Linux: Red Hat Enterprise Linux 8
Linux: Ubuntu 22.04.1

Software requirements

Conda version

Most of the software listed below is installed through Conda environments. We tested with Conda versions 24.1.2 and 24.4.0, and we strongly recommend Conda version 24.1 or later.

To install Conda, follow the instructions in the Conda manual.
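To check an existing installation against the recommended Conda version, a small helper along the following lines may be useful. This is only a sketch: the names `parse_version` and `conda_is_recent_enough` are hypothetical, and it assumes the version text looks like the `conda 24.1.2` line printed by `conda --version`.

```python
import re

# Minimum Conda version recommended in this README (major, minor).
MIN_CONDA = (24, 1)


def parse_version(text):
    """Return the first dotted number in `text` as a tuple of ints.

    Example input: "conda 24.1.2".
    """
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def conda_is_recent_enough(version_output, minimum=MIN_CONDA):
    """True if the reported version is at least `minimum`."""
    # Compare only as many components as the minimum specifies.
    return parse_version(version_output)[: len(minimum)] >= minimum
```

Pass the function the text printed by `conda --version`, for example captured with `subprocess.run`.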

Programming language

The Python scripts provided in this repository require Python 3.7 or higher.
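The interpreter version can be checked from within Python before running any of the scripts. The following is a minimal sketch; `python_is_supported` is a hypothetical helper name and is not part of this repository.

```python
import sys

# Minimum Python version stated in this README.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """True if the (major, minor) version is at least `minimum`.

    With no argument, the running interpreter's version is checked.
    """
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= minimum
```

Calling `python_is_supported()` with no arguments checks the interpreter that is currently running.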

Software version list

| Software | Version |
| --- | --- |
| Artic | 1.2.4 |
| bcftools | 1.19 |
| Bioawk | 20110810 |
| Chopper | 0.7.0 |
| Clair3 | 1.0.4, 1.0.10 |
| Flye | 2.9.3-b1797 |
| hdf5 | 1.12.1 |
| Medaka | 1.11.3 |
| Minimap2 | 2.26-r1175, 2.28-r1209 |
| ont-fast5-api | 4.1.1 |
| pod5 | 0.2.4 |
| seqtk | 1.3-r106 |
| Shasta | 0.11.1 |
| yak | 0.1-r56 |
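When reproducing the pipelines, it can help to compare the version a tool reports against the list above. The sketch below encodes a few entries from the list and checks whether a reported version string contains one of the tested versions. The names `TESTED_VERSIONS` and `matches_tested_version` are hypothetical, only a subset of tools is shown, and how each tool prints its version is not covered here, so the reported text must be supplied by the caller.

```python
# A subset of the software list above; extend with the remaining entries as needed.
TESTED_VERSIONS = {
    "Clair3": ["1.0.4", "1.0.10"],
    "Flye": ["2.9.3-b1797"],
    "Medaka": ["1.11.3"],
    "Minimap2": ["2.26-r1175", "2.28-r1209"],
    "Shasta": ["0.11.1"],
}


def matches_tested_version(tool, reported):
    """True if any whitespace-separated token of `reported` equals a tested version.

    A leading "v" on a token (as in "v1.0.10") is ignored. Raises KeyError
    for tools that are not in TESTED_VERSIONS.
    """
    tested = TESTED_VERSIONS[tool]
    tokens = [token.lstrip("v") for token in reported.split()]
    return any(token in tested for token in tokens)
```

For example, `matches_tested_version("Minimap2", "2.28-r1209")` evaluates to `True`, while a version that is not in the list, such as `"2.24-r1122"`, evaluates to `False`.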