Commit

more comments addressed
andrewprzh committed May 21, 2024
1 parent c09226d commit 79c799b
Showing 5 changed files with 19 additions and 10 deletions.
6 changes: 5 additions & 1 deletion README.md
@@ -18,7 +18,11 @@ Besides, SPAdes package includes supplementary tools for efficient k-mer countin

- Complete user manual can be found [here](https://ablab.github.io/spades/). Information below is provided merely for your convenience and cannot be considered as the user guide.

- SPAdes is an assembler for second-generation sequencing data (Illumina or IonTorrent). PacBio and Nanopore reads are supported *only* as supplementary data. SPAdes can assemble genomes, metagenomes, transcriptomes, viral genomes etc.
- SPAdes assembler supports:
- Assembly of second-generation sequencing data (Illumina or IonTorrent);
- PacBio and Nanopore reads that are used as supplementary data only.

- SPAdes allows to assemble genomes, metagenomes, transcriptomes, viral genomes etc.

- Download SPAdes binaries for Linux or MacOS [here](https://github.com/ablab/spades/releases/latest/). You can also compile SPAdes from [source](https://github.com/ablab/spades/releases/latest/) (requires g++ 9.0+, cmake 3.16+, zlib and libbz2). SPAdes requires only Python 3.8+ to be installed.

2 changes: 1 addition & 1 deletion docs/datatypes.md
@@ -4,7 +4,7 @@

### Isolated and multi-cell datasets

When assembling conventional multi-cell and bacterial isolated datasets with decent coverage (say 50x or higher), we strongly recommend to use `--isolate` option.
When assembling multi-cell and bacterial isolated datasets with decent coverage (say 50x or higher), we strongly recommend to use `--isolate` option.

SPAdes is capable of detecting optimal k-mer sizes automatically. Thus, if the assembly went smoothly without any errors or warnings, there is nothing to worry about.
For example, for read length 100bp the default k values are 21, 33, 55; for 150bp reads SPAdes uses k-mer sizes 21, 33, 55, 77; and for 250bp reads six iterations are used by default: 21, 33, 55, 77, 99, 127.
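The default k-mer lists quoted in the context line above can be sketched as a small lookup. This is only an illustration: `default_kmers` is a hypothetical helper, not part of SPAdes, and the cut-offs between read lengths (150 and 250 as lower bounds) are an assumption, since the text names only three read lengths. The command is printed rather than executed; `-k`, `-1`, `-2` and `-o` are existing `spades.py` options, and the file names are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of SPAdes): print the default k-mer list
# that the documentation quotes for a given read length.
# Assumption: the lists for 150bp and 250bp reads apply from those lengths upward.
default_kmers() {
    local read_len=$1
    if [ "$read_len" -ge 250 ]; then
        echo "21,33,55,77,99,127"
    elif [ "$read_len" -ge 150 ]; then
        echo "21,33,55,77"
    else
        echo "21,33,55"
    fi
}

# Print (do not run) a command that passes the k values explicitly via -k.
echo "spades.py --isolate -k $(default_kmers 150) -1 left.fastq.gz -2 right.fastq.gz -o output_folder"
```

Since SPAdes picks k-mer sizes automatically, setting `-k` by hand like this should rarely be needed; the sketch only makes the documented defaults concrete.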
12 changes: 9 additions & 3 deletions docs/getting-started.md
@@ -1,6 +1,12 @@
# Quick start

SPAdes is an assembler for second-generation sequencing data (Illumina or IonTorrent). PacBio and Nanopore reads are supported *only* as supplementary data. SPAdes can assemble genomes, metagenomes, transcriptomes, viral genomes etc.

- SPAdes assembler supports:
- Assembly of second-generation sequencing data (Illumina or IonTorrent);
- PacBio and Nanopore reads that are used as supplementary data only.

- SPAdes allows to assemble genomes, metagenomes, transcriptomes, viral genomes etc.


1. Download SPAdes binaries for
[Linux](https://github.com/ablab/spades/releases/latest/) or [MacOS](https://github.com/ablab/spades/releases/latest/).
@@ -59,13 +65,13 @@ bin/spades.py --rnaviral -1 left.fastq.gz -2 right.fastq.gz -o output_folder

## Available assembly modes

- `--isolate` - converntional bacterial data;
- `--isolate` - isolate (standard) bacterial data;

- `--sc` - single-cell bacterial data;

- `--meta` - metagenome assembly;

- `--plasmid` / `--metaplasmid` - plasmid discovery in conventional bacterial / metagenomic data;
- `--plasmid` / `--metaplasmid` - plasmid discovery in standard bacterial / metagenomic data;

- `--metaviral` - viral assembly from metagenomic data;

5 changes: 2 additions & 3 deletions docs/input.md
@@ -1,6 +1,6 @@
# SPAdes basic input

SPAdes takes as input paired-end reads, mate-pairs and single (unpaired) reads in FASTA and FASTQ. For IonTorrent data SPAdes also supports unpaired reads in unmapped BAM format (like the one produced by Torrent Server). However, in order to run read error correction, reads should be in FASTQ or BAM format. Sanger, Oxford Nanopore and PacBio CLR reads can be provided in both formats since SPAdes does not run error correction for these types of data.
SPAdes takes as input paired-end reads, mate-pairs and single (unpaired) reads in FASTA and FASTQ (can be gzipped). For IonTorrent data SPAdes also supports unpaired reads in unmapped BAM format (like the one produced by Torrent Server). However, in order to run read error correction, reads should be in FASTQ or BAM format. Sanger, Oxford Nanopore and PacBio CLR reads can be provided in both formats since SPAdes does not run error correction for these types of data.

To run SPAdes you need at least one library of the following types:

@@ -14,10 +14,9 @@ SPAdes supports mate-pair only assembly. However, we recommend to use only high-

Notes:

- It is strongly suggested to provide multiple paired-end and mate-pair libraries according to their insert size (from smallest to longest).
- It is strongly recommended to provide multiple paired-end and mate-pair libraries according to their insert size (from smallest to longest).
- It is not recommended to run SPAdes on PacBio reads with low coverage (less than 5).
- We suggest not to run SPAdes on PacBio reads for large genomes.
- SPAdes accepts gzip-compressed files.

## Paired read libraries

4 changes: 2 additions & 2 deletions docs/installation.md
@@ -108,13 +108,13 @@ subset of SPAdes components. The components are:
- `pathracer`
- `spaligner`

By default only SPAdes and SPAdes tools are enabled (so
By default, only SPAdes and SPAdes tools are enabled (so
`-DSPADES_ENABLE_PROJECTS="spades;spades_tools"` is the default). Alternatively,
one can simply enable building everything via specifying `SPADES_ENABLE_PROJECTS="all"`.

## Verifying your installation

For testing purposes, SPAdes comes with a toy data set (reads that align to first 1000 bp of *E. coli*). To try SPAdes on this data set, run:
For testing purposes, SPAdes comes with a toy data set (reads that align to the first 1000 bp of [*E. coli*](https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000005845.2/)). To try SPAdes on this data set, run:

``` bash
<spades installation dir>/bin/spades.py --test