Refine docs, separate out binspreader
Showing 4 changed files with 195 additions and 130 deletions.
@@ -0,0 +1,79 @@
# Binning refinement using assembly graphs

BinSPreader is a tool that attempts to refine metagenome-assembled genomes
(MAGs) obtained from existing tools. BinSPreader exploits the assembly graph
topology and other connectivity information, such as paired-end and Hi-C reads,
to refine the existing binning, correct binning errors, propagate binning from
longer contigs to shorter contigs, and infer contigs belonging to multiple bins.

The tool requires an initial binning to refine, as well as an assembly graph as a
source of information for refining. Optionally, BinSPreader can be provided with
multiple Hi-C and/or paired-end libraries.

## Command line options

Required positional arguments:

- Assembly graph file in [GFA 1.0
  format](https://github.com/GFA-spec/GFA-spec/blob/master/GFA1.md), with
  scaffolds included as path lines. Alternatively, scaffold paths can be
  provided separately using the `--paths` option in the `.paths` format accepted by
  Bandage (see [Bandage
  wiki](https://github.com/rrwick/Bandage/wiki/Graph-paths) for details).
- Binning output from an existing tool (in `.tsv` format)
- Output directory
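
Because the graph is expected to carry scaffolds as path lines, a quick pre-flight check is to count the `P` records in the GFA file (in GFA 1.0, path lines start with `P`). The sketch below builds a tiny toy graph so that it is self-contained; the file contents and the scaffold name are made up for illustration and are not taken from any real assembly.

```bash
# Toy GFA 1.0 file (hypothetical contents): two segments, one link, one path line.
gfa="$(mktemp)"
printf 'H\tVN:Z:1.0\n'            >  "$gfa"
printf 'S\t1\tACGT\n'             >> "$gfa"
printf 'S\t2\tGGCA\n'             >> "$gfa"
printf 'L\t1\t+\t2\t+\t0M\n'      >> "$gfa"
printf 'P\tscaffold_1\t1+,2+\t*\n' >> "$gfa"

# Count records whose first tab-separated field is exactly "P".
n_paths="$(awk -F'\t' '$1 == "P" { n++ } END { print n + 0 }' "$gfa")"

if [ "$n_paths" -eq 0 ]; then
  echo "no path lines found; scaffold paths would have to be supplied with --paths"
else
  echo "found $n_paths path line(s)"
fi
```

On a real graph, replace the toy file with the path to the assembly graph; a count of zero suggests that scaffold paths need to be passed separately.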

### Synopsis
```bash
binspreader <graph (in GFA)> <binning (in .tsv)> <output directory> [OPTION...]
```

### Main options

`--paths`
Provide contig paths from a file, separately from the GFA

`--dataset`
Dataset in [YAML format](running.md#specifying-multiple-libraries-with-yaml-data-set-file) describing Hi-C and paired-end reads

`-t`
Number of threads to use (default: half of the available threads)

`-m`
Allow multiple bin assignment (default: false)

`-Smax|-Smle`
Simple maximum or maximum likelihood bin assignment strategy (default: maximum likelihood)

`-Rcorr|-Rprop`
Select correction or propagation mode (default: correction)

`--cami`
Use CAMI bioboxes binning format

`--zero-bin`
Emit a zero bin for unbinned sequences

`--tall-multi`
Use a tall table for multiple binning results

`--bin-dist`
Estimate pairwise bin distances (can be slow on large graphs!)

`-la`
Label correction regularization parameter for labeled data (default: 0.6)
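
As a rough illustration of how these options combine with the positional arguments from the synopsis, the sketch below assembles a command line for propagation mode with multiple bin assignment. All file and directory names are placeholders (assumptions, not files shipped with the tool), the thread count is arbitrary, and the command is only printed rather than executed.

```bash
# Placeholder inputs (hypothetical names; no spaces, so plain word splitting is safe).
graph="assembly_graph_with_scaffolds.gfa"
binning="initial_binning.tsv"
outdir="binspreader_out"

# Positional arguments first, then options, following the synopsis.
cmd="binspreader $graph $binning $outdir"
cmd="$cmd -t 8"              # thread count (arbitrary example value)
cmd="$cmd -Rprop"            # propagation instead of the default correction mode
cmd="$cmd -m --tall-multi"   # allow multiple bins per contig, tall output table

# Print the command for inspection before running it by hand.
echo "$cmd"
```

Printing first makes it easy to review the chosen mode and strategy before launching a run on a large graph.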

### Output
BinSPreader stores all output files in the output directory `<output_dir>` set by the user.

- `<output_dir>/binning.tsv` contains the refined binning in `.tsv` format
- `<output_dir>/bin_stats.tsv` contains various per-bin statistics
- `<output_dir>/bin_weights.tsv` contains soft bin weights per contig
- `<output_dir>/edge_weights.tsv` contains soft bin weights per edge
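
A common first look at the refined binning is the number of contigs assigned to each bin. The sketch below does this with `awk` on a toy stand-in for `<output_dir>/binning.tsv`. It assumes a two-column, tab-separated layout (contig name, then bin label); that layout is an assumption made for this example and should be checked against the actual file, especially when `--cami` or `--tall-multi` is used.

```bash
# Toy stand-in for <output_dir>/binning.tsv.
# ASSUMPTION: two tab-separated columns, contig name then bin label.
binning="$(mktemp)"
printf 'contig_1\tbin_A\n' >  "$binning"
printf 'contig_2\tbin_A\n' >> "$binning"
printf 'contig_3\tbin_B\n' >> "$binning"

# Count contigs per bin label; sort by label so the output order is stable.
counts="$(awk -F'\t' '{ n[$2]++ } END { for (b in n) print b "\t" n[b] }' "$binning" | sort)"
printf '%s\n' "$counts"
```

Comparing these counts between the initial and the refined binning gives a quick sense of how many contigs were moved or newly assigned.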

In addition, depending on the options used:

- `<output_dir>/bin_dist.tsv` contains the refined bin distance matrix (if `--bin-dist` was used)
- `<output_dir>/bin_label_1.fastq`, `<output_dir>/bin_label_2.fastq` contain the read set for the bin labeled `bin_label` (if `--reads` was used)
- `<output_dir>/pe_links.tsv` contains the list of paired-end links between assembly graph edges, with weights (if `--debug` was used)
- `<output_dir>/graph_links.tsv` contains the list of graph links between assembly graph edges, with weights (if `--debug` was used)