Ray v1.7.1 rc1
NAME
Ray - assemble genomes in parallel using the message-passing interface
SYNOPSIS
mpiexec -np NUMBER_OF_RANKS Ray -k KMERLENGTH -p l1_1.fastq l1_2.fastq -p l2_1.fastq l2_2.fastq -o test
DESCRIPTION:
-help
Displays this help page.
-version
Displays Ray version and compilation options.
K-mer length
-k kmerLength
Selects the length of k-mers. The default value is 21.
It must be odd because reverse-complement vertices are stored together.
The maximum length is defined at compilation by MAXKMERLENGTH.
Larger k-mers use more memory.
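The odd-length requirement can be illustrated with a short sketch. With an even k, some k-mers equal their own reverse complement (ACGT is one), whereas with an odd k the middle base would have to be its own complement, which is impossible. The helper names and the max_kmer_length parameter below are hypothetical and not part of Ray; the real limit is whatever MAXKMERLENGTH was set to at compilation, and treating it as an inclusive bound is an assumption.

```python
# Illustrative sketch only; these helpers are not part of Ray.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(kmer):
    """Return the reverse complement of a DNA k-mer."""
    return "".join(COMPLEMENT[base] for base in reversed(kmer))


def check_kmer_length(k, max_kmer_length):
    """Raise ValueError if k is not a usable k-mer length.

    max_kmer_length stands in for the MAXKMERLENGTH value chosen at
    compilation; it is assumed here to be an inclusive upper bound.
    """
    if k < 1:
        raise ValueError("k must be positive")
    if k % 2 == 0:
        raise ValueError("k must be odd")
    if k > max_kmer_length:
        raise ValueError("k exceeds the compiled maximum")
    return k


# An even-length k-mer can be its own reverse complement.
print(reverse_complement("ACGT") == "ACGT")
# An odd-length k-mer never is: the middle base cannot pair with itself.
print(reverse_complement("ACG") == "ACG")
```

Because a k-mer of odd length always differs from its reverse complement, the two strands can be stored as one unambiguous pair.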
Inputs
-p leftSequenceFile rightSequenceFile [averageOuterDistance standardDeviation]
Provides two files containing paired-end reads.
averageOuterDistance and standardDeviation are automatically computed if not provided.
-i interleavedSequenceFile [averageOuterDistance standardDeviation]
Provides one file containing interleaved paired-end reads.
averageOuterDistance and standardDeviation are automatically computed if not provided.
-s sequenceFile
Provides a file containing single-end reads.
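As a rough sketch of how the input options combine, the following hypothetical helper (build_ray_command is not part of Ray) assembles an argument list in the shape of the SYNOPSIS. It uses only the options shown on this page and leaves out the optional averageOuterDistance and standardDeviation values, so that Ray computes them automatically.

```python
# Illustrative sketch only; build_ray_command is a hypothetical helper.
def build_ray_command(ranks, kmer_length, paired=(), interleaved=(),
                      single=(), output_directory=None):
    """Assemble an argument list that follows the SYNOPSIS."""
    argv = ["mpiexec", "-np", str(ranks), "Ray", "-k", str(kmer_length)]
    for left, right in paired:
        argv += ["-p", left, right]      # two files of paired-end reads
    for path in interleaved:
        argv += ["-i", path]             # one interleaved paired-end file
    for path in single:
        argv += ["-s", path]             # one single-end file
    if output_directory is not None:
        argv += ["-o", output_directory]
    return argv


command = build_ray_command(
    ranks=4,
    kmer_length=21,
    paired=[("l1_1.fastq", "l1_2.fastq"), ("l2_1.fastq", "l2_2.fastq")],
    output_directory="test",
)
print(" ".join(command))
```

On a machine where an MPI launcher and Ray are installed, the resulting list could be passed to subprocess.run.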
Outputs
-o outputDirectory
Specifies the directory for output files. The default is RayOutput.
-amos
Writes the AMOS file called RayOutput/AMOS.afg
An AMOS file contains read positions on contigs.
It can be opened with software that provides a graphical user interface.
-write-kmers
Writes k-mer graph to RayOutput/kmers.txt
The resulting file is not utilised by Ray.
The resulting file is very large.
-write-seeds
Writes seed DNA sequences to RayOutput/Rank<rank>.RaySeeds.fasta
-write-extensions
Writes extension DNA sequences to RayOutput/Rank<rank>.RayExtensions.fasta
-write-contig-paths
Writes contig paths with coverage values
to RayOutput/Rank<rank>.RayContigPaths.txt
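The three per-rank options above each write one file per rank. The sketch below lists the file names that would be expected for a run, under two assumptions that this page does not state: ranks are numbered from 0 to NUMBER_OF_RANKS - 1 (the usual MPI convention), and <rank> is replaced by that number in decimal. The helper per_rank_files is hypothetical.

```python
# Illustrative sketch only; per_rank_files is a hypothetical helper.
PER_RANK_SUFFIXES = {
    "-write-seeds": "RaySeeds.fasta",
    "-write-extensions": "RayExtensions.fasta",
    "-write-contig-paths": "RayContigPaths.txt",
}


def per_rank_files(option, ranks, output_directory="RayOutput"):
    """List the per-rank files expected for one of the -write-* options.

    Assumes ranks are numbered 0..ranks-1 and appear in the file name
    as plain decimal integers.
    """
    suffix = PER_RANK_SUFFIXES[option]
    return [f"{output_directory}/Rank{rank}.{suffix}" for rank in range(ranks)]


for path in per_rank_files("-write-seeds", 2):
    print(path)
```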
Memory usage
-show-memory-usage
Shows memory usage. Data is read from /proc on GNU/Linux.
Requires a build in which the __linux__ preprocessor macro is defined.
-show-memory-allocations
Shows memory allocation events
Algorithm verbosity
-show-extension-choice
Shows the choice made, along with the other available choices, during the extension.
-show-ending-context
Shows the ending context of each extension.
Shows the children of the vertex where extension was too difficult.
-show-distance-summary
Shows summary of outer distances used for an extension path.
Assembly options (defaults work well)
-color-space
Runs in color-space
Needs csfasta files. Activated automatically if csfasta files are provided.
-minimumCoverage minimumCoverage
Manually sets the minimum coverage.
If not provided, it is computed by Ray automatically.
-peakCoverage peakCoverage
Manually sets the peak coverage.
If not provided, it is computed by Ray automatically.
-repeatCoverage repeatCoverage
Manually sets the repeat coverage.
If not provided, it is computed by Ray automatically.
Checkpointing
-write-checkpoints
Writes checkpoint files.
-read-checkpoints
Reads checkpoint files.
-read-write-checkpoints
Reads and writes checkpoint files.
Hardware testing
-test-network-only
Tests the network and returns. This option enables -write-network-test-raw-data.
-write-network-test-raw-data
Writes one additional file per rank detailing the network test.
Debugging
-run-profiler
Runs the profiler as the code runs.
Running the profiler increases running times.
-show-communication-events
Shows all messages sent and received.
-show-read-placement
Shows read placement in the graph during the extension.
-debug-bubbles
Debugs bubble code.
Bubbles can be due to heterozygous sites, sequencing errors, or other (unknown) events.
-debug-seeds
Debugs seed code.
Seeds are paths in the graph that are likely unique.
-debug-fusions
Debugs fusion code.
-debug-scaffolder
Debugs the scaffolder.
FILES
Input files
Note: the file format is determined from the file extension.
.fasta
.fasta.gz (needs HAVE_LIBZ=y at compilation)
.fasta.bz2 (needs HAVE_LIBBZ2=y at compilation)
.fastq
.fastq.gz (needs HAVE_LIBZ=y at compilation)
.fastq.bz2 (needs HAVE_LIBBZ2=y at compilation)
.sff (paired reads must be extracted manually)
.csfasta (color-space reads)
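Since the format is chosen from the file extension, a lookup along the following lines can check input names before a run is launched. This is only a sketch that mirrors the list above; the table and the classify_input helper are hypothetical and do not reflect how Ray itself performs the detection.

```python
# Illustrative sketch only; mirrors the extension list above and is
# not Ray's own detection code.
SUPPORTED_EXTENSIONS = {
    ".fasta": None,
    ".fasta.gz": "HAVE_LIBZ=y",
    ".fasta.bz2": "HAVE_LIBBZ2=y",
    ".fastq": None,
    ".fastq.gz": "HAVE_LIBZ=y",
    ".fastq.bz2": "HAVE_LIBBZ2=y",
    ".sff": None,
    ".csfasta": None,
}


def classify_input(path):
    """Return (extension, compilation option needed or None) for a path."""
    for extension, option in SUPPORTED_EXTENSIONS.items():
        if path.endswith(extension):
            return extension, option
    raise ValueError("unsupported file extension: " + path)


print(classify_input("l1_1.fastq"))
print(classify_input("reads.fasta.gz"))
```

A name such as reads.txt raises ValueError, which flags the problem before any ranks are started.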
Output files
Scaffolds
RayOutput/Scaffolds.fasta
The scaffold sequences in FASTA format
RayOutput/ScaffoldComponents.txt
The components of each scaffold
RayOutput/ScaffoldLengths.txt
The length of each scaffold
RayOutput/ScaffoldLinks.txt
Scaffold links
Contigs
RayOutput/Contigs.fasta
Contiguous sequences in FASTA format
RayOutput/ContigLengths.txt
The lengths of contiguous sequences
Summary
RayOutput/OutputNumbers.txt
Overall numbers for the assembly
de Bruijn graph
RayOutput/CoverageDistribution.txt
The distribution of coverage values
RayOutput/CoverageDistributionAnalysis.txt
Analysis of the coverage distribution
RayOutput/degreeDistribution.txt
Distribution of ingoing and outgoing degrees
RayOutput/kmers.txt
k-mer graph, required option: -write-kmers
The resulting file is not utilised by Ray.
The resulting file is very large.
Assembly steps
RayOutput/SeedLengthDistribution.txt
Distribution of seed lengths
RayOutput/Rank<rank>.RaySeeds.fasta
Seed DNA sequences, required option: -write-seeds
RayOutput/Rank<rank>.RayExtensions.fasta
Extension DNA sequences, required option: -write-extensions
RayOutput/Rank<rank>.RayContigPaths.txt
Contig paths with coverage values, required option: -write-contig-paths
Paired reads
RayOutput/LibraryStatistics.txt
Estimation of outer distances for paired reads
RayOutput/Library<LibraryNumber>.txt
Frequencies for observed outer distances (insert size + read lengths)
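To make the relation between observed outer distances and the averageOuterDistance and standardDeviation values concrete, here is a minimal sketch that summarises a frequency table. The on-disk layout of the library files is not described on this page, so the function takes an in-memory {outer distance: count} mapping instead of parsing a file. The helper name is hypothetical, the use of the population standard deviation is an assumption, and this is not necessarily how Ray computes its own estimates.

```python
import math


# Illustrative sketch only; not Ray's estimation procedure.
def summarise_outer_distances(frequencies):
    """Return (mean, standard deviation) of a {distance: count} table."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("no observations")
    mean = sum(d * c for d, c in frequencies.items()) / total
    # Population variance, weighted by how often each distance was seen.
    variance = sum(c * (d - mean) ** 2 for d, c in frequencies.items()) / total
    return mean, math.sqrt(variance)


# Made-up counts, for illustration only.
observed = {198: 1, 200: 2, 202: 1}
mean, deviation = summarise_outer_distances(observed)
print(mean, deviation)
```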
Partition
RayOutput/NumberOfSequences.txt
Number of reads in each file
RayOutput/SequencePartition.txt
Sequence partition
Ray software
RayOutput/RayVersion.txt
The version of Ray
RayOutput/RayCommand.txt
The exact command that was provided
AMOS
RayOutput/AMOS.afg
Assembly representation in AMOS format, required option: -amos
Communication
RayOutput/MessagePassingInterface.txt
Number of messages sent
RayOutput/NetworkTest.txt
Latencies in microseconds
RayOutput/RankNetworkTestData.txt
Network test raw data
DOCUMENTATION
This help page (always up-to-date)
Manual (Portable Document Format): InstructionManual.pdf
Mailing list archives: http://sourceforge.net/mailarchive/forum.php?forum_name=denovoassembler-users
AUTHOR
Written by Sébastien Boisvert.
REPORTING BUGS
Report bugs to denovoassembler-users@lists.sourceforge.net
Home page: http://denovoassembler.sourceforge.net/
COPYRIGHT
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You have received a copy of the GNU General Public License
along with this program (see LICENSE).
Ray 1.7-devel