diff --git a/.info b/.info
index b024045..66d1d9a 100644
--- a/.info
+++ b/.info
@@ -1,2 +1,2 @@
 CURRENTVERSION=0.99
-NEWVERSION=https://github.com/V-Z/sondovac/releases/download/v0.95-beta/sondovac-0.99-rc.zip
+NEWVERSION=https://github.com/V-Z/sondovac/releases/download/v0.99-rc/sondovac-0.99-rc.zip
diff --git a/CHANGELOG b/CHANGELOG
index a1ee86c..20d9797 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,12 +4,12 @@ Sondovač is a script to create orthologous low-copy nuclear probes from
 transcriptome and genome skim data for target enrichment.


-Version 0.99 release candidate released 2015-12-07
+Version 0.99 release candidate released 2015-12-08
 ================================================================================

 * Fixed error with some input files for part B.
 * Finished colorization of command-line user interface.
-* Added possibility minimal exon length of the loci.
+* Added possibility to set minimal exon length of the loci.
 * Various fixes and UI enhancements.
 * Improved documentation.

diff --git a/INSTALL b/INSTALL
index 56d93a1..e896fe1 100644
--- a/INSTALL
+++ b/INSTALL
@@ -28,21 +28,21 @@ Sondovač will check if those programs are installed - available in the PATH
 you have those packages installed (in current versions), ensure their binaries
 are in PATH. This should not be a problem for basic tools available in any
 UNIX-based operating system, as basic installation usually contains all needed
-tools. If you lack some of the required tools, the script will notify
-you, and you will have to install them manually. If this will be needed, check
-the documentation for your operating system.
+tools. If you lack some of the required tools, the script will notify you, and
+you will have to install them manually. If this will be needed, check the
+documentation for your operating system.

 If required scientific programs are not installed, Sondovač will offer you
 installation.
You can use precompiled binaries available together with the
-script (this is the recommended option) or (sometimes) from the web. This is
-the recommended way. In case you would like to compile required software
+script (this is the recommended option) or (sometimes) from the web. This
+is the recommended way. In case you would like to compile required software
 yourself, the script will guide you through this process. Anyway, this is
 recommended only for advanced users, as compilation might sometimes be very
 tricky. Users of Mac OS X can install those applications also using Homebrew
 (see http://brew.sh/). For compilation you need Apache Ant, GNU G++, GNU GCC,
 GIT, Java/OpenJDK, libpng developmental files, and zlib developmental files.
-Ensure that you have those tools available - they should be readily available for
-any UNIX-based operating system.
+Ensure that you have those tools available - they should be readily available
+for any UNIX-based operating system.

 sondovac_part_a.sh requires (and will install) the following software packages:
 * BLAT
@@ -56,25 +56,27 @@ sondovac_part_b.sh requires (and will install) the following software packages:
 * CD-HIT
 * BLAT

-Geneious is required for step 7 of the pipeline. See below, README and PDF manual for details.
+Geneious is required for step 7 of the pipeline. See below, README and PDF
+manual for details.

 The following UNIX tools are required to run Sondovač. They are usually readily
 available in UNIX systems (but see note for Mac OS X below), so there is usually
 no need to install them manually. The tools are awk, bc, bunzip2, cat,
-cp, curl or wget, cut, dirname, echo, egrep, cd, g++, gcc, grep, gunzip, join, less,
-lsb_release or python (for Linux), make, mkdir, perl, pkg-config, pwd, sed, sort, tar, tr, uname,
-uniq, unzip, wc.
+cp, curl or wget, cut, dirname, echo, egrep, cd, g++, gcc, grep, gunzip, join,
+less, lsb_release or python (for Linux), make, mkdir, perl, pkg-config, pwd,
+sed, sort, tar, tr, uname, uniq, unzip, wc.

-For Mac OS X users, Homebrew (http://brew.sh/) will be installed by the script, and it will
-install (new software or newer versions) Apache Ant, BASH (the shell interpreter), GNU AWK, GNU
-coreutils, GNU GCC, git, GNU grep, GNU make, pkg-config, GNU sed, and wget. Mac
-OS X is missing some tools and for others (typically sed, grep or awk) contains
-too old BSD versions. The script will guide the user through the process, and if the
-user would wish, it is possible safely and easily remove these tools afterwards.
+For Mac OS X users, Homebrew (http://brew.sh/) will be installed by the script,
+and it will install (new software or newer versions) Apache Ant, BASH (the
+shell interpreter), GNU AWK, GNU coreutils, GNU GCC, git, GNU grep, GNU make,
+pkg-config, GNU sed, and wget. Mac OS X is missing some tools and for others
+(typically sed, grep or awk) contains too old BSD versions. The script will
+guide the user through the process, and if the user would wish, it is possible
+safely and easily remove these tools afterwards.

-See the PDF manual for details about tools required by Sondovač and their manual
-installation. For most users it should be sufficient to be guided by the script
-to install needed tools automatically.
+See the PDF manual for details about tools required by Sondovač and their
+manual installation. For most users it should be sufficient to be guided
+by the script to install needed tools automatically.


 First launch of Sondovač
@@ -98,7 +100,8 @@ to see basic usage instructions. See README and PDF manual for more information.
 Examples (see README and PDF manual for explanation of command line parameters)
 --------------------------------------------------------------------------------

-The basic and most simple usage (running in interactive mode, see README and PDF manual):
+The basic and most simple usage (running in interactive mode, see README and
+PDF manual):

 ./sondovac_part_a.sh -i

@@ -119,9 +122,9 @@ Modify parameter "-a", otherwise run interactively:

 ./sondovac_part_a.sh -i -a 300

-Running in non-interactive mode (parameter "-n", see README) - in such case the user
-must specify all required input files (parameters "-f", "-c", "-m", "-t" and
-"-q"). Moreover, parameter "-y" is modified:
+Running in non-interactive mode (parameter "-n", see README) - in such case the
+user must specify all required input files (parameters "-f", "-c", "-m", "-t"
+and "-q"). Moreover, parameter "-y" is modified:

 ./sondovac_part_a.sh -n -f input.fa -c referencecp.fasta -m referencemt.fsa \
 -t reads1.fastq -q reads2.fastq -y 90
@@ -131,9 +134,9 @@ need to be specified explicitly:

 ./sondovac_part_a.sh -s 950

-We recommend to launch Sondovač at least for the first time in an interactive mode,
-so that the script will verify all requirements and install missing tools when
-needed. We then recommend to use non-interactive mode for routine usage.
+We recommend to launch Sondovač at least for the first time in an interactive
+mode, so that the script will verify all requirements and install missing tools
+when needed. We then recommend to use non-interactive mode for routine usage.


 Help for usage of terminal
@@ -156,22 +159,25 @@ first. You can try some of those:
 Geneious
 ================================================================================

-Sondovač workflow is divided into three parts (see README and PDF manual for details):
+Sondovač workflow is divided into three parts (see README and PDF manual for
+details):
 1) Raw input data are analyzed by sondovac_part_a.sh.
-2) Sequences obtained in part A are assembled by Geneious in a separate step by the user.
+2) Sequences obtained in part A are assembled by Geneious in a separate step by
+ the user.
 3) Final probes are produced by sondovac_part_b.sh.

-For part (2) of the script the user must have Geneious. We plan to replace it by some free
-open-source command line tool in some future release of Sondovač. Visit
-http://www.geneious.com/ for download, purchase, installation and usage of
-Geneious.
+For part (2) of the script the user must have Geneious. We plan to replace it
+by some free open-source command line tool in some future release of Sondovač.
+Visit http://www.geneious.com/ for download, purchase, installation and usage
+of Geneious.


 Software links (including required versions)
 ================================================================================

-"X" denotes any subversion of a particular lineage, and "v. >" denotes any version
-higher then noted. Generally, any current version should usually be fine.
+"X" denotes any subversion of a particular lineage, and "v. >" denotes any
+version higher then noted. Generally, any current version should usually be
+fine.

 * Apache Ant 1.9.X - https://ant.apache.org/
 * bam2fastq 1.1.0 - http://gsl.hudsonalpha.org/information/software/bam2fastq
@@ -198,11 +204,12 @@ Vocabulary

 * Binary - An application in a form understandable by the computer, but usually
  not transferable among operating systems and/or hardware platforms. Binaries
- in Windows usually have the extension *.exe, in UNIX there is usually no extension.
+ in Windows usually have the extension *.exe, in UNIX there is usually no
+ extension.
 * BASH - "The command line" - fully featured programming scripting language
- accessible through the terminal of any UNIX-based operating system (any Linux,
- Mac OS X, Solaris, any variant of BSD and more). BASH scripts usually have the
- extension *.sh.
+ accessible through the terminal of any UNIX-based operating system (any
+ Linux, Mac OS X, Solaris, any variant of BSD and more). BASH scripts usually
+ have the extension *.sh.
 * BSD - Group of popular UNIX-based operating systems. See
  https://en.wikipedia.org/wiki/Berkeley_Software_Distribution.
 * C - Popular programming language. Source code must be compiled for each
@@ -222,10 +229,10 @@ Vocabulary
  its free community testing platform. See https://getfedora.org/.
 * GNU - Major project providing free software widely used in many operating
  systems, see https://gnu.org/.
-* Homebrew - Tool primarily for Mac OS X (although there is also a Linux version
- available) replacing the practically missing package manager for this system. Can
- be used to install plenty of various applications as well as updating tools
- already available in Mac OS X. See http://brew.sh/.
+* Homebrew - Tool primarily for Mac OS X (although there is also a Linux
+ version available) replacing the practically missing package manager for this
+ system. Can be used to install plenty of various applications as well as
+ updating tools already available in Mac OS X. See http://brew.sh/.
 * Java - Very popular programming language. It requires Java runtime environment
  to be installed, but the applications are very well transferable among
  operating systems. See https://www.java.com/.
@@ -243,9 +250,9 @@ Vocabulary
  http://linuxmint.com/.
 * Mac OS X - Popular operating system produced by Apple. The system kernel is
  based on UNIX, see https://www.apple.com/osx/.
-* Open-source - Generally, the source code of an application is available together
- with the application and can, under certain conditions, be defined in license
- modified, redistributed etc. See
+* Open-source - Generally, the source code of an application is available
+ together with the application and can, under certain conditions, be defined
+ in license modified, redistributed etc.
See
  https://en.wikipedia.org/wiki/Free_and_open-source_software.
 * openSUSE - Popular Linux distribution, see https://www.opensuse.org/.
 * Operating system - Basic system running on your computer - typically MS
@@ -262,7 +269,8 @@ Vocabulary
  used only in particular cases. In case of shell applications, parameters are
  usually given such as "application -X", "application -parameter",
  "application -Param SomeValue" and so on. See manual for particular
- application (e.g. "man application"), in case of Sondovač see README and PDF manual.
+ application (e.g. "man application"), in case of Sondovač see README and PDF
+ manual.
 * PATH - Directories in the computer where the system looks for installed
  software (in a UNIX-based system you can view it by the command "echo $PATH").
  If you need to modify it manually, see the documentation for your
@@ -288,9 +296,9 @@ Vocabulary
  http://distrowatch.com/table.php?distribution=solaris.
 * Source code - Human-readable code written in any text editor used to develop
  any application. Applications written in interpreted languages (BASH, Perl,
- Python, ...) can be distributed just in form of a source code (nothing else is
- required). Other programming languages (C, C++, ...) require compilation to
- get fully functional application.
+ Python, ...) can be distributed just in form of a source code (nothing else
+ is required). Other programming languages (C, C++, ...) require compilation
+ to get fully functional application.
 * SUSE Linux Enterprise (SLE) - Large Linux company providing mainly solutions
  for big companies. See https://www.suse.com/.
 * Terminal - See "Shell".
@@ -305,5 +313,5 @@ Vocabulary
  problems, no one will probably help you. Moreover, using old versions of
  software can be a security risk because of security issues fixed in newer
  versions.
-* Variable - Named value storing various information, one of the basic part of any
- programming language, application, operating system.
+* Variable - Named value storing various information, one of the basic part of
+ any programming language, application, operating system.
diff --git a/LICENSE b/LICENSE
index 48b4bc7..78e1e66 100644
--- a/LICENSE
+++ b/LICENSE
@@ -3,12 +3,12 @@ Sondovač 0.99 RC Licenses
 Sondovač is a script to create orthologous low-copy nuclear probes from
 transcriptome and genome skim data for target enrichment.

-The set of BASH scripts Sondovač is licensed under GNU General Public License
-version 3. List of licenses of included software is in following table (see
-full texts below). License of BLAT does not allow redistribution, so that this
-software is not included and the software is downloaded on the fly. Script is
-also using software included in GNU core utilities (basic tools available in any
-UNIX-based system), see https://www.gnu.org/software/coreutils/ for details.
+The set of BASH scripts Sondovač is licensed under GNU General Public License
+version 3. List of licenses of included software is in following table (see
+full texts below). License of BLAT does not allow redistribution, so that this
+software is not included and the software is downloaded on the fly. Script is
+also using software included in GNU core utilities (basic tools available in
+any UNIX-based system), see https://www.gnu.org/software/coreutils/ for details.

 Software License License details
 --------------------------------------------------------------------------------
diff --git a/README b/README
index 93a9476..bef2a36 100644
--- a/README
+++ b/README
@@ -1,8 +1,9 @@
 Sondovač 0.99 RC Basic help

-Sondovač (English pronunciation is "Sondovach") is a script to create
-orthologous low-copy nuclear probes from transcriptome and genome skim data for
-target enrichment.
+Sondovač (English pronunciation is "Sondovach". The word is a Czech neologism
+meaning something like "The Prober" or "The Probe Maker".)
is a script to
+create orthologous low-copy nuclear probes from transcriptome and genome skim
+data for target enrichment.


 Script summary
@@ -173,8 +174,8 @@ process its output manually by Geneious according to the instructions given
 below. The output of Geneious is then processed by sondovac_part_b.sh, which
 produces the final probe set. Geneious was tested with versions 6, 7 and 8.

-Import the output file of part A of the script (sondovac_part_a.sh): go to menu
-File | Import | From File... This file is named as:
+Import the output file of part A of the script (sondovac_part_a.sh):
+go to menu File | Import | From File... This file is named as:
 *_blat_unique_transcripts_versus_genome_skim_data-no_missing_fin.fsa

 Select the file and go to menu Tools | Align / Assemble | De Novo Assemble.
@@ -274,8 +275,7 @@ Input files:
 Optional parameters:
  See chapter "Pipeline" for steps referred here. If those parameters are not
  provided, the default values are used, and it is
-not
- possible to change them any time later (not even in interactive mode).
+ not possible to change them any time later (not even in interactive mode).

  -a ### Maximum overlap length expected in approximately ≥90% of read pairs
  (parameter -M of FLASH, see its manual for details). FLASH can not
@@ -432,7 +432,8 @@ Script sondovac_part_b.sh creates the following files:
 3) *_target_enrichment_probe_sequences.fasta - Probes in FASTA.
 4) *_possible_cp_dna_genes_in_probe_set.pslx - In case of any BLAT hits, the
  user might needs to manually remove these plastid probe sequences from
- *_target_enrichment_probe_sequences.fasta (the previous script outfile); the remaining ones are the final probe sequences in FASTA.
+ *_target_enrichment_probe_sequences.fasta (the previous script outfile);
+ the remaining ones are the final probe sequences in FASTA.

 An asterisk (*) denotes the beginning of the output files' names specified by
 the user with parameter "-o".
If user does not select a custom name, default
diff --git a/manual/sondovac_manual.pdf b/manual/sondovac_manual.pdf
index c363546..7b90bbb 100644
Binary files a/manual/sondovac_manual.pdf and b/manual/sondovac_manual.pdf differ
diff --git a/manual/sondovac_manual.tex b/manual/sondovac_manual.tex
index c25b8c4..dc77abb 100644
--- a/manual/sondovac_manual.tex
+++ b/manual/sondovac_manual.tex
@@ -198,7 +198,7 @@ \subsubsection{openSUSE and SUSE Linux Enterprise (SLE)}

 \subsubsection{Debian, Ubuntu, Linux Mint and derivatives}

-Debian (\href{https://www.debian.org/}{https://www.debian.org/}), Linux Mint (\href{http://linuxmint.com/}{http://linuxmint.com/}), Ubuntu (\href{http://www.ubuntu.com/}{http://www\\.ubuntu.com/}) and all derived distributions\footnote{See \href{http://distrowatch.com/search.php?basedon=Debian}{http://distrowatch.com/search.php?basedon=Debian} and \href{http://distrowatch.com/search.php?basedon=Ubuntu}{http://distrowatch.com/search.php?basedon=\\Ubuntu} for complete lists.} like Kubuntu (\href{http://www.kubuntu.com/}{http://www.kubuntu.com/}) use for package management commands \texttt{apt-get} (basic) and \texttt{aptitude} (front-end for \texttt{apt-get}, recommended, not available by default in every DEB distribution). There are more tools available\footnote{See \href{https://wiki.debian.org/PackageManagement}{https://wiki.debian.org/PackageManagement} for list of tools and \href{https://www.debian.org/doc/manuals/debian-reference/ch02.en.html}{https://www.debian.org/doc/manuals/\\debian-reference/ch02.en.html} for exhaustive documentation. A shorter introduction is available at \href{https://help.ubuntu.com/community/AptGet/Howto}{https://help.ubu\\ntu.com/community/AptGet/Howto} and \href{http://ubuntuguide.org/wiki/Ubuntu_Trusty_Packages_and_Repositories}{http://ubuntuguide.org/wiki/Ubuntu$\_$Trusty$\_$Packages$\_$and$\_$Reposit\\ories}.
Ubuntu-specific information at \href{https://help.ubuntu.com/stable/ubuntu-help/addremove.html}{https://help.ubuntu.com/stable/ubuntu-help/addremove.html}.}, we will describe only the basic usage needed for our purpose. The script will check if all required software packages are installed, and if not, it will install them. You can do it also manually:
+Debian (\href{https://www.debian.org/}{https://www.debian.org/}), Linux Mint (\href{http://linuxmint.com/}{http://linuxmint.com/}), Ubuntu (\href{http://www.ubuntu.com/}{http://www\\.ubuntu.com/}) and all derived distributions\footnote{See \href{http://distrowatch.com/search.php?basedon=Debian}{http://distrowatch.com/search.php?basedon=Debian} and \href{http://distrowatch.com/search.php?basedon=Ubuntu}{http://distrowatch.com/search.php?basedon=\\Ubuntu} for complete lists.} like Kubuntu (\href{http://www.kubuntu.com/}{http://www.kubuntu.com/}) use for package management commands \texttt{apt-get} (basic) and \texttt{aptitude} (front-end for \texttt{apt-get}, recommended, not available by default in every DEB based distribution). There are more tools available\footnote{See \href{https://wiki.debian.org/PackageManagement}{https://wiki.debian.org/PackageManagement} for list of tools and \href{https://www.debian.org/doc/manuals/debian-reference/ch02.en.html}{https://www.debian.org/doc/manuals/\\debian-reference/ch02.en.html} for exhaustive documentation. A shorter introduction is available at \href{https://help.ubuntu.com/community/AptGet/Howto}{https://help.\\ubuntu.com/community/AptGet/Howto} and \href{http://ubuntuguide.org/wiki/Ubuntu_Trusty_Packages_and_Repositories}{http://ubuntuguide.org/wiki/Ubuntu$\_$Trusty$\_$Packages$\_$and$\_$Re\\positories}. Ubuntu-specific information at \href{https://help.ubuntu.com/stable/ubuntu-help/addremove.html}{https://help.ubuntu.com/stable/ubuntu-help/addremove.html}.}, we will describe only the basic usage needed for our purpose.
The script will check if all required software packages are installed, and if not, it will install them. You can do it also manually:

 \begin{bashcode}
  # Verify installation of basic tools (they are installed in 99.9%):
@@ -501,8 +501,8 @@ \subsection{The PATH variable}
  /home/$USER/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/sbin:/usr/sbin
  # Adding new directory to $PATH
  export PATH=$PATH:/some/new/directory
- # Do not do it in the following way - it would overwrite $PATH, and there would be only
- # the new directory (not the original content)!
+ # Do not do it in the following way - it would overwrite $PATH, and there would
+ # be only the new directory (not the original content)!
  export PATH=/some/new/directory # Wrong! Old $PATH is missing and will be lost!
  # Removing possible duplicate entries in $PATH with regular expressions and awk
  export PATH="$(echo "$PATH" | awk 'BEGIN{RS=":";}{sub(sprintf("%c$",10),""); \
@@ -645,7 +645,7 @@ \subsubsection{Optional parameters}
 \item DEFAULT: 85 (highly recommended)
 \item OPTIONS: Integer ranging from 70 to 100
 \end{itemize}
-\item[\texttt{-g \#\#\#\#}] Choice of transcript or genome skim sequences for further processing.
+\item[\texttt{-g}] Choice of transcript or genome skim sequences for further processing.
 \begin{itemize}
 \item Step 6.1 of Sondovač, \texttt{sondovac$\_$part$\_$a.sh}.
 \item Depending on the phylogenetic depth that should be obtained, the probe sequences need to be designed from either the transcript or genome skim sequences, or it might not matter (if the taxa, from which the transcriptome and genome skim data were generated, are closely related).
@@ -741,8 +741,8 @@ \subsection{Input and output files}
 \begin{enumerate}
  \item \texttt{*$\_$prelim$\_$probe$\_$seq.fasta} -- Preliminary probe sequences.
  \item \texttt{*$\_$similarity$\_$test.fasta} -- Contigs that comprise exons $\geq$ bait length and have a certain total locus length.
- \item \texttt{*$\_$target$\_$enrichment$\_$probe$\_$sequences.fasta} -- Probes in FASTA.
- \item \textbf{\texttt{*$\_$possible$\_$cp$\_$dna$\_$gene$\_$in$\_$probe$\_$set.pslx} -- In case of any BLAT hits, the user needs to manually remove these plastid probe sequences from {*$\_$target$\_$enrichment$\_$probe$\_$sequences.fasta}; the remaining ones are the final probe sequences in FASTA.}
+ \item \texttt{*$\_$target$\_$enrichment$\_$probe$\_$sequences.fasta} -- \textbf{Final probes in FASTA.}
+ \item \texttt{*$\_$possible$\_$cp$\_$dna$\_$gene$\_$in$\_$probe$\_$set.pslx} -- In case of any BLAT hits, the user needs to manually remove these plastid probe sequences from \texttt{*$\_$target$\_$enrichment$\_$probe$\_$se\-quences.fasta}; the remaining ones are the final probe sequences in FASTA.
 \end{enumerate}

 An asterisk (*) denotes the beginning of the output files' names specified by the user with parameter \texttt{-o}. If the user does not select a custom name, default value (\texttt{output}) will be used. By default, output files are created in the same directory from which Sondovač was launched. Output files can be saved in a custom directory by specifying an output directory together with parameter \texttt{-o}:
@@ -863,13 +863,14 @@ \section{Changelog}

 List of changes in released versions of Sondovač.

-\subsection{Version 0.99 release candidate released 2015-12-07}
+\subsection{Version 0.99 release candidate released 2015-12-08}

 \begin{itemize}
  \item Fixed error with some input files for part B.
  \item Finished coloration of command-line user interface.
- \item Added possibility minimal exon length of the loci.
+ \item Added possibility to set minimal exon length of the loci.
  \item Various fixes and UI enhancements.
+ \item Improved documentation.
 \end{itemize}

 \subsection{Version 0.95 beta released 2015-11-27}
diff --git a/sondovac_functions b/sondovac_functions
index e8c3849..115e4bb 100644
--- a/sondovac_functions
+++ b/sondovac_functions
@@ -2,7 +2,7 @@

 # Version of the script
 SCRIPTVERSION=0.99
-RELEASEDATE=2015-12-07
+RELEASEDATE=2015-12-08

 # Web page of the script
 WEB="https://github.com/V-Z/sondovac/"
@@ -998,7 +998,7 @@ function checkblat {
  echo "Type \"${REDF}H${NORM}\" ${CYAF}for installation using Homebrew${NORM} (only for Mac OS X, recommended)."
  echo " See \"${REDF}brew info homebrew/science/blat${NORM}\" for more details."
  echo "Type \"${REDF}M${NORM}\" ${CYAF}for manual installation - script will exit${NORM} and you will have to"
- echo " install \"BLAT\" yourselves. Check ${REDF}http://genome.ucsc.edu/FAQ/FAQblat.html${NORM}"
+ echo " install \"BLAT\" yourself. Check ${REDF}http://genome.ucsc.edu/FAQ/FAQblat.html${NORM}"
  echo " for more information."
  read BLAT
  while :
@@ -1021,7 +1021,7 @@ function checkblat {
  echo
  echo "${REDF}${BOLD}Error!${NORM} ${CYAF}Download of \"${REDF}BLAT${NORM}\" failed.${NORM} Please, go to"
  echo "${REDF}http://hgdownload.cse.ucsc.edu/admin/exe/macOSX.x86_64/blat/${NORM}"
- echo "and download blat binary yourselves."
+ echo "and download blat binary yourself."
  echo
  exit 1
  }
@@ -1036,7 +1036,7 @@ function checkblat {
  echo
  echo "${REDF}${BOLD}Error!${NORM} ${CYAF}Download of \"${REDF}BLAT${NORM}\" failed.${NORM} Please, go to"
  echo "${REDF}http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/blat/${NORM}"
- echo "and download blat binary yourselves."
+ echo "and download blat binary yourself."
  echo
  exit 1
  }
diff --git a/sondovac_part_a.sh b/sondovac_part_a.sh
index db90e33..6bfb4f5 100755
--- a/sondovac_part_a.sh
+++ b/sondovac_part_a.sh
@@ -63,13 +63,13 @@ while getopts "hvulrpeo:inf:c:m:t:q:a:y:s:g" START; do
  echo -e "\t${REDF}-f${NORM}\t${CYAF}Transcriptome input file${NORM} in FASTA format."
  echo -e "\t${REDF}-c${NORM}\t${CYAF}Plastome reference sequence${NORM} input file in FASTA format."
  echo -e "\t${REDF}-m${NORM}\t${CYAF}Mitochondriome reference sequence${NORM} input file in FASTA format."
- echo -e "\t\t This file is optional. In interactive mode you will each time be"
- echo -e "\t\t asked if you wish to use it."
+ echo -e "\t\t This file is optional. In interactive mode you will each time"
+ echo -e "\t\t be asked if you wish to use it."
  echo -e "\t${REDF}-t${NORM}\t${CYAF}Paired-end genome skim input file${NORM} in FASTQ format (first file)."
  echo -e "\t${REDF}-q${NORM}\t${CYAF}Paired-end genome skim input file${NORM} in FASTQ format (second file)."
  echo
  echo -e "\tOther optional arguments (if not provided, default values are used):"
- echo -e "\t${REDF}-a${NORM}\t${CYAF}Maximum overlap length expected in approximately 90% of read pairs"
+ echo -e "\t${REDF}-a${NORM}\t${CYAF}Maximum overlap length expected in approximately 90% of read"
  echo -e "\t\t pairs${NORM} (parameter \"-M\" of FLASH, see its manual for details)."
  echo -e "\t\tDefault value: 65 (integer ranging from 10 to 300)"
  echo -e "\t${REDF}-y${NORM}\t${CYAF}Sequence similarity between unique transcripts and the filtered,"
@@ -1126,7 +1126,7 @@ FLASHOUT="${OUTPUTFILENAME%.*}_combined_reads_no_cp_no_mt_reads"
 BLATOUTFIN="${OUTPUTFILENAME%.*}_blat_unique_transcripts_versus_genome_skim_data.pslx"
 # Matching sequences in FASTA
 BLATOUTFIN2="${OUTPUTFILENAME%.*}_blat_unique_transcripts_versus_genome_skim_data.fasta"
-# FASTA converted into TAB - temporary file - will be deleted
+# FASTA converted into TSV - temporary file - will be deleted
 TAB="${OUTPUTFILENAME%.*}_final.tab"
 # Number of times each transcript hit a genome skim read - will be deleted
 TABLIST="${OUTPUTFILENAME%.*}_transcript_hits.txt"
diff --git a/sondovac_part_b.sh b/sondovac_part_b.sh
index 569e84f..11c363b 100755
--- a/sondovac_part_b.sh
+++ b/sondovac_part_b.sh
@@ -13,7 +13,7 @@ source $SCRIPTDIR/sondovac_functions || {
  echo "Fatal error!"
  echo "Unable to load file \"sondovac_functions\" with required functions!"
  echo "It must be in same directory as \"$0\""
- echo "Check it and if needed download again whole script from"
+ echo "Check it and, if needed, download again whole script from"
  echo "https://github.com/V-Z/sondovac/"
  echo
  exit 1
@@ -66,15 +66,19 @@ while getopts "hvulrpeo:inc:x:z:b:d:y:k:" START; do
  echo -e "\t\t values 80, 100 or 120)."
  echo -e "\t${REDF}-d${NORM}\t${CYAF}Sequence similarity between the developed probe sequences${NORM}"
  echo -e "\t\t (parameter \"-c\" of cd-hit-est, see its manual for details)."
- echo -e "\t\tDefault value: 0.9 (use decimal number ranging from 0.85 to 0.95)."
+ echo -e "\t\tDefault value: 0.9 (use decimal number ranging from 0.85 to"
+ echo -e "\t\t 0.95)."
  echo -e "\t${REDF}-y${NORM}\t${CYAF}Sequence similarity between the probes and plastome reference${NORM}"
  echo -e "\t\t searching for possible plastid genes in probe set (parameter"
  echo -e "\t\t \"-minIdentity\" of BLAT, see its manual for details)."
  echo -e "\t\tDefault value: 90 (integer ranging from 85 to 95)."
- echo -e "\t${REDF}-k${NORM}\t${CYAF}Minimum total locus length.${NORM}."
+ echo -e "\t${REDF}-k${NORM}\t${CYAF}Minimum total locus length.${NORM}"
  echo -e "\t\tDefault value: 600. Allowed values are 600, 720, 840,"
- echo -e "\t\t 960, 1080 and 1200. When running in interactive mode,"
- echo -e "\t\t the user will be asked which value to use. A table summarizing the total number of LCN loci, which will be the result of the probe design for all minimum total locus lenghts that the user can select, will be displayed to facilitate this choice."
+ echo -e "\t\t 960, 1080 and 1200. When running in interactive mode, the user"
+ echo -e "\t\t will be asked which value to use. A table summarizing the total"
+ echo -e "\t\t number of LCN loci, which will be the result of the probe design"
+ echo -e "\t\t for all minimum total locus lenghts that the user can select,"
+ echo -e "\t\t will be displayed to facilitate this choice."
  echo -e "\t${BOLD}WARNING!${NORM} If parameters ${BOLD}-b${NORM}, ${BOLD}-d${NORM} or ${BOLD}-y${NORM} are not provided, default values"
  echo -e "\t are taken, and it is not possible to change them later (not even in"
  echo -e "\t interactive mode)."