CoronaSpades - lacking documentation #925
Description of bug
I cannot find enough information on how coronaSPAdes uses the HMMs to assemble viruses (from the user's perspective, not the perspective of how it actually works).
Suppose I am looking not for just one family of viruses, but for many. If I provide HMMs for many different virus families simultaneously, will this make a mess? That is, will it attempt to find scaffolds that contain as many of those proteins as possible, even though they are from different families?
I see that by default coronaSPAdes keeps all of the coronavirus HMMs in one file. Is this a convenience, or does it imply that I should separate my virus protein HMMs by family?

spades.log: This is not relevant
params.txt: This is not relevant
SPAdes version: SPAdes 3.15.4
Operating System: Scientific Linux
Python Version: No response
Method of SPAdes installation: conda
No errors reported in spades.log
Hello,
How coronaSPAdes / rnaviralSPAdes uses HMMs is described in the paper: https://pubmed.ncbi.nlm.nih.gov/34406356/
From the user's perspective, we suggest splitting the HMM set by family (similar to what is done for the Coronaviridae HMMs). You can mix HMMs if you can be sure there are no cross-species hits.
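For anyone starting from a single combined HMM file, the split by family suggested above could be sketched roughly as follows. This is only an illustration, not part of SPAdes: the function name `split_hmms_by_family` and the `family_of` mapping (model name to family, which you would have to supply yourself) are hypothetical. It assumes HMMER3 text format, where each model has a `NAME` line and ends with a line containing only `//`.

```python
from collections import defaultdict


def split_hmms_by_family(hmm_text, family_of):
    """Group HMMER3 text records by family.

    hmm_text  -- contents of a multi-model HMMER3 file
    family_of -- hypothetical user-supplied dict: model NAME -> family
    Returns a dict: family -> concatenated records for that family.
    Models missing from family_of are collected under "unassigned".
    """
    grouped = defaultdict(list)
    record = []   # lines of the model currently being read
    name = None   # NAME of the model currently being read
    for line in hmm_text.splitlines(keepends=True):
        record.append(line)
        if line.startswith("NAME "):
            name = line.split(None, 1)[1].strip()
        elif line.strip() == "//":
            # "//" closes one model record in HMMER3 text format
            family = family_of.get(name, "unassigned")
            grouped[family].append("".join(record))
            record, name = [], None
    return {family: "".join(records) for family, records in grouped.items()}
```

Each value in the returned dict can then be written to its own file, one per family. Checking the "unassigned" group before assembling is a cheap way to catch models that were left out of the mapping.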
Great! Does that mean my specified directory can contain one HMM file per virus family, and each file will be used to guide the assembly separately, or do I have to run the code separately for each HMM file/family? Thanks,