Hi DeepRibo developers,

thanks for making this package; it is very convenient to work with.
I'm currently testing the predictions from DeepRibo for a bacterium for which we have new Ribo-Seq data. I am mostly using standard settings with the pretrained model, and I make use of the option to pass annotation data for S-curve estimation. From what I see, the prediction works quite well, but I have some questions about the score (and its distribution) that DeepRibo produces.
I have already removed the inferior ORFs per stop codon (keeping only the best-scoring start per stop). The remaining list of probable ORFs predicted by DeepRibo still far exceeds the number of annotated genes (many false positives). Using a set of known, real ORFs for benchmarking, I found that almost all correctly predicted ORFs have a positive pred score (50 out of 70), while thousands of "false positives" have a negative score. Yet the pred score is only used to rank ORFs, not to label high-confidence ones.

My question is therefore: is this behavior expected, and can the pred score be used as a threshold to identify high-confidence ORFs?
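For reference, the filtering I describe above can be sketched roughly like this. The record fields (start_site, stop_site, pred) and the zero cutoff are my own assumptions for illustration, not DeepRibo's documented output schema:

```python
# Sketch of the post-filtering: keep the best-scoring ORF per stop codon,
# then apply a pred-score cutoff. Field names are assumptions, not the
# actual DeepRibo output columns.

def filter_orfs(orfs, score_threshold=0.0):
    """Keep the highest-pred ORF per stop site, then drop low-scoring ones."""
    best = {}
    for orf in orfs:
        stop = orf["stop_site"]
        if stop not in best or orf["pred"] > best[stop]["pred"]:
            best[stop] = orf
    return [o for o in best.values() if o["pred"] > score_threshold]

# Toy candidates: two in-frame starts sharing a stop, plus one spurious ORF.
candidates = [
    {"start_site": 10,  "stop_site": 100, "pred": 1.3},   # best start for stop 100
    {"start_site": 40,  "stop_site": 100, "pred": -0.7},  # inferior in-frame start
    {"start_site": 200, "stop_site": 350, "pred": -2.1},  # likely false positive
]

high_confidence = filter_orfs(candidates)
# Only the positively scored ORF at stop 100 survives both filters.
```

This is just how I would formalize the two-step filtering; whether pred > 0 is a meaningful cutoff is exactly my question.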
Any info or update on this? I'm still interested. We are currently comparing different small-ORF scoring methods, and to include your package I'd need to know how to interpret its scores.