
question about interpretation of pred score #9

Open
m-jahn opened this issue Nov 10, 2023 · 1 comment

Comments


m-jahn commented Nov 10, 2023

Hi DeepRibo developers,

thanks for making this package; it is very convenient to work with.
I'm currently testing DeepRibo's predictions on a bacterium for which we have new Ribo-seq data.
I am mostly using default settings with the pretrained model, and I make use of the option to pass annotation data for S-curve estimation. From what I can see, the prediction works quite well, but I have some questions about the score (and its distribution) that DeepRibo produces.

My filtered output looks roughly like this:

```
   seqnames start   end width strand intergenic   rpk rpk_elo  pred SS_pred_rank
   <chr>    <dbl> <dbl> <dbl> <chr>  <lgl>      <dbl>   <dbl> <dbl>        <dbl>
 1 NC_0012…  1742  2878  1136 +      FALSE       741.    6.72  3.71           73
 2 NC_0012…  1788  1844    56 +      FALSE      1529.   18.4  -6.84         2933
 3 NC_0012…  2025  2060    35 +      FALSE       337.    3.21 -6.78         2885
 4 NC_0012…  2178  2228    50 +      FALSE       570.    3.94 -9.48         4448
   ...
```

I have already removed the inferior ORFs per stop codon. The remaining list of probable ORFs predicted by DeepRibo still far exceeds the number of annotated genes, i.e. there are many false positives. Benchmarking against a set of known, real ORFs, I found that most correctly predicted ORFs have a positive pred score (50 out of 70), while thousands of the false positives have a negative score. Yet the pred score is only used to rank ORFs, not to label high-confidence ones.

My question is therefore: is this behavior expected, and can the pred score be used as a threshold to identify high-confidence ORFs?
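
For reference, here is a minimal sketch (pandas) of the post-processing I describe above. The column names match my table; the per-stop-codon deduplication is my own step, not something DeepRibo does, and the `pred > 0` cutoff is exactly the hypothetical threshold I am asking about:

```python
import pandas as pd

# DeepRibo predictions exported to a table (file name is illustrative)
preds = pd.read_csv("deepribo_predictions.csv")

# ORFs sharing a stop codon: the stop coordinate is 'end' on the + strand
# and 'start' on the - strand
preds["stop_coord"] = preds["end"].where(preds["strand"] == "+", preds["start"])

# Keep only the best-scoring ORF per stop codon (drop the "inferior" ORFs)
best = (preds.sort_values("pred", ascending=False)
             .groupby(["seqnames", "strand", "stop_coord"], as_index=False)
             .first())

# Hypothetical high-confidence cutoff; this is the open question:
# is pred > 0 a meaningful decision boundary, or is pred only a ranking score?
high_conf = best[best["pred"] > 0]
```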


m-jahn commented Feb 1, 2024

Any info or update on this? I'm still interested.
We are currently comparing different small-ORF scoring methods, and to include your package in the comparison I'd need to know how to interpret its scores.
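
To make the question concrete: if `pred` is the raw (pre-sigmoid) output of the network, which is an assumption on my part that I have not verified in the DeepRibo code, then it would map to a pseudo-probability like this, and `pred > 0` would simply correspond to p > 0.5:

```python
import numpy as np

def pred_to_probability(pred):
    # Sigmoid transform; only meaningful if 'pred' is a raw logit,
    # which is an assumption on my part (please correct me if not)
    return 1.0 / (1.0 + np.exp(-np.asarray(pred, dtype=float)))

# Scores from the table above: 3.71 -> ~0.98, -6.84 -> ~0.001
print(pred_to_probability([3.71, -6.84, -6.78, -9.48]))
```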

Thanks!
