Nicole's moPepGen adventure #746
-
Installation
I started by opening up my favorite miniconda env and running the install command. This did not work and failed with the following error:
I then tried to just clone the repo, which worked with no issues:
So I have all the code now, but I don't know how to actually run it from the command line. It's not officially installed and I don't know what binary I'm supposed to be executing.
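A general way to answer the "which binary" question for a cloned Python package, assuming the clone is a standard pip-installable project, is to install it into the active environment (for example with `pip install -e .` from the repository root) and then ask the package metadata which console scripts it registered. The sketch below does the second half. The distribution name `moPepGen` is an assumption here; the real name is whatever the clone's packaging files declare.

```python
from importlib import metadata


def console_scripts_for(dist_name):
    """Return sorted (command, target) pairs for the console scripts of an installed distribution."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        # Not installed in this environment, so no command has been put on PATH yet.
        return []
    return sorted(
        (ep.name, ep.value)
        for ep in dist.entry_points
        if ep.group == "console_scripts"
    )


if __name__ == "__main__":
    # "moPepGen" is an assumed distribution name; check the clone's
    # packaging files (setup.py, setup.cfg or pyproject.toml) for the real one.
    for command, target in console_scripts_for("moPepGen"):
        print(f"{command} -> {target}")
```

An empty result means either the package is not installed in this environment or it registers no console scripts, in which case the entry point has to be found by reading the repo itself.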
-
Reference Index
The instructions are very clear that the first step is to generate an index of reference files for moPepGen. It's not clear whether this reference set needs to be built from the same reference files that were used to call your genomic inputs. If it does, it might be important to note that DNA alignment and RNA processing were not necessarily performed on the same reference set. In my case, DNA alignment was on GRCh38-BI-20160721 and RNA was on GRCh38-EBI-GENCODE36. I don't actually know the difference between the BWA-indexed reference and the standard GENCODE36. For now, pushing on with the GENCODE reference set:
Initially tried this in an F2 node, got a memory error, fair enough. Might be nice to give an idea of the mem requirements.
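Until memory requirements are documented, one way to get a number for a given reference set is to wrap the command and read the peak resident memory of the child process afterwards. This is a minimal sketch using only the Python standard library on Linux or macOS (the `resource` module is not available on Windows). The command in the example is a stand-in, not the actual indexing command.

```python
import resource
import subprocess
import sys


def run_and_report_peak_memory(command):
    """Run a command, wait for it, and return (exit_code, peak_rss_in_mib)."""
    completed = subprocess.run(command, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    # It is the high-water mark over all children this process has waited for,
    # so measure one command per Python process for a clean number.
    divisor = 1024 ** 2 if sys.platform == "darwin" else 1024
    return completed.returncode, usage.ru_maxrss / divisor


if __name__ == "__main__":
    # Stand-in command: replace the list with the real command line to measure.
    exit_code, peak_mib = run_and_report_peak_memory([sys.executable, "-c", "pass"])
    print(f"exit code {exit_code}, peak memory {peak_mib:.1f} MiB")
```

If the job is killed for running out of memory, the reported peak is only a lower bound on what it actually needs.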
Tried again in an F16, got a little farther, then it crashed without much explanation. Note the warning from Biopython:
Next I ramped it up to an F72:
Success! Log:
Err:
Outputs:
tl;dr
-
Question re input file format
For all moPepGen functions, can
-
Parsing Part 1.1 - Fusion from STAR-Fusion
I'm starting the GVF file creation. I wanted to start with SNVs, but it turns out our annotation pipeline doesn't actually have VEP, despite the documentation saying otherwise LOL. So starting with something easier for now: STAR-Fusion. No problems. Code for STAR-Fusion parsing for all available samples:
Sample from log:
Sample from err:
^ Biopython deprecation warning again.
-
Parsing Part 1.2 - Fusion from FusionCatcher
This ran very smoothly, no issues. Parsing code:
Log:
Error (being addressed in #752):
-
Parsing Part 3 - Alternative Splicing
Testing functionality of
Code:
Logs:
Error, same as before:
-
Parsing Part 4: SNVs + VEP
After many tribulations with VEP, I think I have finally achieved a VEP parse.
VEP Annotation
Lessons learned
Code:
VEP Parsing:
No issues other than ye olde Biopython warning.
Code:
Logs:
-
Question about GTF reference files:
So I finished the VEP annotation and am moving on to RNA editing site annotation, and both of these require a reference annotation GTF. I just realized that the GTF I used for VEP annotation had to be sorted and bgzipped. However, the GTF used in moPepGen indexing was neither sorted nor bgzipped:
Is that going to be an issue? Does the indexed GTF need to be completely identical, even down to the sorting of the contents?
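Whether record order matters to moPepGen is a question for the maintainers, but it is at least easy to confirm that the two GTFs hold the same records and differ only in order and compression. A minimal sketch, assuming both files fit in memory as line counts (a full GENCODE GTF may need a few GB this way); the function names are illustrative. It relies on bgzip output being readable as ordinary gzip.

```python
import gzip
from collections import Counter


def read_gtf_records(path):
    """Count each non-comment, non-blank line of a GTF (plain, gzip or bgzip)."""
    with open(path, "rb") as handle:
        # gzip and bgzip files both start with the magic bytes 1f 8b.
        is_gzip = handle.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as handle:
        return Counter(
            line.rstrip("\n")
            for line in handle
            if line.strip() and not line.startswith("#")
        )


def same_gtf_content(path_a, path_b):
    """True if both files contain the same records, ignoring order and header comments."""
    return read_gtf_records(path_a) == read_gtf_records(path_b)
```

If this returns `True`, the files differ only in sorting and compression, which narrows the question down to whether the tool itself is order-sensitive.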
-
In the Reference Data section, the last line of the paragraph has a hyperlink: "See here for more details." The link on "here" is broken and leads to a 404 Not Found page.
-
I'll keep track of my progress with test-running moPepGen here.
Starting inputs: Roni's radioresistance cell line dataset /hot/project/disease/ProstateTumor/PRAD-000096-RadioResDU145Molecular/