3'UTR length #121

Open
Choopanian-Peyman opened this issue Dec 27, 2022 · 4 comments

Comments

@Choopanian-Peyman

Dear Tobias,

When I checked the count file, I saw that the length of a UTR reported by slamseq differs from its length reported by UCSC.
For example, compare A1BG in my count file and in UCSC:

My count file:
Chromosome Start End Name Length Strand ConversionRate
1 19 58346849 58347021 A1BG 172 - 0.038930397

UCSC (http://genome.cse.ucsc.edu/cgi-bin/hgGene?org=Human&hgg_chrom=none&hgg_type=knownGene&hgg_gene=uc002qsd.5)
Region Fold Energy Bases Energy/Base
3' UTR -638.20 1839 -0.347

I would appreciate any advice.
Best,
Peyman

@t-neumann
Owner

Hi - 3' end sequencing only amplifies the last ~250 bp of each transcript, so our annotation goes through a couple of processing steps to tailor it to 3' end sequencing. The "3' UTRs" in the count file are actually those counting windows, not the full-length UTRs.
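
To make this concrete, here is a minimal sketch using the A1BG row quoted earlier in this thread. It assumes that the Length column is simply End minus Start of the counting window; the thread does not state this explicitly, but the quoted numbers are consistent with it. The helper name `window_length` is hypothetical and not part of slamseq.

```python
# Values copied from the A1BG row of the count file quoted in this issue.
start, end, reported_length = 58346849, 58347021, 172

# Full 3' UTR length for A1BG as shown on the UCSC page linked in this issue.
ucsc_utr_length = 1839


def window_length(start: int, end: int) -> int:
    """Width of a counting window, ASSUMING Length = End - Start."""
    return end - start


computed = window_length(start, end)

# Does the Length column match End - Start for this row?
print("matches Length column:", computed == reported_length)

# Is the counting window shorter than the full UTR reported by UCSC?
print("shorter than UCSC UTR:", computed < ucsc_utr_length)
```

If the assumption holds, the Length column describes the counting window rather than the annotated UTR, which would explain the mismatch with UCSC.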

@Choopanian-Peyman
Author

Thanks for your reply,

But some UTRs in my count file are longer than 250 bp, and some even have the same length as the one reported in UCSC.

Example:
Chromosome Start End Name Length Strand ConversionRate
1 70008 71585 OR4F5 1577 + 0.0

Best

Choopanian-Peyman changed the title from "3'UTR names" to "3'UTR length" on Dec 28, 2022

@Choopanian-Peyman
Author

Hi Tobias,

Is there an annotation file that includes the lengths of the 3' UTRs?
Apparently I cannot use the Length column in the count file, since the same UTR name appears with several different lengths.

@t-neumann
Owner

Hi @Choopanian-Peyman - which annotation file are you using?
It could be that multiple counting windows (one per isoform) are reported, which is why you see several entries per gene.
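
One way to check for this is to group the count-file rows by gene name and list the window lengths per gene. The sketch below is illustrative only: the A1BG and OR4F5 rows come from this issue, while the second A1BG row is hypothetical and added just to show what a gene with more than one counting window would look like. The function name `windows_per_gene` is also hypothetical, and Length is again assumed to be End minus Start.

```python
from collections import defaultdict

# Columns: chromosome, start, end, name, strand.
# The first and last rows are taken from this issue; the middle row is
# HYPOTHETICAL and only illustrates a second window for the same gene.
rows = [
    ("19", 58346849, 58347021, "A1BG", "-"),
    ("19", 58350000, 58350200, "A1BG", "-"),  # hypothetical extra window
    ("1", 70008, 71585, "OR4F5", "+"),
]


def windows_per_gene(rows):
    """Map each gene name to the list of its counting-window lengths."""
    lengths = defaultdict(list)
    for _chrom, start, end, name, _strand in rows:
        lengths[name].append(end - start)  # assumes Length = End - Start
    return dict(lengths)


for gene, lengths in windows_per_gene(rows).items():
    flag = "multiple windows" if len(lengths) > 1 else "single window"
    print(gene, lengths, flag)
```

Genes flagged as having multiple windows are the ones where a single per-gene length cannot be read directly from the count file.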
