Add more chain peptides from Uniprot xml protein processing info #441

Open
blfrey opened this issue Nov 30, 2018 · 0 comments

Discovered this issue in MetaMorpheus top-down for bovine cationic trypsin (P00760), which has a signal peptide (1-17), a propeptide (18-23), and a chain peptide (24-246). Actual trypsinogen from Sigma is the span 18-246 (i.e. 2 of the 3 segments joined), but mzlib does not parse this as an option for MetaMorpheus top-down to use. It seems that mzlib allows the entire length (1-246) and each of the three segments individually, but not a combination of 2 adjacent segments.

Note that this could also be relevant to bottom-up. Imagine an enzymatic digest of trypsinogen (18-246), which could produce an N-terminal peptide spanning 18-32. That peptide probably wouldn't be assigned currently, because mzlib doesn't allow this protein sequence to start at position 18.

For P00760, mzlib should parse the entry to allow:
1-246, full-length
1-17, signal peptide
18-23, propeptide
24-246, chain peptide
1-23, signal peptide+propeptide
18-246, propeptide+chain peptide

Some proteins with even more segments (e.g. glucagon) will have many more combinations.
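
To make the request concrete, here is a rough Python sketch of how the allowed spans could be enumerated: every run of adjacent segments is merged into one candidate span. This is not mzlib code (mzlib is a C# library); the function name `contiguous_combinations` and the `(start, end)` tuple representation are made up for illustration, and it assumes the segments do not overlap and that only directly adjacent segments (next start = previous end + 1) should be joined.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end), 1-based and inclusive.
Segment = Tuple[int, int]


def contiguous_combinations(segments: List[Segment]) -> List[Segment]:
    """Return every span formed by one or more directly adjacent segments."""
    ordered = sorted(segments)
    results: List[Segment] = []
    for i, (start, end) in enumerate(ordered):
        # The segment on its own is always a candidate.
        results.append((start, end))
        # Keep extending while the next segment begins right after this one.
        for next_start, next_end in ordered[i + 1:]:
            if next_start != end + 1:
                break
            end = next_end
            results.append((start, end))
    return results


# Segments for P00760 as listed in this issue.
p00760 = [(1, 17), (18, 23), (24, 246)]
for start, end in contiguous_combinations(p00760):
    print(f"{start}-{end}")
```

For the three P00760 segments this yields the six spans listed above, including 1-246 as the merge of all three. Under the same assumptions, n adjacent segments give n(n+1)/2 spans, so the count grows quickly for proteins with more processing products.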
