How to encode non-canonical amino acids into search? #90
Comments
Hey, follow the rules for site-specific modifications here: https://github.com/Rappsilber-Laboratory/xisearch#modification-settings In short, write the modification name directly after the modified residue in your FASTA, in parentheses for a variable modification. Remove the parentheses in the FASTA to make it a fixed modification instead. Modification names are arbitrary but have to be lowercase.
Then, in the .config, define the modification with its delta mass relative to the unmodified amino acid.
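A minimal Python sketch of the two steps above, assuming the FASTA convention described in this comment (modification name written directly after the residue, in parentheses for a variable modification). The helper names `encode_site` and `known_modification_line` are hypothetical and not part of XiSearch. The config line follows the `modification:known` pattern quoted later in this thread, and the 50 Da delta mass is only the illustrative value from the original question.

```python
def encode_site(sequence, position, mod_name, variable=True):
    """Insert a modification marker after the residue at `position` (1-based).

    Hypothetical helper: it only builds the FASTA string, it does not call XiSearch.
    """
    if not mod_name or mod_name != mod_name.lower():
        raise ValueError("modification names have to be lowercase")
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    # Parentheses mark a variable modification; a bare name marks a fixed one.
    marker = f"({mod_name})" if variable else mod_name
    return sequence[:position] + marker + sequence[position:]


def known_modification_line(mod_name, residue, delta_mass):
    """Build a config line in the modification:known form quoted in this thread."""
    return (
        f"modification:known::SYMBOLEXT:{mod_name};"
        f"MODIFIED:{residue};DELTAMASS:{delta_mass}"
    )


if __name__ == "__main__":
    # ASDFK with the D at position 3 carrying a modification called "mod".
    print(encode_site("ASDFK", 3, "mod"))                   # variable form
    print(encode_site("ASDFK", 3, "mod", variable=False))   # fixed form
    print(known_modification_line("mod", "D", 50.0))
```

Whether a given name or delta mass is accepted by the search is best checked against the modification settings section of the README linked above.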
|
Not sure how I missed that, thanks for your help! |
if you want to have it site specific you can also encode it in the fasta-file |
Can one then make use of this specific modified amino acid in other setting lines? For example, would the following lines work?
Additionally, can one create a fixed modification on this modified amino acid? Would this line work?
|
Sorry, I don't understand. Is the non-canonical amino acid also a crosslinker, or just a different amino acid? |
By default, a crosslinker that crosslinks to D will also crosslink to Dmod, as far as I understand. |
The question is, in general, if I define a modified amino acid in the fasta sequence, do I have to add this specific modified amino acid to the settings of an enzyme, crosslinker and fixed/variable modifications? Can I define one specific protein that is fully 15N-labelled in the fasta file while keeping others as normal protein sequences? for instance
|
What I am trying to say is that there is no general answer to this question; it depends on what you want to do. Labelling is typically not defined as a modification but with the label keyword: https://github.com/Rappsilber-Laboratory/XiSearch?tab=readme-ov-file#isotope-labelling The label keyword searches every amino acid as a heavy or light version of itself (or with whatever custom delta mass you give in the list). So my suggestion would be
If instead you really want to define only a single protein as 100% labelled, I think you are going about it the right way. The crosslinker will react with the modified amino acid, but I don't know what happens if you use a protease that cuts at that amino acid; @lutzfischer may clarify this, also for losses. For modifications defined in the FASTA, you should use known modifications, not fixed ones (again, see near the end of https://github.com/Rappsilber-Laboratory/XiSearch?tab=readme-ov-file#modification-settings )
for a FASTA like ACKASphAK. No brackets in the sequence for a fixed modification. As an aside, I suggest using the DELTAMASS and SYMBOLEXT nomenclature so that you can use Unimod modification masses rather than total masses: https://github.com/Rappsilber-Laboratory/XiSearch?tab=readme-ov-file#modification-settings |
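To illustrate the DELTAMASS and SYMBOLEXT suggestion, here is a rough Python sketch that rewrites a SYMBOL/MASS line (such as the Ccm and Mox lines quoted further down in this thread) into SYMBOLEXT/DELTAMASS form by subtracting the monoisotopic residue mass. The function name `to_deltamass_line` and the small `RESIDUE_MASS` table are assumptions made for this example; only cysteine and methionine are included, and the output should be compared against Unimod before use.

```python
# Monoisotopic residue masses in Da (only the two residues used in this thread).
RESIDUE_MASS = {"C": 103.009185, "M": 131.040485}


def to_deltamass_line(line):
    """Rewrite 'SYMBOL:..;MODIFIED:..;MASS:..' as 'SYMBOLEXT:..;MODIFIED:..;DELTAMASS:..'."""
    prefix, _, body = line.partition("::")
    fields = dict(item.split(":", 1) for item in body.split(";"))
    residue = fields["MODIFIED"]
    symbol = fields["SYMBOL"]
    if not symbol.startswith(residue):
        raise ValueError("SYMBOL is expected to start with the modified residue")
    # The extension is whatever follows the residue letter, e.g. "cm" in "Ccm".
    ext = symbol[len(residue):]
    # Delta mass = total mass of the modified residue minus the plain residue mass.
    delta = float(fields["MASS"]) - RESIDUE_MASS[residue]
    return f"{prefix}::SYMBOLEXT:{ext};MODIFIED:{residue};DELTAMASS:{delta:.6f}"


if __name__ == "__main__":
    print(to_deltamass_line("modification:fixed::SYMBOL:Ccm;MODIFIED:C;MASS:160.03065"))
    print(to_deltamass_line("modification:variable::SYMBOL:Mox;MODIFIED:M;MASS:147.035395"))
```

The resulting delta masses should be close to the Unimod values for carbamidomethylation and oxidation, which is a quick sanity check on the total masses in the original lines.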
I see now: a fixed modification on a site-specific modified amino acid. Again, I don't know, sorry. I will test because I am also curious. With label it works. |
Is it permitted to define multiple modified amino acids on one line? Is it necessary to list each modified amino acid on a separate line?
modification:known::SYMBOLEXT:ph;MODIFIED:S,T;DELTAMASS:79.966331
or
modification:known::SYMBOLEXT:ph;MODIFIED:S;DELTAMASS:79.966331 |
Both should work but you should not use "X" for any amino acid or "nterm" for protein N terminus, those go on separate lines. |
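As a small illustration of what a multi-residue line stands for, the Python sketch below splits a SYMBOLEXT/DELTAMASS line and lists the modified symbols it would give (for example Sph and Tph). The function `expand_modification` is hypothetical, and the check that rejects "X" or "nterm" combined with other residues is only one reading of the reply above, not confirmed XiSearch behaviour.

```python
def expand_modification(line):
    """Return (kind, modified symbols, delta mass) for a SYMBOLEXT/DELTAMASS line."""
    kind_part, _, body = line.partition("::")
    kind = kind_part.split(":")[1]  # "known", "fixed" or "variable"
    fields = dict(item.split(":", 1) for item in body.split(";"))
    residues = fields["MODIFIED"].split(",")
    # Assumption: "X" (any amino acid) and "nterm" must sit alone on their line.
    if len(residues) > 1 and {"X", "nterm"} & set(residues):
        raise ValueError('"X" and "nterm" go on separate lines')
    symbols = [residue + fields["SYMBOLEXT"] for residue in residues]
    return kind, symbols, float(fields["DELTAMASS"])


if __name__ == "__main__":
    line = "modification:known::SYMBOLEXT:ph;MODIFIED:S,T;DELTAMASS:79.966331"
    print(expand_modification(line))
```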
One note ahead: you can use any modification in other lines, but you have to define the modifications first. Xi parses the config file strictly linearly, i.e. anything self-defined that you use somewhere has to be defined above that point.
That is only true for label, as labels are assumed not to change the relevant chemical properties. Modifications, however, need to be mentioned in the enzyme and crosslinker definitions.
Yes that would be the case.
Not as a labelling schema. The closest you can do is either to define variable modifications in the FASTA for each residue, which will probably result in an exploding search space, or to define the protein twice in the FASTA file, with and without the (fixed) modification. In both cases you should define the modified residue as a known modification and add the right ones to the specificities of crosslinker and enzyme. |
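The "define the protein twice" option could be prepared with a short script like the Python sketch below. It assumes the fixed-modification FASTA convention mentioned earlier in the thread (modification name written directly after the residue, no brackets). The function name and the header suffix are made up for this example, and the matching modification:known line plus the crosslinker and enzyme specificities still have to be added to the config by hand.

```python
def duplicate_with_modification(header, sequence, residue, mod_name):
    """Return FASTA text holding the protein twice: unmodified and fully modified.

    Hypothetical helper; the "_<mod_name>" header suffix is an arbitrary choice.
    """
    if not mod_name or mod_name != mod_name.lower():
        raise ValueError("modification names have to be lowercase")
    # Sequences are uppercase and modification names lowercase, so appending the
    # name to every occurrence of the residue cannot create new matches.
    modified = sequence.replace(residue, residue + mod_name)
    return (
        f">{header}\n{sequence}\n"
        f">{header}_{mod_name}\n{modified}\n"
    )


if __name__ == "__main__":
    print(duplicate_with_modification("protA", "ACKCA", "C", "a"), end="")
```

For a fully 15N-labelled protein, every residue type would need its own modification with its own delta mass, so this one-residue version is only a starting point.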
Hi Lutz, would settings like this work? Can the fixed and variable modifications recognise the declared known modification?
modification:known::SYMBOLEXT:a;MODIFIED:C;DELTAMASS:1
modification:fixed::SYMBOL:Ccm;MODIFIED:C;MASS:160.03065
modification:variable::SYMBOL:Mox;MODIFIED:M;MASS:147.035395 |
I tried defining the protein twice in the FASTA file, declaring the modification:known and adding the right ones to the specificities of crosslinker and protease. Unfortunately, XiSearch didn't identify the modified protein at all. |
Applying the heavy label scheme works, although the scheme applies to every protein in the fasta file. Does XiSearch recognise that the following masses should also become higher in the heavy labelled proteins? modification:variable::SYMBOL:Mox;MODIFIED:M;MASS:147.035395 |
yes but in that case you could define it a bit more compact as:
The resulting fixed modifications would be Ccm and Cacm as well (SYMBOLEXT is cumulative), and the variable ones Mox and Mbox.
Not sure why this should fail. Can you send me the config/Fasta (lutz dot fischer tu-berlin dot de)? Then I can have a look if I understand what went wrong here.
It should create a labelled version of these as well. I.e. if the label schema is n15 you should see Ccmn15 as a modification. |
I found "Mox5" and "Ccm5" on the identified peptide sequences, but they were detected only on the unlabelled peptides, although they were expected to sit on the 15N-labelled peptides. It would be really great if this could be further developed. 15N could be very useful in some applications. |
Hello,
A collaborator asked me how to encode a non-canonical amino acid into the search. This same amino acid would also be the one that crosslinks. Let's call the amino acid 'X', with a mass difference of 50 Da to the standard amino acid (e.g., D). Is it possible to feed XiSearch a FASTA with the mass difference between the standard amino acid and the newly incorporated one? This would just be for one protein, not for the whole proteome. For example:
Sequence: ASDFK, Modified sequence: ASXFK
Can I upload a fasta with: ASD(+50)FK? Is there a format I need to follow? Francis seemed to recall being able to hard-code acetylation sites on XiSearch, but he can't remember how he did this.
Otherwise, would I just search all the 'D' residues to have a modified mass of 50?
Thanks,
Anthony