(protein) termini patching #374

Merged
merged 19 commits into from
Jul 26, 2024


csbrasnett
Collaborator

With the -ter flag, add modifications read from the force field to the N- and C-termini.

csbrasnett and others added 4 commits June 14, 2024 11:41
- with the -ter flag, add modifications read from the force field to the n and c termini
- updated m3 command to include -ter
- new m3/prot.itp with correct parameters
Member

@fgrunewald fgrunewald left a comment

We're on the right track here. What do you think about making this a general modifications processor? I would parse a string of potential modifications with the same syntax as used for splitting ligands etc., like so:

-mods <resname>#<resid>:modification_name

Of course we are lazy people, so we keep the terminal patch for proteins. But I suggest that the ter flag takes two potential values, such that you can toggle between charged and neutral.

In your apply_modifications processor, you want to loop over the list of modifications taken from the input and call parse_residue_spec(resspec) from annotate_ligands. This returns a dict of resids and resnames. You can then loop over the meta-molecule to do the modifications. If both resid and resname are given, you may want to check that they match.
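As a rough illustration of the spec syntax suggested above, here is a minimal sketch in plain Python. The helpers `parse_mod_spec` and `node_matches` are hypothetical names introduced for this example; they are not polyply's `parse_residue_spec`, whose actual return format is not reproduced here. The sketch assumes that either the resname or the resid may be omitted and that the modification name follows the last colon.

```python
def parse_mod_spec(spec):
    """Split '<resname>#<resid>:<modification_name>' into its parts.

    Hypothetical helper for illustration only. Either resname or resid
    may be left out; the modification name is required.
    """
    target, sep, mod_name = spec.rpartition(":")
    if not sep or not mod_name:
        raise ValueError(f"no modification name in {spec!r}")
    resname, _, resid = target.partition("#")
    parsed = {"modification": mod_name}
    if resname:
        parsed["resname"] = resname
    if resid:
        parsed["resid"] = int(resid)
    return parsed


def node_matches(node, parsed):
    """Return True if every given key (resname, resid) agrees with the node.

    If both resname and resid are present in the spec, both must match.
    """
    return all(
        node.get(key) == parsed[key]
        for key in ("resname", "resid")
        if key in parsed
    )
```

With this sketch, a spec that gives both a resname and a resid only selects a residue where both agree, which is the consistency check mentioned above.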

For the protein terminal case you basically keep doing what you're doing, but simply take the highest/lowest resid.
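A minimal sketch of the terminal case, assuming residues are available as plain dictionaries with a `resid` key. The function name `default_terminal_mods` and the modification names `N-ter` and `C-ter` are placeholders chosen for this example, not names taken from the force field files.

```python
def default_terminal_mods(residues, n_mod="N-ter", c_mod="C-ter"):
    """Pair the lowest resid with the N-terminal modification and the
    highest resid with the C-terminal one.

    `residues` is an iterable of dicts with a 'resid' key. The default
    modification names are placeholders for illustration.
    """
    resids = [res["resid"] for res in residues]
    if not resids:
        raise ValueError("no residues to patch")
    return [(min(resids), n_mod), (max(resids), c_mod)]
```

Switching between charged and neutral termini would then amount to passing different modification names for `n_mod` and `c_mod`.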

Member

@fgrunewald fgrunewald left a comment

@csbrasnett looks pretty good; I have some detailed comments that I think should be easy to fix.

Member

@fgrunewald fgrunewald left a comment

I've left some comments about a potential strategy to set the correct default values for the terminal modifications.

There are two edge-cases where this strategy will fail:

  1. If a non-protein residue has an atom named BB. I think we should simply add protein resnames to the modification dict and check those. You can define a variable in the ff file, like protein_resnames = 'ALA|LYS|...', and then when applying we simply check if the resnames match. Since Vermouth generates a vermouth.molecule.Choice object, you need to do the check using Choice.match. Calling @pckroon for some advice on comparing resnames using Choice objects.

  2. When a protein itp is provided to begin with, as in the case of PEGylated proteins. However, these nodes have a special attribute from_itp at the meta_molecule level. Thus we can simply check if this attribute exists, and if it does, we do not patch.

I would add both checks to apply modification and not to patch termini. If you need help or have questions, let me know. Thanks for taking care of this.
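The two checks described above can be sketched as a single guard function. This is an illustration under stated assumptions: nodes are modelled as plain dictionaries, `should_patch` is a hypothetical name, and `PROTEIN_RESNAMES` is an illustrative subset rather than the list that would live in the force field file.

```python
# Illustrative subset only; the real list would come from the force field.
PROTEIN_RESNAMES = frozenset({"ALA", "GLY", "LYS"})


def should_patch(node, protein_resnames=PROTEIN_RESNAMES):
    """Decide whether a terminal modification may be applied to a node.

    Skips residues that were read from a user-supplied itp (marked here
    by the presence of a 'from_itp' key) and residues whose resname is
    not a known protein residue, even if they contain an atom named BB.
    """
    if "from_itp" in node:
        return False
    return node.get("resname") in protein_resnames
```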

Member

pckroon commented Jul 5, 2024

> Since Vermouth generates a vermouth.molecule.Choice object you need to do the check using Choice.match. Calling @pckroon for some advice on comparing resnames using Choice objects.

Use vermouth.molecule.attributes_match to compare node dictionaries. E.g. attributes_match(my_node, {'resname': Choice(['ALA', 'GLY']), 'resid': 23}). Note that all attributes specified in the second dictionary must be in the first dict (or have value None) and match. In the example that means that the dictionary my_node must have the attributes 'resname' and 'resid', and the resname must be either 'ALA' or 'GLY', and the resid must be 23.
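To make the matching semantics concrete without requiring vermouth to be installed, here is a small stand-in written in plain Python. The `Choice` class and `attributes_match` function below only mimic the behaviour described in this comment; they are not vermouth's implementation, and in real code you would import both from `vermouth.molecule` instead.

```python
class Choice:
    """Stand-in for a predicate that accepts any one of several values."""

    def __init__(self, options):
        self.options = list(options)

    def match(self, node, key):
        # True if the node's value for `key` is one of the allowed options.
        return node.get(key) in self.options


def attributes_match(node, template):
    """Return True if every attribute in `template` matches `node`.

    Plain values are compared for equality; Choice values delegate to
    Choice.match. A missing key in `node` is treated as None.
    """
    for key, expected in template.items():
        if isinstance(expected, Choice):
            if not expected.match(node, key):
                return False
        elif node.get(key) != expected:
            return False
    return True


my_node = {"resname": "ALA", "resid": 23, "atomname": "BB"}
template = {"resname": Choice(["ALA", "GLY"]), "resid": 23}
is_match = attributes_match(my_node, template)
```

Extra attributes on the node, such as `atomname` here, do not affect the result, because only the keys listed in the template are checked.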

- removed -ter flag from polyply
- -ter flag is not necessary anymore because apply_modifications.py will automatically add protein termini to the resspec if no modifications are given
- the modifications in the resspec will then only be applied if the target residue is in fact a protein residue
@csbrasnett
Collaborator Author

I think this addresses everything now, let me know your thoughts. One point of concern - when I was looking at doing a test for the final missing part of the codecov report, I can't see a 'from_itp' key in the node dictionary for the test molecules, even though they have been? Looking at meta_molecule.py I'm not sure how/where this is added anyway?

Member

@fgrunewald fgrunewald left a comment

@csbrasnett only minor comments; and I want to wait for PR #381 to be merged because that brings a required bug fix to the template generation workflow.

@csbrasnett csbrasnett merged commit c820aaa into marrink-lab:master Jul 26, 2024
8 checks passed