-
Notifications
You must be signed in to change notification settings - Fork 33
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Atomtype parameterization methods for parameterizing a topology in GM…
…SO. (#644) * Atomtype parameterization methods for parameterizing a topology in GMSO. This PR will add the atomtyping module to GMSO for passing a GMSO.Forcefield and a GMSO.Topology and match atomtype using foyer as the backend. Then, the corresponding connection types will be found in the Forcefield and applied to the connections in the topology. * Create parameterize.py, which has the apply function which can take a topology, and a gmso forcefield to apply to it. This can use subgraph isomorphism to identify molecular structures in the topology through the bondgraph and bin those into unique molecules that match a specified forcefield. This apply function can also do the standard atomtyping of the entire topology in one step. * Create isomorph.py which uses networkx graphs to identify disconnected components and isomorphism to identify repeated structures. * Move module imports for apply into atomtyping to prevent circular imports * Add a quick fix which will do atomtyping if no residue flag is passed * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * correctly update sites when adding subtop * Changes to doc strings for clarity. Add a subtop_label variable to generalize where the molecule definition is pulled from * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * More modular architecture * WIP- Add testing, minor refactoring of the APIs. * WIP- Better error handling * Misc Changes 1. Update Topology after parametrization 2. Add dependency for forcefield_utilities in env files 3. Add tests for trappe forcefield 4. Patch parmed (should be moved to a cal's PR #658 * WIP- Skip opls for now, full TrappE tests * Avoid accidental overwriting of typemap when using isomorphism * WIP- Remove unused import * Use enumerate for atom index while converting to TopologyGraph * Fix argument order * WIP- Add test for subtopology parameterization * Make opls/trappe global fixtures, Add tests for isomorphism * Further testing isomorphism * REVERT - skip OPLS tests * Copy scaling factors and combining rules after parametrization * Proper OPLS tests * WIP- Refactor the test module * WIP- Remove unused import * WIP- Add test for parameterization with impropers * WIP- Additional impropers test; Separate module for testing impropers * Minor refacotors; additional edge cases coverage/tests * Docstring minor fix * Remove rel_to_module as is obsolete in forcefield_utilities * Change trappe_ua to trappe-ua for correct loading * fix typo, add note about specific use case * pip install forcefield-utilites until new release Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Co Quach <daico007@gmail.com> Co-authored-by: Co Quach <43968221+daico007@users.noreply.github.com> Co-authored-by: Umesh Timalsina <umesh.timalsina@vanderbilt.edu>
- Loading branch information
1 parent
d0efd62
commit 7fe2dc5
Showing
24 changed files
with
1,450 additions
and
9 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,3 +10,4 @@ dependencies: | |
- pydantic < 1.9.0 | ||
- networkx | ||
- ele >= 0.2.0 | ||
- forcefield-utilities |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
"""GMSO functions that generate parameterized Topologies.""" | ||
from .parameterize import apply |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,114 @@ | ||
"""Utilities for atomtyping a gmso topology with foyer.""" | ||
from collections import namedtuple | ||
|
||
from foyer.atomtyper import AtomTypingRulesProvider, find_atomtypes | ||
from foyer.exceptions import FoyerError | ||
from foyer.topology_graph import TopologyGraph | ||
|
||
from gmso.core.atom import Atom | ||
from gmso.parameterization.subtopology_utils import subtop_bonds | ||
|
||
|
||
def get_topology_graph(gmso_topology, atomdata_populator=None): | ||
"""Return a TopologyGraph with relevant attributes from an GMSO topology. | ||
Parameters | ||
---------- | ||
gmso_topology: gmso.Topology-like | ||
The GMSO Topology | ||
atomdata_populator: callable, default=None | ||
A function that will be called with the following arguments `gmso_topology` as well as `atom` to pass extra | ||
arguments to the foyer.topology_graph.AtomData object | ||
Notes | ||
----- | ||
The gmso topology here is a duck type. | ||
Returns | ||
------- | ||
foyer.topology_graph.TopologyGraph | ||
A light networkx representation of the topology | ||
""" | ||
top_graph = TopologyGraph() | ||
atom_index_map = {} | ||
|
||
if len(gmso_topology.sites) == 0: | ||
raise FoyerError( | ||
"Cannot create a topology graph from a topology with no sites." | ||
) | ||
|
||
for j, atom in enumerate(gmso_topology.sites): | ||
atom_index_map[id(atom)] = j | ||
if isinstance(atom, Atom): | ||
kwargs = ( | ||
atomdata_populator(gmso_topology, atom) | ||
if atomdata_populator | ||
else {} | ||
) | ||
if atom.name.startswith("_"): | ||
top_graph.add_atom( | ||
name=atom.name, | ||
index=j, # Assumes order is preserved | ||
atomic_number=None, | ||
element=atom.name, | ||
**kwargs, | ||
) | ||
|
||
else: | ||
top_graph.add_atom( | ||
name=atom.name, | ||
index=j, # Assumes order is preserved | ||
atomic_number=atom.element.atomic_number, | ||
element=atom.element.symbol, | ||
**kwargs, | ||
) | ||
|
||
for top_bond in gmso_topology.bonds: | ||
atoms_indices = [ | ||
atom_index_map[id(atom)] for atom in top_bond.connection_members | ||
] | ||
top_graph.add_bond(atoms_indices[0], atoms_indices[1]) | ||
|
||
return top_graph | ||
|
||
|
||
def get_topology_graph_from_subtop(subtopology): | ||
"""Get an equivalent topology graph for a sub-topology.""" | ||
subtop_named_tuple = namedtuple("subtopology", ("sites", "bonds")) | ||
return get_topology_graph( | ||
subtop_named_tuple(subtopology.sites, subtop_bonds(subtopology)) | ||
) | ||
|
||
|
||
def get_atomtyping_rules_provider(gmso_ff): | ||
"""Return a foyer AtomTypingRulesProvider from a GMSO forcefield. | ||
Parameters | ||
---------- | ||
gmso_ff: gmso.core.forcefield.Forcefield | ||
The GMSO forcefield object to extract the rules from | ||
Returns | ||
------- | ||
AtomTypingRulesProvider | ||
The foyer.atomtyper.AtomTypingRulesProvider object used to parse atomtype definitions. | ||
Typically, SMARTS is the ruleset of choice. See https://github.com/mosdef-hub/foyer/issues/63 | ||
for curently supported features in Foyer. | ||
""" | ||
atom_type_defs = {} | ||
atom_type_overrides = {} | ||
for atom_type_name, atom_type in gmso_ff.atom_types.items(): | ||
if atom_type.definition: | ||
atom_type_defs[atom_type_name] = atom_type.definition | ||
if atom_type.overrides: | ||
atom_type_overrides[atom_type_name] = atom_type.overrides | ||
|
||
return AtomTypingRulesProvider( | ||
atom_type_defs, atom_type_overrides, gmso_ff.non_element_types | ||
) | ||
|
||
|
||
def typemap_dict(topology_graph, atomtyping_rules_provider, max_iter=10): | ||
"""Return a dictionary of typemap, by finding atomtypes in foyer.""" | ||
return find_atomtypes(topology_graph, atomtyping_rules_provider, max_iter) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
"""TopologyGraph Functions that identify molecules from isomorphism.""" | ||
from collections import deque | ||
|
||
import networkx as nx | ||
|
||
|
||
def top_node_match(n1, n2): | ||
"""Match two nodes in the topology graph based on their elements.""" | ||
return n1["atom_data"].element == n2["atom_data"].element | ||
|
||
|
||
def partition_isomorphic_topology_graphs(graph): | ||
"""Return a collection of isomorphic sets of the subgraphs of the Topology Graph. | ||
Parameters | ||
---------- | ||
graph: foyer.topology_graph.TopologyGraph | ||
The networkx subclassed TopologyGraph with data identifying the nodes | ||
and edges that make up a topology atom and bonds structure | ||
Returns | ||
------- | ||
isomorphic_elements: dict | ||
The keys are unique disconnected graphs, and the values are all | ||
identical subgraphs in the graph | ||
Notes | ||
----- | ||
See https://github.com/networkx/networkx/blob/main/networkx/algorithms/isomorphism/isomorphvf2.py | ||
from the networkx documentation about identifying isomorphic components | ||
""" | ||
graph_queue = deque( | ||
graph.subgraph(c) for c in nx.connected_components(graph) | ||
) | ||
|
||
graph_of_interest = graph_queue.popleft() | ||
isomorphic_elements = { | ||
graph_of_interest: [], | ||
} | ||
|
||
count = 0 | ||
first_mismatch = None | ||
|
||
while graph_queue: | ||
if graph_queue[0] == first_mismatch: | ||
count = 0 | ||
graph_of_interest = graph_queue.popleft() | ||
isomorphic_elements[graph_of_interest] = [] | ||
if graph_queue: | ||
graph = graph_queue.popleft() | ||
matcher = nx.algorithms.isomorphism.GraphMatcher( | ||
graph, graph_of_interest, node_match=top_node_match | ||
) | ||
if matcher.is_isomorphic(): | ||
isomorphic_elements[graph_of_interest].append( | ||
(graph, matcher.mapping) | ||
) | ||
else: | ||
if count == 0: | ||
first_mismatch = graph | ||
graph_queue.append(graph) | ||
count += 1 | ||
return isomorphic_elements |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
"""Functions used to atomtype a gmso.Topology.""" | ||
from gmso.parameterization.topology_parameterizer import ( | ||
TopologyParameterizationConfig, | ||
TopologyParameterizer, | ||
) | ||
|
||
__all__ = ["apply"] | ||
|
||
|
||
def apply( | ||
top, | ||
forcefields, | ||
identify_connections=False, | ||
identify_connected_components=True, | ||
use_residue_info=False, | ||
assert_bond_params=True, | ||
assert_angle_params=True, | ||
assert_dihedral_params=True, | ||
assert_improper_params=False, | ||
): | ||
"""Set Topology parameter types from GMSO ForceFields. | ||
Parameters | ||
---------- | ||
top: gmso.core.topology.Topology, required | ||
The GMSO topology on which to apply forcefields | ||
forcefields: ForceField or dict, required | ||
The forcefield to apply. If a dictionary is used the keys are labels that match | ||
the subtopology name, and the values are gmso ForceField objects that gets applied | ||
to the specified subtopology. | ||
Note: if a Topology with no subtopologies is provided, this option will only take | ||
a ForceField object. If a dictionary of ForceFields is provided, this method will | ||
fail. | ||
identify_connections: bool, optional, default=False | ||
If true, add connections identified using networkx graph matching to match | ||
the topology's bonding graph to smaller sub-graphs that correspond to an angle, | ||
dihedral, improper etc | ||
identify_connected_components: bool, optional, default=True | ||
A flag to determine whether or not to search the topology for repeated disconnected | ||
structures, otherwise known as molecules and type each molecule only once. | ||
use_residue_info: bool, optional, default=False | ||
A flag to determine whether or not to look at site.residue_name to look parameterize | ||
each molecule only once. Currently unused. | ||
assert_bond_params : bool, optional, default=True | ||
If True, an error is raised if parameters are not found for all system | ||
bonds. | ||
assert_angle_params : bool, optional, default=True | ||
If True, an error is raised if parameters are not found for all system | ||
angles. | ||
assert_dihedral_params : bool, optional, default=True | ||
If True, an error is raised if parameters are not found for all system | ||
proper dihedrals. | ||
assert_improper_params : bool, optional, default=False | ||
If True, an error is raised if parameters are not found for all system | ||
improper dihedrals. | ||
""" | ||
config = TopologyParameterizationConfig.parse_obj( | ||
dict( | ||
identify_connections=identify_connections, | ||
identify_connected_components=identify_connected_components, | ||
use_residue_info=use_residue_info, | ||
assert_bond_params=assert_bond_params, | ||
assert_angle_params=assert_angle_params, | ||
assert_dihedral_params=assert_dihedral_params, | ||
assert_improper_params=assert_improper_params, | ||
) | ||
) | ||
parameterizer = TopologyParameterizer( | ||
topology=top, forcefields=forcefields, config=config | ||
) | ||
|
||
parameterizer.run_parameterization() | ||
|
||
return parameterizer.topology |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
"""Utilities for application of a particular forcefield to a subtopology.""" | ||
|
||
|
||
def _members_in_subtop(connection, subtop): | ||
"""Check if all the members in a connection belong to a subtopology.""" | ||
return all(site in subtop.sites for site in connection.connection_members) | ||
|
||
|
||
def _subtop_connections(subtop, attr): | ||
"""Return all the connections belonging to a subtopology.""" | ||
return filter( | ||
lambda conn: _members_in_subtop(conn, subtop), | ||
getattr(subtop._parent, attr), | ||
) | ||
|
||
|
||
def subtop_bonds(subtop): | ||
"""Given a subtopology, return its bonds.""" | ||
return _subtop_connections(subtop, "bonds") | ||
|
||
|
||
def subtop_angles(subtop): | ||
"""Given a subtopology, return its angles.""" | ||
return _subtop_connections(subtop, "angles") | ||
|
||
|
||
def subtop_dihedrals(subtop): | ||
"""Given a subtopology, return its dihedrals.""" | ||
return _subtop_connections(subtop, "dihedrals") | ||
|
||
|
||
def subtop_impropers(subtop): | ||
"""Given a subtopology, return its impropers.""" | ||
return _subtop_connections(subtop, "impropers") | ||
|
||
|
||
def assert_no_boundary_bonds(subtop): | ||
"""Given a subtopology, assert that no bonds exist between its sites and external sites.""" | ||
for bond in subtop._parent.bonds: | ||
site_pairs = bond.connection_members | ||
assertion_msg = "Site {} is in the subtopology {}, but its bonded partner {} is not." | ||
|
||
if site_pairs[0] in subtop.sites: | ||
assert site_pairs[1] in subtop.sites, assertion_msg.format( | ||
site_pairs[0].name, subtop.name, site_pairs[1].name | ||
) | ||
elif site_pairs[1] in subtop.sites: | ||
assert site_pairs[0] in subtop.sites, assertion_msg.format( | ||
site_pairs[1].name, subtop.name, site_pairs[0].name | ||
) |
Oops, something went wrong.