Skip to content

Commit

Permalink
Atomtype parameterization methods for parameterizing a topology in GM…
Browse files Browse the repository at this point in the history
…SO. (#644)

* Atomtype parameterization methods for parameterizing a topology in GMSO.

This PR will add the atomtyping module to GMSO for passing a GMSO.Forcefield
and a GMSO.Topology and match atomtype using foyer as the backend. Then,
the corresponding connection types will be found in the Forcefield and
applied to the connections in the topology.

* Create parameterize.py, which has the apply function which can take a
topology, and a gmso forcefield to apply to it. This can use subgraph
isomorphism to identify molecular structures in the topology through
the bondgraph and bin those into unique molecules that match a specified
forcefield. This apply function can also do the standard atomtyping of
the entire topology in one step.
* Create isomorph.py which uses networkx graphs to identify disconnected
components and isomorphism to identify repeated structures.

* Move module imports for apply into atomtyping to prevent circular imports

* Add a quick fix which will do atomtyping if no residue flag is passed

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* correctly update sites when adding subtop

* Changes to doc strings for clarity. Add a subtop_label variable to generalize where the molecule definition is pulled from

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* More modular architecture

* WIP- Add testing, minor refactoring of the APIs.

* WIP- Better error handling

* Misc Changes

1. Update Topology after parametrization
2. Add dependency for forcefield_utilities in env files
3. Add tests for trappe forcefield
4. Patch parmed (should be moved to a cal's PR #658

* WIP- Skip opls for now, full TrappE tests

* Avoid accidental overwriting of typemap when using isomorphism

* WIP- Remove unused import

* Use enumerate for atom index while converting to TopologyGraph

* Fix argument order

* WIP- Add test for subtopology parameterization

* Make opls/trappe global fixtures, Add tests for isomorphism

* Further testing isomorphism

* REVERT - skip OPLS tests

* Copy scaling factors and combining rules after parametrization

* Proper OPLS tests

* WIP- Refactor the test module

* WIP- Remove unused import

* WIP- Add test for parameterization with impropers

* WIP- Additional impropers test; Separate module for testing impropers

* Minor refacotors; additional edge cases coverage/tests

* Docstring minor fix

* Remove rel_to_module as is obsolete in forcefield_utilities

* Change trappe_ua to trappe-ua for correct loading

* fix typo, add note about specific use case

* pip install forcefield-utilites until new release

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Co Quach <daico007@gmail.com>
Co-authored-by: Co Quach <43968221+daico007@users.noreply.github.com>
Co-authored-by: Umesh Timalsina <umesh.timalsina@vanderbilt.edu>
  • Loading branch information
5 people authored Jun 15, 2022
1 parent d0efd62 commit 7fe2dc5
Show file tree
Hide file tree
Showing 24 changed files with 1,450 additions and 9 deletions.
4 changes: 3 additions & 1 deletion environment-dev.yml
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ dependencies:
- pytest
- mbuild >= 0.11.0
- openbabel >= 3.0.0
- foyer >= 0.9.4
- foyer >= 0.11.1
- gsd >= 2.0
- parmed >= 3.4.3
- pytest-cov
Expand All @@ -22,3 +22,5 @@ dependencies:
- ipywidgets
- ele >= 0.2.0
- pre-commit
- pip:
- "--editable=git+https://github.com/mosdef-hub/forcefield-utilities.git#egg=forcefield-utilities"
1 change: 1 addition & 0 deletions environment.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,4 @@ dependencies:
- pydantic < 1.9.0
- networkx
- ele >= 0.2.0
- forcefield-utilities
2 changes: 1 addition & 1 deletion gmso/core/topology.py
Original file line number Diff line number Diff line change
Expand Up @@ -844,7 +844,7 @@ def add_subtopology(self, subtop, update=True):
"""
self._subtops.add(subtop)
subtop.parent = self
self._sites.union(subtop.sites)
self._sites = self._sites.union(subtop.sites)
if update:
self.update_topology()

Expand Down
10 changes: 6 additions & 4 deletions gmso/external/convert_parmed.py
Original file line number Diff line number Diff line change
Expand Up @@ -588,10 +588,12 @@ def to_parmed(top, refer_type=True):
# Set up Parmed structure and define general properties
structure = pmd.Structure()
structure.title = top.name
structure.box = np.concatenate(
(
top.box.lengths.to("angstrom").value,
top.box.angles.to("degree").value,
structure.box = (
np.concatenate(
(
top.box.lengths.to("angstrom").value,
top.box.angles.to("degree").value,
)
)
if top.box
else None
Expand Down
2 changes: 2 additions & 0 deletions gmso/parameterization/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
"""GMSO functions that generate parameterized Topologies."""
from .parameterize import apply
114 changes: 114 additions & 0 deletions gmso/parameterization/foyer_utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
"""Utilities for atomtyping a gmso topology with foyer."""
from collections import namedtuple

from foyer.atomtyper import AtomTypingRulesProvider, find_atomtypes
from foyer.exceptions import FoyerError
from foyer.topology_graph import TopologyGraph

from gmso.core.atom import Atom
from gmso.parameterization.subtopology_utils import subtop_bonds


def get_topology_graph(gmso_topology, atomdata_populator=None):
"""Return a TopologyGraph with relevant attributes from an GMSO topology.
Parameters
----------
gmso_topology: gmso.Topology-like
The GMSO Topology
atomdata_populator: callable, default=None
A function that will be called with the following arguments `gmso_topology` as well as `atom` to pass extra
arguments to the foyer.topology_graph.AtomData object
Notes
-----
The gmso topology here is a duck type.
Returns
-------
foyer.topology_graph.TopologyGraph
A light networkx representation of the topology
"""
top_graph = TopologyGraph()
atom_index_map = {}

if len(gmso_topology.sites) == 0:
raise FoyerError(
"Cannot create a topology graph from a topology with no sites."
)

for j, atom in enumerate(gmso_topology.sites):
atom_index_map[id(atom)] = j
if isinstance(atom, Atom):
kwargs = (
atomdata_populator(gmso_topology, atom)
if atomdata_populator
else {}
)
if atom.name.startswith("_"):
top_graph.add_atom(
name=atom.name,
index=j, # Assumes order is preserved
atomic_number=None,
element=atom.name,
**kwargs,
)

else:
top_graph.add_atom(
name=atom.name,
index=j, # Assumes order is preserved
atomic_number=atom.element.atomic_number,
element=atom.element.symbol,
**kwargs,
)

for top_bond in gmso_topology.bonds:
atoms_indices = [
atom_index_map[id(atom)] for atom in top_bond.connection_members
]
top_graph.add_bond(atoms_indices[0], atoms_indices[1])

return top_graph


def get_topology_graph_from_subtop(subtopology):
"""Get an equivalent topology graph for a sub-topology."""
subtop_named_tuple = namedtuple("subtopology", ("sites", "bonds"))
return get_topology_graph(
subtop_named_tuple(subtopology.sites, subtop_bonds(subtopology))
)


def get_atomtyping_rules_provider(gmso_ff):
"""Return a foyer AtomTypingRulesProvider from a GMSO forcefield.
Parameters
----------
gmso_ff: gmso.core.forcefield.Forcefield
The GMSO forcefield object to extract the rules from
Returns
-------
AtomTypingRulesProvider
The foyer.atomtyper.AtomTypingRulesProvider object used to parse atomtype definitions.
Typically, SMARTS is the ruleset of choice. See https://github.com/mosdef-hub/foyer/issues/63
for curently supported features in Foyer.
"""
atom_type_defs = {}
atom_type_overrides = {}
for atom_type_name, atom_type in gmso_ff.atom_types.items():
if atom_type.definition:
atom_type_defs[atom_type_name] = atom_type.definition
if atom_type.overrides:
atom_type_overrides[atom_type_name] = atom_type.overrides

return AtomTypingRulesProvider(
atom_type_defs, atom_type_overrides, gmso_ff.non_element_types
)


def typemap_dict(topology_graph, atomtyping_rules_provider, max_iter=10):
"""Return a dictionary of typemap, by finding atomtypes in foyer."""
return find_atomtypes(topology_graph, atomtyping_rules_provider, max_iter)
63 changes: 63 additions & 0 deletions gmso/parameterization/isomorph.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
"""TopologyGraph Functions that identify molecules from isomorphism."""
from collections import deque

import networkx as nx


def top_node_match(n1, n2):
"""Match two nodes in the topology graph based on their elements."""
return n1["atom_data"].element == n2["atom_data"].element


def partition_isomorphic_topology_graphs(graph):
"""Return a collection of isomorphic sets of the subgraphs of the Topology Graph.
Parameters
----------
graph: foyer.topology_graph.TopologyGraph
The networkx subclassed TopologyGraph with data identifying the nodes
and edges that make up a topology atom and bonds structure
Returns
-------
isomorphic_elements: dict
The keys are unique disconnected graphs, and the values are all
identical subgraphs in the graph
Notes
-----
See https://github.com/networkx/networkx/blob/main/networkx/algorithms/isomorphism/isomorphvf2.py
from the networkx documentation about identifying isomorphic components
"""
graph_queue = deque(
graph.subgraph(c) for c in nx.connected_components(graph)
)

graph_of_interest = graph_queue.popleft()
isomorphic_elements = {
graph_of_interest: [],
}

count = 0
first_mismatch = None

while graph_queue:
if graph_queue[0] == first_mismatch:
count = 0
graph_of_interest = graph_queue.popleft()
isomorphic_elements[graph_of_interest] = []
if graph_queue:
graph = graph_queue.popleft()
matcher = nx.algorithms.isomorphism.GraphMatcher(
graph, graph_of_interest, node_match=top_node_match
)
if matcher.is_isomorphic():
isomorphic_elements[graph_of_interest].append(
(graph, matcher.mapping)
)
else:
if count == 0:
first_mismatch = graph
graph_queue.append(graph)
count += 1
return isomorphic_elements
82 changes: 82 additions & 0 deletions gmso/parameterization/parameterize.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
"""Functions used to atomtype a gmso.Topology."""
from gmso.parameterization.topology_parameterizer import (
TopologyParameterizationConfig,
TopologyParameterizer,
)

__all__ = ["apply"]


def apply(
top,
forcefields,
identify_connections=False,
identify_connected_components=True,
use_residue_info=False,
assert_bond_params=True,
assert_angle_params=True,
assert_dihedral_params=True,
assert_improper_params=False,
):
"""Set Topology parameter types from GMSO ForceFields.
Parameters
----------
top: gmso.core.topology.Topology, required
The GMSO topology on which to apply forcefields
forcefields: ForceField or dict, required
The forcefield to apply. If a dictionary is used the keys are labels that match
the subtopology name, and the values are gmso ForceField objects that gets applied
to the specified subtopology.
Note: if a Topology with no subtopologies is provided, this option will only take
a ForceField object. If a dictionary of ForceFields is provided, this method will
fail.
identify_connections: bool, optional, default=False
If true, add connections identified using networkx graph matching to match
the topology's bonding graph to smaller sub-graphs that correspond to an angle,
dihedral, improper etc
identify_connected_components: bool, optional, default=True
A flag to determine whether or not to search the topology for repeated disconnected
structures, otherwise known as molecules and type each molecule only once.
use_residue_info: bool, optional, default=False
A flag to determine whether or not to look at site.residue_name to look parameterize
each molecule only once. Currently unused.
assert_bond_params : bool, optional, default=True
If True, an error is raised if parameters are not found for all system
bonds.
assert_angle_params : bool, optional, default=True
If True, an error is raised if parameters are not found for all system
angles.
assert_dihedral_params : bool, optional, default=True
If True, an error is raised if parameters are not found for all system
proper dihedrals.
assert_improper_params : bool, optional, default=False
If True, an error is raised if parameters are not found for all system
improper dihedrals.
"""
config = TopologyParameterizationConfig.parse_obj(
dict(
identify_connections=identify_connections,
identify_connected_components=identify_connected_components,
use_residue_info=use_residue_info,
assert_bond_params=assert_bond_params,
assert_angle_params=assert_angle_params,
assert_dihedral_params=assert_dihedral_params,
assert_improper_params=assert_improper_params,
)
)
parameterizer = TopologyParameterizer(
topology=top, forcefields=forcefields, config=config
)

parameterizer.run_parameterization()

return parameterizer.topology
50 changes: 50 additions & 0 deletions gmso/parameterization/subtopology_utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
"""Utilities for application of a particular forcefield to a subtopology."""


def _members_in_subtop(connection, subtop):
"""Check if all the members in a connection belong to a subtopology."""
return all(site in subtop.sites for site in connection.connection_members)


def _subtop_connections(subtop, attr):
"""Return all the connections belonging to a subtopology."""
return filter(
lambda conn: _members_in_subtop(conn, subtop),
getattr(subtop._parent, attr),
)


def subtop_bonds(subtop):
"""Given a subtopology, return its bonds."""
return _subtop_connections(subtop, "bonds")


def subtop_angles(subtop):
"""Given a subtopology, return its angles."""
return _subtop_connections(subtop, "angles")


def subtop_dihedrals(subtop):
"""Given a subtopology, return its dihedrals."""
return _subtop_connections(subtop, "dihedrals")


def subtop_impropers(subtop):
"""Given a subtopology, return its impropers."""
return _subtop_connections(subtop, "impropers")


def assert_no_boundary_bonds(subtop):
"""Given a subtopology, assert that no bonds exist between its sites and external sites."""
for bond in subtop._parent.bonds:
site_pairs = bond.connection_members
assertion_msg = "Site {} is in the subtopology {}, but its bonded partner {} is not."

if site_pairs[0] in subtop.sites:
assert site_pairs[1] in subtop.sites, assertion_msg.format(
site_pairs[0].name, subtop.name, site_pairs[1].name
)
elif site_pairs[1] in subtop.sites:
assert site_pairs[0] in subtop.sites, assertion_msg.format(
site_pairs[1].name, subtop.name, site_pairs[0].name
)
Loading

0 comments on commit 7fe2dc5

Please sign in to comment.