Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update molecule mapping #475

Merged
merged 84 commits into from
Jan 31, 2022
Merged

Update molecule mapping #475

merged 84 commits into from
Jan 31, 2022

Conversation

SamTov
Copy link
Member

@SamTov SamTov commented Jan 21, 2022

The idea in this PR is that I wanted to add an analysis of water as a test and example and in doing so, go through MDSuite functionality and identify bugs and repair them such that the study can continue. I will keep adding small notes on what has changed to aid in the review. It is nothing large just small bug fixes that arise when trying to use the software together. Along with the fixes, I am also writing specific tests to avoid these things from breaking in the future.

Summary of additions and changes

Anything Bold-fonted is now included to some extent in a test. To the extent that a test will fail if something that was fixed in this PR is violated in the future.

  • Allowed for building molecule from arbitrary species, i.e. one can now write {'HW1': 1, 'OW': 1, 'HW2': 1} and it will build water molecules with these species. This makes it easier to work with GROMACS where typically species are given as a ligand name or simple a numbered species based on its location in a molecule.
  • Removed molecules from the experiment species dict and updated transformations to look through both species and molecules dicts when performing transformation
  • Fixed bug in batching for angular distribution function that resulted in errors. This is resolved by setting the default mini batch in the ADF to -1 rather than 50 but I have also added a check in the get_triplets method to completely avoid this issue from occurring.
  • Adjust ADF such that it can work with molecules
  • Update transformations so that they work with molecules.
  • Add integration test to rigorously check that the correct water molecules are built from a GROMACS simulation
  • Add Functional test to check that MDSuite runs with and without molecules for several calculators
  • Adjusted length of some tqdm bars so that they remain at 70 columns
  • Added tutorial notebook for mapping molecules.
  • Adapted ADF to work with selected atoms
  • Started adapting RDF to work with selected atoms, see notes below.
  • Fix indexing error for molecule mapping that resulted in a false reference. This was visible, notably, in the centre molecule of the SDF.
  • Fix critical bug in the ensemble generation for remainders and for system-wide properties
  • Move Graph related operations to the graph module
  • Changed default chemfiles positions property to positions rather that unwrapped positions.
  • Removed indices from molecules, I believe closing Check missing indicies #432
  • Allowed for molecule pbc from gromacs

How to test this pull request

Tests will be added along with a notebook discussing the different methods for building molecules and how they work which can also go into the documentation.

Notes

Reviews are welcomed at any point in the PR as well as ideas for improving the mapping module as I go through it. I have some changes planned in terms of its overall functionality so perhaps it is better to wait for that but the choice is yours.

TODO

  • Fix issue in RDF for atom-selection. I think it has something to do with one entry having only a single atom as the distance tensor is computed correctly but the masking fails.
  • Resolve issue with chemfiles return. Update, see here
  • Include more tests for the graph module and update functional test with real data
  • Add proper data to DataHub and enforce stricter checks to ensure same bugs don't occur again. This is done but the tests not written as the coordination numbers are broken. TBF in another PR.

Problem with RDF

The RDF seems to return all zeros when trying to use selected atoms to compute it. The distance tensor is correct but during the binning method all values in the tensors are set to zero even if in the distance tensor they are within the cutoff.

Copy link
Member

@PythonFZ PythonFZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is still a draft, I don't really know what to review

mdsuite/transformations/map_molecules.py Outdated Show resolved Hide resolved
CI/functional_tests/test_water_study.py Outdated Show resolved Hide resolved
CI/unit_tests/graph_modules/test_molecular_graph.py Outdated Show resolved Hide resolved
mdsuite/transformations/map_molecules.py Outdated Show resolved Hide resolved
mdsuite/transformations/map_molecules.py Outdated Show resolved Hide resolved
mdsuite/transformations/map_molecules.py Outdated Show resolved Hide resolved
mdsuite/transformations/transformation_dict.py Outdated Show resolved Hide resolved
mdsuite/transformations/transformations.py Show resolved Hide resolved
@PythonFZ
Copy link
Member

The Example Notebook fails on Binder for the ADF calculation due to memory issues. This is not related to this PR, but I want to leave the link here so you can test out the changes in a clean environment Binder

@SamTov
Copy link
Member Author

SamTov commented Jan 31, 2022

The Example Notebook fails on Binder for the ADF calculation due to memory issues. This is not related to this PR, but I want to leave the link here so you can test out the changes in a clean environment Binder

This is however related to #476

Copy link
Member

@PythonFZ PythonFZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still struggeling with the dataclass tests but besides them looking forward to merge it.

CI/unit_tests/utils/test_meta_functions.py Outdated Show resolved Hide resolved
CI/unit_tests/utils/test_molecule_class.py Show resolved Hide resolved
CI/unit_tests/utils/test_molecule_class.py Show resolved Hide resolved
mdsuite/graph_modules/molecular_graph.py Outdated Show resolved Hide resolved
mdsuite/transformations/map_molecules.py Outdated Show resolved Hide resolved
@SamTov
Copy link
Member Author

SamTov commented Jan 31, 2022

I'm still struggeling with the dataclass tests but besides them looking forward to merge it.

The data class that you are referring to is the issue that you agreed to handle at a separate time and is open in #477. The initial points for which you wanted a data class are now a data class in the tests and in the code.

@PythonFZ PythonFZ mentioned this pull request Jan 31, 2022
@SamTov SamTov merged commit 6ed32d4 into main Jan 31, 2022
@SamTov SamTov deleted the SamTov_Molecule_Mapping branch January 31, 2022 14:45
@christophlohrmann
Copy link
Collaborator

this PR should have really been squashed before merge with all the back and forth + format litter

@PythonFZ
Copy link
Member

this PR should have really been squashed before merge with all the back and forth + format litter

I would propose to almost always use squash and merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p0-urgent Must be fixed immediately.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Check missing indicies
3 participants