Adapt cpp representation input parameters to the ones on the python side #376

Open
wants to merge 18 commits into
base: master

Contributor

@agoscinski agoscinski commented Aug 24, 2021

Changes:
- cpp representation and kernel information are now included in rascal's model
json file.

- weights are now always flattened to one dimension for consistency in the
model format.

- updated the rascal model jsons used in the tests

This makes it possible to adapt an existing model, so that it is usable with the LAMMPS rascal interface, simply by loading it and dumping it again:

import rascal.utils
model = rascal.utils.load_obj('rascal_model.json')
rascal.utils.dump_obj('rascal_model.json', model)

I am aware that #305 discusses more general changes to the format, but this PR only adds cpp information to the model json, so I think we can keep it as a separate PR. I would also like to avoid requiring people to use a branch just for 3 lines of code. I think PR #305 will still take some time until it is merged.

Tests in tests/python/md_calculator_test.py should cover the changes.

! This PR does not break backwards compatibility with old rascal gap model jsons !

I decided to spend the time to adapt the c++ interface to the python interface, because undetected inconsistencies between the cpp and python hypers can arise far too easily. A consistency check would only be possible on the python side; from the c++ side we cannot really do it (without putting in a similar effort to adapting the hypers on both sides), and I see a lot of annoying-to-debug problems arising there, especially with default arguments on both sides.

Even simple errors in the json file (e.g. a typo in a json key such as "cutof") are almost untraceable in the current state without printing the hell out of the classes (basically, the error message says "there is something wrong somewhere in the json"), so I updated all error messages for reading the json file. This is also important so that a user of the c++ interface gets meaningful errors, and it makes debugging the tests dramatically easier for us. For this I just add the FILE and LINE information to each error, similar to how it is done in LAMMPS. I have a macro for FILENAME so that it takes only the path relative to rascal and not the absolute path (e.g. not "/ssd/local/code/librascal/src/rascal/structure_manager/structure_manager.hh" but only "src/rascal/structure_manager/structure_manager.hh").
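As a rough Python analogue of the error reporting described above (the PR itself does this with C++ macros), the sketch below rejects an unknown hypers key and prefixes the message with a project-relative file and line. The names `check_hyper_keys` and `short_path`, the allowed keys, and the line number are assumptions made for illustration; they are not part of rascal.

```python
import difflib
import posixpath


def short_path(path, root):
    """Return `path` relative to `root`, like a project-relative FILENAME."""
    return posixpath.relpath(path, root)


def check_hyper_keys(hypers, allowed_keys, source_file, line, root):
    """Raise a ValueError that names the unknown key and where it was detected.

    Hypothetical helper: `source_file` and `line` stand in for the values the
    C++ side would take from its FILE and LINE information.
    """
    for key in hypers:
        if key not in allowed_keys:
            # Suggest the closest known key, if there is one.
            close = difflib.get_close_matches(key, allowed_keys, n=1)
            hint = f" Did you mean '{close[0]}'?" if close else ""
            location = f"{short_path(source_file, root)}:{line}"
            raise ValueError(f"{location}: unknown key '{key}' in hypers.{hint}")


try:
    check_hyper_keys(
        {"cutof": 3.0},                    # typo taken from the example above
        ["cutoff", "max_radial"],          # illustrative allowed keys
        "/ssd/local/code/librascal/src/rascal/structure_manager/structure_manager.hh",
        42,                                # illustrative line number
        "/ssd/local/code/librascal",
    )
except ValueError as err:
    print(err)
```

The point of the sketch is only that the message carries both the offending key and a short location, so a typo can be found without inspecting the classes.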

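So, to sum up, the two main changes are now the adaptation of the c++ representation hypers to the python ones and the more informative error messages when reading the json file.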

Additional small changes:

  • Added a default value of "Constant" for the representation hyper gaussian_sigma_type, which is the only value it can take.
  • coefficient_subselection can now be an empty dict (which is now its default), because deleting it from the hypers was inconsistent with how we treat optimization and cutoff_function_parameters.
  • Passing a global_list that is not a list now raises an error. Before, it was always converted to a list, even when it was not an int, which caused errors on the cpp side.
  • Fixed update_hyperparameters.
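The first three points above could look roughly like the following on the Python side. This is a minimal sketch under assumptions: `prepare_hypers` is a hypothetical helper and not rascal's actual implementation; only the key names and the default values are taken from the list above.

```python
def prepare_hypers(hypers):
    """Return a copy of `hypers` with defaults filled in and inputs checked.

    Hypothetical helper for illustration only.
    """
    prepared = dict(hypers)
    # "Constant" is the only value gaussian_sigma_type can take.
    prepared.setdefault("gaussian_sigma_type", "Constant")
    # An empty dict is kept instead of deleting the key from the hypers.
    prepared.setdefault("coefficient_subselection", {})
    # A non-list global_list is rejected instead of being silently wrapped.
    if "global_list" in prepared and not isinstance(prepared["global_list"], list):
        raise TypeError(
            "global_list must be a list, got "
            f"{type(prepared['global_list']).__name__}"
        )
    return prepared


print(prepare_hypers({"global_list": [1, 6]}))
```

Raising early on the Python side keeps a malformed value from reaching the cpp side, where the resulting error is harder to trace.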

Supplementary

To transform the ubjson reference data (in the cpp tests), I used the following script, so that the reference results are left untouched:

import json
import ubjson

for filename in ['spherical_expansion_reference', 'spherical_invariants_reference', 'spherical_covariants_reference']:
    with open(filename + '.json') as f:
        data = json.load(f)
    for rep_infos in data['rep_info']:
        for rep_info in rep_infos:
            # hypers is modified in place, so data is updated as well
            hypers = rep_info['hypers']

            # cutoff_function
            if 'cutoff_function' in hypers:
                cutoff_function = hypers.pop('cutoff_function')
                hypers['interaction_cutoff'] = cutoff_function['cutoff']['value']
                hypers['cutoff_smooth_width'] = cutoff_function['smooth_width']['value']
                hypers['cutoff_function_type'] = cutoff_function['type']
                if cutoff_function['type'] == "RadialScaling":
                    hypers['cutoff_function_parameters'] = {key: cutoff_function[key]['value'] for key in ['exponent', 'scale', 'rate']}

            # gaussian_density
            if 'gaussian_density' in hypers:
                gaussian_density = hypers.pop('gaussian_density')
                hypers['gaussian_sigma_constant'] = gaussian_density['gaussian_sigma']['value']
                hypers['gaussian_sigma_type'] = gaussian_density['type']

            # radial_contribution
            if 'radial_contribution' in hypers:
                radial_contribution = hypers.pop('radial_contribution')
                hypers['radial_basis'] = radial_contribution['type']
                if 'optimization' in radial_contribution:
                    hypers['optimization'] = radial_contribution['optimization']

            # default parameters which have changed
            if 'compute_gradients' not in hypers:
                hypers['compute_gradients'] = False
            if 'expansion_by_species_method' not in hypers:
                hypers['expansion_by_species_method'] = "environment wise"

    with open(filename + '.ubjson', "wb") as f:
        ubjson.dump(data, f)

The same was done for kernel_reference.ubjson, for the hypers in data['rep_info']['spherical_invariants'][i][j]['hypers_rep'].
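For the kernel reference only the traversal differs. The sketch below shows that traversal on a small in-memory dictionary, with a stand-in conversion that handles only gaussian_density. The function names and the sample values are assumptions for illustration; the real reference file is much larger and needs the full conversion.

```python
def flatten_gaussian_density(hypers):
    """Stand-in for the full per-entry conversion (gaussian_density only)."""
    if "gaussian_density" in hypers:
        gaussian_density = hypers.pop("gaussian_density")
        hypers["gaussian_sigma_constant"] = gaussian_density["gaussian_sigma"]["value"]
        hypers["gaussian_sigma_type"] = gaussian_density["type"]


def convert_kernel_reference(data, convert_hypers):
    """Apply `convert_hypers` in place to every hypers_rep entry."""
    for rep_infos in data["rep_info"]["spherical_invariants"]:
        for rep_info in rep_infos:
            convert_hypers(rep_info["hypers_rep"])
    return data


# Illustrative sample, not taken from the actual reference data.
sample = {
    "rep_info": {
        "spherical_invariants": [
            [
                {
                    "hypers_rep": {
                        "gaussian_density": {
                            "type": "Constant",
                            "gaussian_sigma": {"value": 0.4, "unit": "AA"},
                        }
                    }
                }
            ]
        ]
    }
}

convert_kernel_reference(sample, flatten_gaussian_density)
print(sample["rep_info"]["spherical_invariants"][0][0]["hypers_rep"])
```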

@agoscinski agoscinski requested a review from max-veit August 24, 2021 12:04
@agoscinski agoscinski requested review from max-veit and removed request for max-veit October 20, 2021 15:10
Contributor

@max-veit max-veit left a comment

Looks good; my only worry is the duplication of information between the C++ and Python representation info. If someone were to decide to edit this file (JSON is human-readable, after all), this might result in a Python object with inconsistent parameters in C++, which would be bad. One way to prevent this would be to have a consistency check (i.e. that the C++ parameters match what is constructed using the Python parameters) - although ideally, we wouldn't need to duplicate this information at all, probably by just storing the C++ representation and having the Python load/save functions work with this format.
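A consistency check of the kind suggested here could be sketched as a recursive comparison of the two parameter dictionaries. This is only an illustration: `find_mismatches` is a hypothetical helper, and it assumes that both sides use the same key names and nesting.

```python
def find_mismatches(python_params, cpp_params, prefix=""):
    """Return a list of human-readable differences between two hypers dicts."""
    mismatches = []
    for key in sorted(set(python_params) | set(cpp_params)):
        path = f"{prefix}{key}"
        if key not in python_params:
            mismatches.append(f"{path}: missing on the Python side")
        elif key not in cpp_params:
            mismatches.append(f"{path}: missing on the C++ side")
        else:
            py_value, cpp_value = python_params[key], cpp_params[key]
            if isinstance(py_value, dict) and isinstance(cpp_value, dict):
                # Descend into nested parameter groups.
                mismatches.extend(
                    find_mismatches(py_value, cpp_value, prefix=path + ".")
                )
            elif py_value != cpp_value:
                mismatches.append(
                    f"{path}: {py_value!r} (Python) != {cpp_value!r} (C++)"
                )
    return mismatches


# Illustrative values: a hand-edited file where only one side was changed.
python_side = {"interaction_cutoff": 3.0, "optimization": {"type": "Spline"}}
cpp_side = {"interaction_cutoff": 3.5, "optimization": {"type": "Spline"}}
print(find_mismatches(python_side, cpp_side))
```

A loader could raise an error whenever the returned list is non-empty, which would catch hand edits that change only one of the two copies.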

if "cpp_kernel" in data.keys():
self._kernel = self._kernel.from_dict(data["cpp_kernel"])
else:
print(
Contributor

Consider replacing with LOGGER.warning() -- NB needs the following near the imports:

import logging
LOGGER = logging.getLogger(__name__)

(see e.g. bindings/rascal/utils/filter.py for an example of how this can be done)
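A minimal sketch of this suggestion, assuming a hypothetical helper `kernel_data_from_model` and a made-up message text. It uses `LOGGER.warning`, the non-deprecated spelling of `warn`. What the real code does in its `else` branch is not shown in the diff above, so returning `None` here is an assumption.

```python
import logging

LOGGER = logging.getLogger(__name__)


def kernel_data_from_model(data):
    """Return the stored 'cpp_kernel' entry, or None after logging a warning."""
    if "cpp_kernel" in data:
        return data["cpp_kernel"]
    # A logger call instead of print() lets users filter or redirect the message.
    LOGGER.warning("No 'cpp_kernel' entry found in the model data.")
    return None
```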

Contributor Author

added LOGGER

@@ -5236,4 +5332,4 @@
},
"module_name": "rascal.models.krr",
"version": "0.1"
}
}
Contributor

Suggested change
}
}

(git doesn't like it when files end without newlines)

Contributor Author

added

@@ -5119,7 +5119,52 @@
"init_params": {
"representation": {
"class_name": "SphericalInvariants",
"data": {},
"data": {
Contributor

This duplicates all the information already contained below (in "init_params"), which is not ideal

Contributor Author

issue has been solved by adapting cpp input parameters to python input parameters

@agoscinski
Contributor Author

agoscinski commented Dec 6, 2021

See the updated PR summary at the top.

@agoscinski agoscinski changed the title from "Add cpp information to rascal model json" to "Adapt cpp representation input parameters to the ones on the python side" Dec 6, 2021
@agoscinski agoscinski requested a review from max-veit December 6, 2021 17:15