Adapt cpp representation input parameters to the ones on the python side #376

agoscinski · 2021-08-24T10:51:46Z

~~Changes:~~
- cpp representation and kernel information is now included in rascal's model
json file.
- weights are now always flattened to one dimension for consistency in the
models format.
~~- update rascal models json in tests~~

~~This allows you to simply adapt your model so that it is usable with the lammps rascal interface:~~

import rascal.utils
model = rascal.utils.load_obj('rascal_model.json')
rascal.utils.dump_obj('rascal_model.json', model)

I am aware that #305 discusses more general changes to the format, but this PR only adds cpp information to the model json, so I think it can stay a separate PR. I would also like people not to have to use a branch just for 3 lines of code, and I think PR #305 will still take some time to be merged.

~~Tests in tests/python/md_calculator_test.py should cover the changes~~

! This PR does not break backwards compatibility to old rascal gap model jsons !

I decided to spend the time to adapt the c++ interface to the python interface, because it is too easy for undetected inconsistencies between cpp and python hypers to arise. We could do a consistency check on the python side only, but from the c++ side we cannot really do it (without putting in similar effort as adapting the hypers on both sides), and I see a lot of annoying-to-debug problems arising there, especially with default arguments on both sides.

Even simple errors in the json file (e.g. a typo in a json key such as "cutof") are almost untraceable in the current state without printing the hell out of classes (basically the error message says "there is something wrong somewhere in the json"), so I updated all error messages for reading the json file. This is also important so that a user of the c++ interface gets meaningful errors, and it makes debugging the tests dramatically easier. For this I just add the FILE and LINE information to each error, similar to how it is done in LAMMPS. I have a macro for FILENAME so that it takes only the path relative to rascal and not the absolute path (e.g. not "/ssd/local/code/librascal/src/rascal/structure_manager/structure_manager.hh" but only "src/rascal/structure_manager/structure_manager.hh").
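The path trimming and the file/line prefix described above can be illustrated with a small Python analogue. This is only a sketch of the idea: the PR implements it with a C++ macro, and the function names, the `project_root` argument, and the `file:line: message` layout used here are assumptions for illustration, not the actual librascal error format.

```python
def relative_source_path(absolute_path, project_root):
    """Strip the project root so that only the in-repository path remains."""
    prefix = project_root.rstrip("/") + "/"
    if absolute_path.startswith(prefix):
        return absolute_path[len(prefix):]
    # Path lies outside the project root: leave it unchanged.
    return absolute_path


def format_json_error(message, absolute_path, line, project_root):
    """Prefix an error message with the relative file path and line number.

    The "file:line: message" layout is an assumption made for this sketch.
    """
    location = relative_source_path(absolute_path, project_root)
    return f"{location}:{line}: {message}"


if __name__ == "__main__":
    print(
        format_json_error(
            "key 'cutof' is not a known hyperparameter",  # made-up message
            "/ssd/local/code/librascal/src/rascal/structure_manager/structure_manager.hh",
            42,  # made-up line number
            "/ssd/local/code/librascal",
        )
    )
```

Keeping only the relative path makes the message identical on every machine, which helps when comparing error output between a local build and CI.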

So to sum up, the two main changes are now:

- change c++ hyper key names and default arguments to be the same as on the python side
- user-friendly json error messages for simple mistakes in hypers (includes a fix for issue "Segfault with bad values of max_radial/max_angular" #393)

Additional small changes:

- Added a default value of "Constant" for the representation hyper gaussian_sigma_type, which is the only value it can take.
- coefficient_subselection can now be an empty dict (which is now its default), because deleting it from the hypers was inconsistent with how we treat optimization and cutoff_function_parameters.
- Passing a global_list that is not a list now raises an error. Before, it was always converted to a list, even when it was not an int, which caused errors on the cpp side.
- Fixed update_hyperparameters.

Supplementary

To transform the ubjson reference data (in the cpp tests) I used the following script, which leaves the reference results untouched:

import json
import ubjson

for filename in ['spherical_expansion_reference', 'spherical_invariants_reference', 'spherical_covariants_reference']:
    with open(filename + '.json') as f:
        data = json.load(f)
    for i in range(len(data['rep_info'])):
        for j in range(len(data['rep_info'][i])):
            # hypers refers to the dict inside data, so the edits below modify data in place
            hypers = data['rep_info'][i][j]['hypers']

            # cutoff_function: flatten the nested block into top-level keys
            if 'cutoff_function' in hypers:
                cutoff_function = hypers.pop('cutoff_function')
                hypers['interaction_cutoff'] = cutoff_function['cutoff']['value']
                hypers['cutoff_smooth_width'] = cutoff_function['smooth_width']['value']
                hypers['cutoff_function_type'] = cutoff_function['type']
                if cutoff_function['type'] == "RadialScaling":
                    hypers['cutoff_function_parameters'] = {key: cutoff_function[key]['value'] for key in ['exponent', 'scale', 'rate']}

            # gaussian_density
            if 'gaussian_density' in hypers:
                gaussian_density = hypers.pop('gaussian_density')
                hypers['gaussian_sigma_constant'] = gaussian_density['gaussian_sigma']['value']
                hypers['gaussian_sigma_type'] = gaussian_density['type']

            # radial_contribution
            if 'radial_contribution' in hypers:
                radial_contribution = hypers.pop('radial_contribution')
                hypers['radial_basis'] = radial_contribution['type']
                if 'optimization' in radial_contribution:
                    hypers['optimization'] = radial_contribution['optimization']

            # default parameters which have changed
            if 'compute_gradients' not in hypers:
                hypers['compute_gradients'] = False
            if 'expansion_by_species_method' not in hypers:
                hypers['expansion_by_species_method'] = "environment wise"

    # write the converted data back out in ubjson format
    with open(filename + '.ubjson', "wb") as f:
        ubjson.dump(data, f)

The same was done for kernel_reference.ubjson, for the hypers in data['rep_info']['spherical_invariants'][i][j]['hypers_rep'].
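Since the same key mapping is applied to two differently shaped reference files, it could be factored into one function. The following is a sketch only: `flatten_cpp_hypers` and `convert_kernel_reference` are hypothetical names, the sample values are made up, and it is an assumption that the changed defaults were also filled in for the kernel reference.

```python
def flatten_cpp_hypers(hypers):
    """Return a copy of hypers with the old nested blocks flattened."""
    new = dict(hypers)  # shallow copy, the input dict is left untouched

    cutoff = new.pop("cutoff_function", None)
    if cutoff is not None:
        new["interaction_cutoff"] = cutoff["cutoff"]["value"]
        new["cutoff_smooth_width"] = cutoff["smooth_width"]["value"]
        new["cutoff_function_type"] = cutoff["type"]
        if cutoff["type"] == "RadialScaling":
            new["cutoff_function_parameters"] = {
                key: cutoff[key]["value"] for key in ("exponent", "scale", "rate")
            }

    density = new.pop("gaussian_density", None)
    if density is not None:
        new["gaussian_sigma_constant"] = density["gaussian_sigma"]["value"]
        new["gaussian_sigma_type"] = density["type"]

    radial = new.pop("radial_contribution", None)
    if radial is not None:
        new["radial_basis"] = radial["type"]
        if "optimization" in radial:
            new["optimization"] = radial["optimization"]

    # Defaults that changed; assumed to apply to the kernel reference too.
    new.setdefault("compute_gradients", False)
    new.setdefault("expansion_by_species_method", "environment wise")
    return new


def convert_kernel_reference(data):
    """Apply the flattening to every 'hypers_rep' entry, in place."""
    for row in data["rep_info"]["spherical_invariants"]:
        for entry in row:
            entry["hypers_rep"] = flatten_cpp_hypers(entry["hypers_rep"])


if __name__ == "__main__":
    # Made-up sample entry in the old nested layout.
    old = {
        "max_radial": 6,
        "cutoff_function": {
            "type": "ShiftedCosine",
            "cutoff": {"value": 3.0, "unit": "AA"},
            "smooth_width": {"value": 0.5, "unit": "AA"},
        },
        "radial_contribution": {"type": "GTO"},
    }
    print(flatten_cpp_hypers(old))
```

Returning a copy instead of editing in place keeps the function easy to test on small dicts; the in-place update is confined to `convert_kernel_reference`.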

max-veit

Looks good; my only worry is the duplication of information between the C++ and Python representation info. If someone were to decide to edit this file (JSON is human-readable, after all), this might result in a Python object with inconsistent parameters in C++, which would be bad. One way to prevent this would be to have a consistency check (i.e. that the C++ parameters match what is constructed using the Python parameters) - although ideally, we wouldn't need to duplicate this information at all, probably by just storing the C++ representation and having the Python load/save functions work with this format.
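The consistency check suggested here could look roughly like the sketch below. It assumes both parameter sets are available as flat dictionaries with comparable keys; `find_hyper_mismatches` and the `"<missing>"` marker are hypothetical and not part of rascal.

```python
MISSING = "<missing>"  # marker for a key that exists on one side only


def find_hyper_mismatches(python_hypers, cpp_hypers):
    """Return {key: (python_value, cpp_value)} for every key that differs."""
    mismatches = {}
    for key in sorted(set(python_hypers) | set(cpp_hypers)):
        python_value = python_hypers.get(key, MISSING)
        cpp_value = cpp_hypers.get(key, MISSING)
        if python_value != cpp_value:
            mismatches[key] = (python_value, cpp_value)
    return mismatches


def check_hypers_consistent(python_hypers, cpp_hypers):
    """Raise a ValueError that names every inconsistent key."""
    mismatches = find_hyper_mismatches(python_hypers, cpp_hypers)
    if mismatches:
        details = ", ".join(
            f"{key}: python={py!r}, cpp={cpp!r}"
            for key, (py, cpp) in mismatches.items()
        )
        raise ValueError(f"Inconsistent hyperparameters: {details}")
```

A check like this would catch a hand-edited json file at load time, although it does not remove the duplication itself.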

max-veit · 2021-11-25T17:30:20Z

bindings/rascal/models/kernels.py

+        if "cpp_kernel" in data.keys():
+            self._kernel = self._kernel.from_dict(data["cpp_kernel"])
+        else:
+            print(


Consider replacing with LOGGER.warn() -- NB needs the following near the imports:

import logging
LOGGER = logging.getLogger(__name__)

(see e.g. bindings/rascal/utils/filter.py for an example of how this can be done)

added LOGGER
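For reference, a minimal sketch of the module-level logger pattern discussed in this thread. The function `get_cpp_kernel_data` and the warning text are made up for illustration, since the diff above is truncated and does not show the real message. The sketch uses `Logger.warning()`, because `Logger.warn()` is a deprecated alias of it.

```python
import logging

# Module-level logger, named after the module that defines it.
LOGGER = logging.getLogger(__name__)


def get_cpp_kernel_data(data):
    """Return data["cpp_kernel"] if present, else log a warning and return None."""
    if "cpp_kernel" in data:
        return data["cpp_kernel"]
    # Hypothetical message; the original print() call is not shown in the diff.
    LOGGER.warning("No 'cpp_kernel' entry found in the model data.")
    return None
```

Unlike `print()`, the warning can be filtered, redirected, or silenced by whoever configures logging for the application.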

max-veit · 2021-11-25T17:31:16Z

reference_data/tests_only/simple_gap_model.json

@@ -5236,4 +5332,4 @@
  },
  "module_name": "rascal.models.krr",
  "version": "0.1"
-}
+}


Suggested change

}

}

(git doesn't like it when files end without newlines)

max-veit · 2021-11-25T17:36:03Z

reference_data/tests_only/simple_gap_model.json

@@ -5119,7 +5119,52 @@
      "init_params": {
        "representation": {
          "class_name": "SphericalInvariants",
-          "data": {},
+          "data": {


This duplicates all the information already contained below (in "init_params"), which is not ideal

issue has been solved by adapting cpp input parameters to python input parameters

agoscinski · 2021-12-06T16:13:48Z

See the updated PR summary at the top.

agoscinski requested a review from max-veit August 24, 2021 12:04

agoscinski mentioned this pull request Sep 1, 2021

Changes for lammps integration #367

Draft

agoscinski requested review from max-veit and removed request for max-veit October 20, 2021 15:10

max-veit reviewed Nov 25, 2021

agoscinski added 13 commits December 1, 2021 12:39

add cpp information to rascal model json

2a879b7

* cpp representation and kernel information is now included in rascal's model json file.
* weights are now always flattened to one dimension for consistency in the models format.

update rascal models json in tests

32a522f

change warning to logger

ee6fc99

adapt cpp hypers to python hypers for invariants and expansion coeffs…

26456d7

…; make gaussian_sigma_type Constant default parameter

debug tests

fa53a04

finished debugging

fb387c2

adapt tests and inputs

ecb70b8

adapt python classes; empty lists and dict is now properly handled on…

550b75a

… cpp side for global_species coefficient_subselection optimization

cleaning commented code

f4dcde9

fix global species

9061314

reset simple_gap_model json

ad67a18

order representation input parameters the same way

07e8ecb

fix typo

fec2c0c

agoscinski force-pushed the feat/cpp-objects-to-rascal-model branch from 50c0ccf to fec2c0c Compare December 1, 2021 11:39

agoscinski mentioned this pull request Dec 6, 2021

Package name rascal is already taken on PyPI :( #362

Open

agoscinski added 3 commits December 6, 2021 17:00

adapt changed cpp hyperparamaters in benchmarks

6213b64

improved read_hyperparameter documentation and parameter names`

4d25fa2

fix undef SOURCE_PATH_SIZE for clang compilers

93da3ce

agoscinski mentioned this pull request Dec 6, 2021

Segfault with bad values of max_radial/max_angular #393

Open

agoscinski added 2 commits December 6, 2021 17:17

remove remains from first approach

b9f5dec

add new line fo simple gap model

0db77ae

agoscinski changed the title ~~Add cpp information to rascal model json~~ Adapt cpp representation input parameters to the ones on the python side Dec 6, 2021

agoscinski requested a review from max-veit December 6, 2021 17:15

agoscinski mentioned this pull request Jun 14, 2022

Adapt cpp representation input paramaters compatible with the current lammps-rascal interface #414

Merged
