Spike: Enable access to OmegaConfigLoader DictConfig instead of dict only #2973

Open
Galileo-Galilei opened this issue Aug 24, 2023 · 4 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@Galileo-Galilei
Member

Galileo-Galilei commented Aug 24, 2023

Description

As a plugin developer, I have several use cases for which I want control over configuration at runtime. With the new OmegaConfigLoader, configuration is available in context.config_loader["catalog"], which returns a dictionary with all keys already resolved.

I'd like to access the "raw" configuration (with interpolated variables not resolved yet) and not the fully resolved one. I could reload the YAML file, but Kedro performs hidden operations to merge environments in load_and_merge_dir_config that I don't want to reproduce, because doing so is hard to maintain and error-prone.

Context

One use case I have is to identify the "globals" variables used for a specific pipeline in order to log them in mlflow (I don't want to log the whole "globals.yml" file, to avoid polluting my mlflow with globals unrelated to my pipeline). To do this, I'd like to check in the "raw" configuration which keys will be resolved with the globals resolver.
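As a rough illustration of this use case, the sketch below scans an unresolved nested mapping for `${globals:...}` references. It assumes the raw mapping is already available (for instance from `OmegaConf.to_container(cfg, resolve=False)` on a DictConfig); the helper names `iter_strings` and `referenced_globals`, the regular expression, and the catalog entries are all hypothetical and not part of Kedro.

```python
import re
from typing import Any, Iterator, Set

# Matches "${globals:some.key}" or "${globals:some.key, default}" and captures the key.
GLOBALS_REF = re.compile(r"\$\{\s*globals\s*:\s*([^,}\s]+)")


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string leaf of a nested dict/list structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from iter_strings(item)


def referenced_globals(raw_config: Any) -> Set[str]:
    """Collect the globals keys referenced anywhere in an unresolved config."""
    found: Set[str] = set()
    for text in iter_strings(raw_config):
        found.update(GLOBALS_REF.findall(text))
    return found


# Illustrative raw catalog: the interpolation strings are still unresolved.
raw_catalog = {
    "companies": {
        "type": "pandas.CSVDataset",
        "filepath": "${globals:paths.raw}/companies.csv",
    },
    "model": {
        "type": "pickle.PickleDataset",
        "filepath": "${globals:paths.models}/model_${globals:model.version}.pkl",
    },
    "report": {"type": "text.TextDataset", "filepath": "data/08_reporting/report.txt"},
}

used = referenced_globals(raw_catalog)
print(sorted(used))
```

Only the keys in `used` would then need to be logged, rather than the whole globals file.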

OmegaConf stores configuration in a special object called DictConfig. This object has several helper methods, in particular to traverse the tree structure, to check whether a key is interpolated, and so on. I'd like to access this object directly to benefit from them.

Possible Implementation

The DictConfig object is propagated all the way through load_and_merge_dir_config and is converted back to a dictionary in the return statement with to_container. It is converted on the fly in the __getitem__ method.

I think a potential implementation (this is a simplified sketch of what it could look like, and I don't have all the implications in mind) could be:

  • Add a load_dict_config method (name TBD) to OmegaConfigLoader that basically implements the current __getitem__ method without converting the config back to a container
  • Simplify __getitem__ to only perform the conversion, e.g. something like:
def __getitem__(self, key):
    return OmegaConf.to_container(self.load_dict_config(key), resolve=True)

Is this something you might consider of interest, or have already considered?

@Galileo-Galilei Galileo-Galilei added the Issue: Feature Request New feature or improvement to existing feature label Aug 24, 2023
@Galileo-Galilei Galileo-Galilei changed the title from "Make ConfigLoader return DictConfig instead of dict only" to "Enable access to OmegaConfigLoader DictConfig instead of dict only" Aug 24, 2023
@merelcht
Member

Thanks for raising this request @Galileo-Galilei ! I definitely see the value in adding this. We are currently focussing on getting our next breaking release 0.19.0 out, so we'll prioritise this for after the release. To manage expectations, we're aiming to release 0.19.0 before the end of the year. Of course, we'd be more than happy to accept a contribution for this feature if you want it sooner 😄

@datajoely
Contributor

Just from the various issues I'm subscribed to - it feels like this is something we need to prioritise

@astrojuanlu
Member

Moving this back to our Inbox so we can discuss it and prioritise it again.

@merelcht merelcht changed the title from "Enable access to OmegaConfigLoader DictConfig instead of dict only" to "Spike: Enable access to OmegaConfigLoader DictConfig instead of dict only" May 20, 2024
@astrojuanlu
Member

astrojuanlu commented Dec 23, 2024

What could be a huge improvement, at least in my case, is to keep using OmegaConf objects in the rest of the kedro project (as opposed to dicts). This will probably be a major backend change but you would then postpone to_container calls as long as possible (and in our case skipping many as we only use a portion of the catalog on every kedro run)

by @MatthiasRoels
