making PrePARE work with arbitrary CV #685
Comments
@larsbuntemeyer Could you provide some CORDEX dataset files that I could try to debug PrePARE with? Thank you. |
CMOR doesn't seem to use the values in the
Regarding the parts where I think CMOR/PrePARE should be checking
@durack1 obs4MIPs and input4MIPs don't have parent and sub experiments. How do they use CMOR/PrePARE?
I agree with using command line arguments with PrePARE to get the right files for the CV. |
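A minimal sketch of what selecting the table files through command line arguments could look like. The flag names (--table-path, --table-prefix, --cv-file), the helper functions, and the <prefix>_CV.json naming are all assumptions made for illustration; they are not existing PrePARE options.

```python
import argparse
import os


def build_parser():
    """Parser with hypothetical options for choosing the table files."""
    parser = argparse.ArgumentParser(
        description="Sketch: select CV and auxiliary tables explicitly.")
    parser.add_argument("--table-path", default="Tables",
                        help="directory holding the JSON tables")
    parser.add_argument("--table-prefix", default="CMIP6",
                        help="project prefix used to derive default file names")
    parser.add_argument("--cv-file", default=None,
                        help="explicit CV file; overrides the derived default")
    parser.add_argument("infile", help="NetCDF file to check")
    return parser


def resolve_cv_file(args):
    """Return the explicit CV file, or <table-path>/<prefix>_CV.json (assumed naming)."""
    if args.cv_file is not None:
        return args.cv_file
    return os.path.join(args.table_path, args.table_prefix + "_CV.json")
```

With something along these lines, a CORDEX user could point the checker at their own CV without the tool having to guess which project the file belongs to.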
Hi @mauzey1! Thanks for picking this up again, and sorry for being unresponsive! I'll provide some CORDEX datasets soon!
Yes, that's confusing. It uses file and path templates from the dataset input table, not from the CV. On the other hand,
I think that is achieved by the |
I am a little bit tempted to implement a backend CV check module purely in Python that could be used by PrePARE. JSON handling and NetCDF attribute handling have become really easy and flexible in Python. |
@larsbuntemeyer That is actually an idea I have been contemplating. I was thinking of making one based on xarray. We could discuss that more on the discussion page.
As for the current issue with PrePARE, I wonder how we could tell if we are using a CV other than the CMIP6 CV. I guess we could grab the prefix of the CV file, assuming that the name of the CV file will always have the format
@durack1 @matthew-mizielinski Do you know of any MIPs that use parent and sub experiments other than CMIP6? |
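A rough sketch of what a pure-Python CV check could look like. It assumes the CV JSON has a top-level "CV" key whose entries are either a dict keyed by allowed value or a plain list; the function name check_attributes and the sample data are hypothetical.

```python
import json


def check_attributes(attrs, cv, names=("source_id", "experiment_id")):
    """Return error messages for global attributes not allowed by the CV."""
    errors = []
    for name in names:
        if name not in cv:
            continue  # this CV does not constrain the attribute
        value = attrs.get(name)
        if value is None:
            errors.append(f"missing global attribute '{name}'")
            continue
        allowed = cv[name]
        # Only dict-keyed or list-valued CV entries are handled in this sketch.
        if isinstance(allowed, (dict, list)) and value not in allowed:
            errors.append(f"{name}='{value}' is not in the CV")
    return errors


# Hypothetical CV content and file attributes, for illustration only.
cv = json.loads('{"CV": {"source_id": {"MODEL-A": {}}, "experiment_id": ["historical"]}}')["CV"]
attrs = {"source_id": "MODEL-B", "experiment_id": "historical"}
print(check_attributes(attrs, cv))
```

The attrs dictionary would come from the file's global attributes; with xarray that could be the attrs of the opened dataset. Because the CV without a given entry simply skips that check, the same function would work for projects that do not define parent or sub experiments.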
You folks are starting to get aligned with what we've been discussing. We are thinking of following a structure similar to
If we follow this structure, then the
Note, we are trying to glean as much information about CMOR usage in the draft CMOR Users Survey, so if we have more specific info we want to collect from the community as part of planning, then feel free to suggest augmentations/edits to the questions we have drafted |
@mauzey1 You can now download current CORDEX example datasets from artifacts, e.g., here: https://github.com/WCRP-CORDEX/cordex-cmip6-cmor-tables/actions/runs/4043896243 |
Parent information: yes, particularly where they are extending CMIP6, e.g. run ssp245 with sulphate injections to keep global
sub-experiment id: yes, where there are initialised experiments such as those for the decadal/seasonal forecasting area. |
@matthew-mizielinski : Aren't those considered CMIP6 runs, even if they are not the DECK/historical runs? |
Sorry, I should have put examples. Both of the following are outside of CMIP6;
|
I'm currently making changes to PrePARE to make its checks more like those done when cmorizing datasets. Below is the relevant section of cmor.c that does those checks. Lines 3038 to 3062 in 304db22
The check for source ID is done if the |
That would also impact the cmorization itself, not only PrePARE, right? That would work for CORDEX with CMIP6. But I know that people are also using the old CVs with cmor3; in those, there is no |
@durack1 Why is CMOR checking if the Lines 3055 to 3062 in 304db22
|
@mauzey1 good question, the |
Getting back into working on this issue, I've dug deeper into how the Lines 373 to 377 in 8d45339
cmor_current_dataset.furtherinfourl was initialized with the default value for the URL template. Line 266 in 8d45339
cmor_current_dataset.furtherinfourl would either use the value of further_info_url stored in the CV file or in the user input if CMOR was being used to write data. However, PrePARE doesn't do this so it will just use the default template.
Another issue that I have noticed is the difference between the default value of the further info URL's template, and the value stored in the
Although this appears to be a regex for matching any URL that begins with Lines 379 to 385 in 8d45339
So even if PrePARE got the value for further_info_url from the CV, it would just be ignoring the URL if the value was not a template that had "<>" tokens like the default in cmor.h.
The check for the further info URL is pretty much hardwired for CMIP6, or at least any project that used the same template for further info URLs. Should we place the further info URL check within the group of CMIP6 checks, which can be disabled by the PrePARE user? In CMOR, the Lines 3055 to 3062 in 8d45339
This would mean that the checks for further info URLs and source_ids would be performed unless the user set further_info_url to be an empty string. I can understand doing this for the URL but not the source_id.
Shouldn't |
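To make the template issue concrete, here is a small sketch of a further_info_url check that only compares against templates containing "<>" tokens. The joining of adjacent tokens with a "." separator, the function names, and the example template are assumptions for illustration; this is not a transcription of the CMOR code.

```python
import re

TOKEN = re.compile(r"<([^<>]+)>")


def has_tokens(template):
    """True if the template contains at least one <token>."""
    return TOKEN.search(template) is not None


def expand_template(template, attrs, sep="."):
    """Replace each <token> with attrs[token]; adjacent tokens are joined by sep."""
    joined = template.replace("><", ">" + sep + "<")
    # A missing attribute raises KeyError, which the caller can report.
    return TOKEN.sub(lambda m: str(attrs[m.group(1)]), joined)


def check_further_info_url(url, template, attrs):
    """Return True/False for token templates, None when the check is skipped."""
    if not has_tokens(template):
        return None  # e.g. a regex-like CV value: nothing to expand
    return url == expand_template(template, attrs)


# Hypothetical template and attributes.
template = "https://example.org/<mip_era><source_id>"
attrs = {"mip_era": "CMIP6", "source_id": "MODEL-A"}
print(expand_template(template, attrs))
```

Returning None for a template without tokens makes the silent skip visible, so the caller can decide whether to warn, fall back to a regex match, or treat the check as disabled.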
@mauzey1 thanks for continuing to chip away at these issues. |
And this is exactly correct. The way that we have things set up, if we don't have a source_id defined (in CVs) that matches the dataset being written, then the program should throw a warning at minimum. @taylor13 @matthew-mizielinski can confirm their agreement |
Here are some offline comments that I'm including in this thread. I hope it's the right thread.
The global attributes specifications for CMIP6 included:
The PrePARE default value is consistent with this BUT DOESN’T LOOK LIKE IT INCLUDES THE DOT (PERIOD) SEPARATORS:
I agree with Paul that we want to check source_id independent of further_info_url.
Reviewing some of the coding, it seems to me it is going to be difficult to generalize it cleanly to handle other projects (obs4MIPs, input4MIPs, etc.). I have spent some time trying to write down what we would like to check with PrePARE and how we might code this in a general way to ingest simply formatted guidance from an input “configuration” file, which would specify which attributes to check and how to check them. My ideas are contained in this document: https://docs.google.com/document/d/1JZcVRo2GucGUoZPeUNrmHmtB8LAs0QBAWiDuf0n3lSM/edit. You might also be interested in an earlier document I put together when PrePARE was originally being developed: https://docs.google.com/document/d/1d4_wdaY52xhLZBTeiTu2ZlpmpTO2Luqe47m2IX1WkWg .
I think it is worth considering how difficult it would be to implement the above from "scratch" (as opposed to modifying the current PrePARE code). I think it might be easier to start over, and it might take less time than trying to clean up and generalize the existing coding. Perhaps we can all get together and you can propose what you think would be the best way to proceed. |
@mauzey1 it would be useful to update PrePARE alongside the last planned release of CMOR3 to enable consistency checking as much as possible - let's revisit this issue as the final changes for 3.9.0 are identified to see what can be achieved within available time/resources |
There is a lot of stuff considered above. I'm not sure we should spend resources generalizing PrePARE for the 3.9.0 release. We might want to wait on this for integration into the CMOR 4.0 development. @mauzey1, you might be in the best position to estimate how big the job is. What's your advice on this? |
@taylor13 I agree with this. If we have a "final" CV template that we can build PrePARE4 (alongside CMOR4) to work with, then I suggest that's the best path forward, rather than attempting to wrangle two codebases (C - CMOR, and Python - PrePARE) together. @mauzey1 if you agree with this, we can tag this against the 4.0 milestone |
@durack1 Yes, I agree with your suggestion of making this a CMOR4 milestone. |
I would like to open this issue to collect some insights I drew from playing around with PrePARE using our preliminary CORDEX cmor tables. Although I can use cmor to rewrite data for projects without _cmip6_option="CMIP6", I cannot use PrePARE to evaluate files due to the following:
PrePARE always uses CMOR_DEFAULT_FILE_TEMPLATE to evaluate the file name: cmor/LibCV/PrePARE/PrePARE.py, lines 262 to 264 in 5f8759e
cmor_CV_checkFilename work with a DRS from the CV? Originally the file_name_template was defined in the input dataset attributes table but cannot be recovered from the file's global attributes. So in general, I guess it would be nice to have the DRS available from the CV table instead of the input file (also for the cmorization itself, probably?). I am not sure what the purpose of the DRS entry in the CV file is after all...
PrePARE does and the checks during cmorization are not totally consistent if files are cmorized without the _cmip6_option. For example, those checks in cmor/Src/cmor.c, lines 3043 to 3050 in 5f8759e
_cmip6_option which makes my CORDEX tables work with cmor. However, PrePARE does not know about this and especially those checks in cmor/LibCV/PrePARE/PrePARE.py, lines 491 to 492 in 5f8759e, and cmor/LibCV/PrePARE/PrePARE.py, lines 510 to 511 in 5f8759e
PrePARE assumes some hard-coded rules for finding the CV, coordinates, grid tables, etc., e.g., in cmor/LibCV/PrePARE/PrePARE.py, lines 256 to 258 in 5f8759e
It seems that PrePARE is tightly coupled to the CMIP6 vocabulary with _cmip6_option="CMIP6", although I could make it run with CORDEX tables using a few hacks, and I wanted to document some of the pitfalls here. This could probably be fixed quite easily using something like a cmip6_option as a command line argument for PrePARE. However, it still does not feel right... This is probably related to:
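One way a file name check could use a DRS template from the CV instead of a hard-coded default is sketched below. It assumes the template uses <token> placeholders with optional parts in square brackets and that token values contain no underscores; the function names and the sample template are hypothetical.

```python
import re


def template_to_regex(template):
    """Turn '<a>_<b>[_<c>].nc' into a compiled regex with named groups."""
    pattern, pos = "", 0
    for m in re.finditer(r"<\s*(\w+)\s*>|\[|\]", template):
        pattern += re.escape(template[pos:m.start()])  # literal text between tokens
        if m.group(0) == "[":
            pattern += "(?:"    # start of an optional part
        elif m.group(0) == "]":
            pattern += ")?"     # end of an optional part
        else:
            pattern += f"(?P<{m.group(1)}>[^_]+)"
        pos = m.end()
    pattern += re.escape(template[pos:])
    return re.compile("^" + pattern + "$")


def check_filename(filename, template, attrs):
    """Compare file name components with the matching global attributes."""
    match = template_to_regex(template).match(filename)
    if match is None:
        return [f"file name '{filename}' does not match the template"]
    errors = []
    for token, value in match.groupdict().items():
        if value is not None and token in attrs and attrs[token] != value:
            errors.append(f"{token}: file name has '{value}', attribute is '{attrs[token]}'")
    return errors


# Hypothetical template, as it might be read from a DRS entry in the CV.
template = "<variable_id>_<table_id>_<source_id>[_<time_range>].nc"
attrs = {"variable_id": "tas", "table_id": "mon", "source_id": "MODEL-A"}
print(check_filename("tas_mon_MODEL-A_185001-201412.nc", template, attrs))
```

Tokens that have no corresponding global attribute, such as a time range, are matched structurally but not compared, so the same check can serve projects with different file name conventions.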