Multiple model cases/ensembles #148
-
Hi @bitterbark, considering multiple model cases is a thornier issue because it breaks the framework's current design assumptions: the framework is essentially called separately for each case (since we need to do data search and preprocessing for each case before the PODs are run), and each run of a POD is independent (PODs don't persist data between invocations).
-
Great idea on the optional ensemble member numbers. That should be easy for me to implement. As for model cases, the incoming AMOC POD has an easy way of dealing with them. It runs independently over each model case, writing into a unique working directory, and then checks whether other cases are available to plot. If so, it produces a 'summary' html file in addition to the ones for the individual cases: http://www.cgd.ucar.edu/cms/bundy/Projects/diagnostics/mdtf/mdtf_figures/AMOC/MDTF_GFDL-CM2p1/AMOC_3D_Structure/AMOC_3D_Structure_Summary.html
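To make the per-case-then-summary pattern concrete, here is a minimal sketch. It is not the AMOC POD's actual code: the function names, the `MDTF_<case>` directory layout, and the file names `index.html` and `summary.html` are all assumptions for illustration.

```python
from pathlib import Path
from typing import Optional


def run_case(case_name: str, root: Path) -> Path:
    """Stand-in for one independent POD run: write a page into a per-case directory."""
    case_dir = root / f"MDTF_{case_name}"  # hypothetical directory naming
    case_dir.mkdir(parents=True, exist_ok=True)
    page = case_dir / "index.html"
    page.write_text(f"<html><body><h1>{case_name}</h1></body></html>")
    return page


def write_summary(root: Path) -> Optional[Path]:
    """Write a summary page only if pages from at least two cases are present."""
    pages = sorted(root.glob("MDTF_*/index.html"))
    if len(pages) < 2:
        return None
    links = "\n".join(
        f'<li><a href="{p.relative_to(root).as_posix()}">{p.parent.name}</a></li>'
        for p in pages
    )
    summary = root / "summary.html"
    summary.write_text(f"<html><body><ul>\n{links}\n</ul></body></html>")
    return summary
```

Each call to `run_case` touches only its own directory, so runs stay independent; the summary step just looks at what is already on disk.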
-
I don't think we should specify ensemble members; I think it should be more flexible. Dani and I were talking, and I think we should just have the case names as an array rather than a single value. PODs that are built to take a single value would just get array element zero, while the others can get the complete array. This is how we have done things in the past, and I think it is an easy path forward. The array could then hold an ensemble or different cases. There may also be other ways to specify ensemble members (Dani and I discussed this). We could also use this as a path forward to allow existing PODs to loop over these names, produce plots, and arrange them on a multi-case web page. Thoughts?
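A minimal sketch of the "array of case names" idea, assuming a hypothetical per-POD flag `multi_case` that says whether the POD can take the whole array (neither the flag nor the function name exists in the framework):

```python
from typing import List, Sequence


def cases_for_pod(case_names: Sequence[str], multi_case: bool) -> List[str]:
    """Return the case names a POD should see.

    Single-case PODs get only element zero; multi-case PODs get the full array.
    """
    if not case_names:
        raise ValueError("at least one case name is required")
    return list(case_names) if multi_case else [case_names[0]]
```

For example, `cases_for_pod(["run1", "run2"], multi_case=False)` gives a one-element list, so an existing single-case POD would not need to change.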
-
Regarding ensembles, assuming we have a list of multiple cases, we could have an optional attribute ensemble_name = string. Any POD that handles ensembles can check this and group cases together accordingly. Any other POD can just treat them as individual cases. I can implement this in Rich Neale's blocking POD as an example of how to do it (with the group's approval on methodology).
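As a rough sketch of how an ensemble-aware POD might use the optional attribute: cases are represented here as plain dicts with a `name` key and an optional `ensemble_name` key. The dict layout and the function name are assumptions for illustration, not an agreed format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_cases(cases: List[dict]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split cases into ensembles (keyed by ensemble_name) and stand-alone cases."""
    ensembles = defaultdict(list)
    singles = []
    for case in cases:
        ens = case.get("ensemble_name")  # optional attribute; may be absent
        if ens:
            ensembles[ens].append(case["name"])
        else:
            singles.append(case["name"])
    return dict(ensembles), singles
```

A POD that does not handle ensembles would simply ignore `ensemble_name` and iterate over every case.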
-
Comments from J. David Neelin: Path to multiple runs
-
I'm advancing the following proposal as a starting point for further discussion, since I think it offers a concrete way to implement the desired functionality.
Below is an ASCII flowchart describing how execution of two PODs (A and B) on two model runs (1 and 2) would look under this proposal. Every task (piece of text) can execute independently of all others, provided its upstream dependencies have finished.
This is the simplest way I can come up with to accomplish everything we want to do with regard to multiple runs, but as you can see it still adds a lot of complexity to the package's execution. For this reason, I'd advocate first adapting the package to use a third-party embeddable data pipeline tool, such as luigi -- see related remarks here.
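The "run any task whose upstream dependencies have finished" rule can be sketched with the standard library's `graphlib` (Python 3.9+) rather than luigi. The task names and the dependency shape below (preprocess each run, run each POD on each run, then one comparison step per POD) are assumptions for illustration and are not taken from the flowchart.

```python
from graphlib import TopologicalSorter


def build_graph(pods, runs):
    """Map each task name to the set of tasks it depends on (assumed structure)."""
    graph = {}
    for run in runs:
        graph[f"preprocess:{run}"] = set()
    for pod in pods:
        for run in runs:
            graph[f"pod{pod}:{run}"] = {f"preprocess:{run}"}
        # Hypothetical multi-run step that needs this POD's output from every run.
        graph[f"compare:{pod}"] = {f"pod{pod}:{run}" for run in runs}
    return graph


def execution_waves(graph):
    """Group tasks into waves; all tasks within a wave could run concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Calling `execution_waves(build_graph(["A", "B"], [1, 2]))` yields three waves under these assumptions: preprocessing, the four POD runs, then the two comparison steps. A pipeline tool would add scheduling, retries, and persistence on top of this ordering.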
-
Comments made during the mini-leads meeting on 29 March:
-
Sorry I missed the call today. One other thing we have done is to let the 'baseline' or reference run also be labeled 'observations', in which case it holds a collection of observations for differencing (we did this, for example, with ERA-Interim reanalysis data).
-
Discussion at mini-Leads Meeting, Oct 13: Allison Wing
Paul Ullrich: There's a hierarchy here: (1) PODs that only compare model to obs, (2) PODs that can run datasets in an embarrassingly parallel manner, (3) PODs that internally manage delegation.
John Krasting: I support getting requirements from POD developers, but that means we really need them to get cracking early on their PODs. We can't wait until the last year of the phase for them to complete their PODs and say "oh, by the way, we need the framework to do X, Y, and Z ..."
Paul Ullrich John Krasting Andrew Gettelman:
-
Thanks @wrongkindofdoctor. Happy to help with design: I agree with @jkrasting's idea about a simple test case. The test case should not be entangled right now with more complex ideas for existing PODs; let's start with getting the framework to do something very (even trivially) simple. The CMIP6 example is a good one: just plot surface temperature (could even be zonal means, all on one plot) from a few CMIP6 models and don't break anything else. My 2 cents.
-
Summarizing a few items from above:
Example from Allison, consistent with Fiaz' POD running on CMIP6
From Paul mini-Leads October 13, hierarchy:
-
For info: based on the group input here and in meetings, here is the model we will work on for the prototype. Feedback will be requested after basic prototyping is complete. Model 2 provides the capability for the MDTF framework to pass input specifications that may span more than one input source (e.g. multiple ensemble members of an experiment) directly to a POD. In this case, the POD internally handles the processing of diagnostics from multiple experiments in its own fashion (serial or parallel). The POD handles the compilation and generation of output to a web page. At the end of the results generation phase, the MDTF framework presents the final webpage with links to each POD's individual webpage. The input settings will use the existing structure, with new case descriptions added to the case_list. More info and other use cases will be discussed after the prototype.
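A rough sketch of the Model 2 flow, to make the hand-off concrete. Only `case_list` comes from the description above; the other settings keys (`pod_list`, `CASENAME`), the POD name, and both functions are illustrative assumptions, and `run_pod` is a stand-in for a real POD.

```python
import json

# Hypothetical settings: the existing structure plus extra entries in case_list.
SETTINGS = json.loads("""
{
  "pod_list": ["example_multicase"],
  "case_list": [
    {"CASENAME": "exp1_member1"},
    {"CASENAME": "exp1_member2"}
  ]
}
""")


def run_pod(pod_name, cases):
    """Stand-in POD: receives every case at once and returns (page path, label)."""
    names = ", ".join(case["CASENAME"] for case in cases)
    return f"{pod_name}/index.html", f"{pod_name}: {names}"


def build_index(settings):
    """Framework side: pass the full case_list to each POD, then link its page."""
    links = []
    for pod in settings["pod_list"]:
        page, label = run_pod(pod, settings["case_list"])
        links.append(f'<a href="{page}">{label}</a>')
    return "\n".join(links)
```

The point of the sketch is the division of labor: the framework forwards all case descriptions untouched, and how they are processed (serially or in parallel) stays inside the POD.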
-
From @bitterbark:
"The blocking pod that I'm working on has the capability of comparing not only multiple model cases but also multiple ensembles, showing the mean and the spread. Since the MDTF does not currently have that capability, I plan to implement it in a variation that only reads one model case. However, I would like to leave the other capabilities available for when the MDTF can do it. Can you tell me if you have a design in mind for including more cases, and/or ensembles, so I can move in that direction?"