Multiple model cases/ensembles #148

tsjackson-noaa · 2020-10-07T16:13:35Z

tsjackson-noaa
Oct 7, 2020

From @bitterbark:
"The blocking pod that I'm working on has the capability of comparing not only multiple model cases but also multiple ensembles, showing the mean and the spread. Since the MDTF does not currently have that capability, I plan to implement it in a variation that only reads one model case. However, I would like to leave the other capabilities available for when the MDTF can do it. Can you tell me if you have a design in mind for including more cases, and/or ensembles, so I can move in that direction?"

tsjackson-noaa · 2020-10-07T16:30:45Z

tsjackson-noaa
Oct 7, 2020
Author

Hi @bitterbark ,
This is definitely something that should be on our feature roadmap, as you've brought it up previously in calls. At this point I can't commit to anything, but I think the most natural way to accommodate ensemble data would be to extend the framework's naming convention for local data: something like <casename>.<frequency>.<member number>.nc. When your POD looks for data files, if it doesn't find the ensemble member number in the filename, it could treat the data as an ensemble with one member and use the same code path.
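A minimal sketch of how a POD might parse such a filename, treating a missing member number as a one-member ensemble. The regular expression, the `DataFile` tuple and `parse_filename` are hypothetical names for illustration, not part of the framework; it also assumes the frequency label contains no dots and is not purely numeric.

```python
import re
from typing import NamedTuple


class DataFile(NamedTuple):
    case: str
    frequency: str
    member: int  # 1 when the filename carries no member number


# Hypothetical pattern for <casename>.<frequency>[.<member number>].nc
_PATTERN = re.compile(
    r"^(?P<case>.+?)\.(?P<freq>[^.]+)(?:\.(?P<member>\d+))?\.nc$"
)


def parse_filename(name: str) -> DataFile:
    """Split a local data filename into case, frequency and member number."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized data file name: {name!r}")
    # No member number in the name: treat the file as a one-member ensemble.
    member = int(match["member"]) if match["member"] else 1
    return DataFile(match["case"], match["freq"], member)
```

With this, `parse_filename("CESM.day.nc")` and `parse_filename("CESM.day.1.nc")` yield the same tuple, so a POD can use one code path for both.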

Considering multiple model cases is a thornier issue, as that breaks the framework's current design assumptions: namely, that the framework is essentially called separately for each case (since we need to do data search and preprocessing for each case before the PODs are run) and each run of a POD is independent (they don't persist data between invocations.)

0 replies

bitterbark · 2020-10-07T18:31:33Z

bitterbark
Oct 7, 2020
Collaborator

Great idea on the optional ensemble member numbers. That should be easy for me to implement.

As for model cases, the incoming AMOC POD has an easy way of dealing with it. It runs independently over each model case into a unique working directory, and then checks if there are other cases available to plot. If so, it produces a 'summary' html file in addition to the one for the individual cases: http://www.cgd.ucar.edu/cms/bundy/Projects/diagnostics/mdtf/mdtf_figures/AMOC/MDTF_GFDL-CM2p1/AMOC_3D_Structure/AMOC_3D_Structure_Summary.html
(Link fixed on edit)

0 replies

andrewgettelman · 2020-10-08T16:53:13Z

andrewgettelman
Oct 8, 2020
Collaborator

I don't think we should specify ensemble members. I think it should be more flexible. Dani and I were talking, and I think we should just have the case names as an array, not just a single value. PODs that are built that take a single value would just get array element zero, while the others can get the complete array.
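As a rough illustration of the array idea, here is a sketch in which single-case PODs receive only element zero and multi-case PODs receive the full list. The function name and the `multi_case` switch are assumptions made for this example, not existing framework options.

```python
from typing import List, Sequence


def cases_for_pod(case_names: Sequence[str], multi_case: bool) -> List[str]:
    """Return the case names a POD should see.

    PODs built for a single case get only element zero; PODs that
    understand several cases get the complete array.
    """
    if not case_names:
        raise ValueError("at least one case name is required")
    return list(case_names) if multi_case else [case_names[0]]
```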

This is how we have done things in the past, and I think is an easy path forward.

Then it could be an ensemble or different cases. There may also be other ways to specify ensemble members (Dani and I discussed this).

We could also use this as a path forward to allow existing PODs to loop over these names, produce plots, and arrange them on a web page covering multiple cases.

Thoughts?

0 replies

bitterbark · 2020-10-08T19:57:17Z

bitterbark
Oct 8, 2020
Collaborator

Regarding ensembles, assuming we have a list of multiple cases, we could have an optional attribute ensemble_name = string. Any pod that handles ensembles can check this, and group cases together accordingly. Any other pod can just treat them as individual cases. I can implement this in Rich Neale's blocking pod as an example of how to do it (with the group's approval on methodology).
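A sketch of the grouping step, assuming each case is a dict with a `CASENAME` key and an optional `ensemble_name` key (both key names are assumptions for illustration). Cases without the attribute form a group of their own, so PODs that ignore ensembles see no difference.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_ensemble(cases: List[dict]) -> Dict[str, List[str]]:
    """Group case names by their optional 'ensemble_name' attribute."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for case in cases:
        # A case with no ensemble_name is treated as a one-member ensemble
        # keyed by its own name.
        key = case.get("ensemble_name") or case["CASENAME"]
        groups[key].append(case["CASENAME"])
    return dict(groups)
```

An ensemble-aware POD could then compute the mean and spread within each group, while other PODs simply iterate over the individual cases.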

0 replies

tsjackson-noaa · 2021-03-22T23:18:46Z

tsjackson-noaa
Mar 22, 2021
Author

Comments from J. David Neelin:

Path to multiple runs

E.g., a handful of simulations with candidate model updates — simplest case: run the pod once for each and compare each to data, plotting the same figures as would be done for a single model run. Advantage: can be handled at the framework level. May require restructuring default_tests.jsonc, etc. e.g. allow for multiple case names.
But this may be suboptimal for comparing them to each other: permit PODs to have functionality that compares them directly and/or plots more than one candidate version on the same plot. E.g. Eric Maloney’s POD allows plotting results from different models on the same figure. A POD can do this if the framework informs it that there are multiple case names. May require a new directory hierarchy for framework output.
This can also be useful for parameter perturbation runs. Xianan Jiang could work on an example if this is reasonably straightforward. This is something we mentioned in the current phase proposal, but have not yet done. Simplest case of this: treat each parameter perturbation run as a separate model version. More complex cases — need to know the parameter perturbation for relations among the runs.
Do something simple now that would be forward consistent to proposed work in Phase 3?:

“software framework expansion for Model Output types is illustrated with initial-condition ensembles in Figure 2 ... As warranted by needs from centers and POD developers, this could also include parameter-perturbation ensembles or anthropogenic scenario runs, with conventions compatible with those used for the CMIP archive, with the associated meta-information available to the PODs. The leading emphasis remains process-oriented comparison of model simulations to observations, but experiments with a limited set of parameter perturbations are common in model development to test hypothesized physical mechanisms. Similarly, some Type I teams are likely to examine emergent constraints, ...”

0 replies

tsjackson-noaa · 2021-03-23T01:07:22Z

tsjackson-noaa
Mar 23, 2021
Author

I'm advancing the following proposal as a starting point for further discussion, since I think it offers a concrete way to implement the desired functionality.

To provide context: recall that the flow of data in the package is organized in a "fan-out, then fan-in" (or map-reduce) structure: after the framework does initial setup, execution splits off into the individual PODs, which all run in parallel. When they finish, execution returns to the single framework process, which collects the output.
The first part of this proposal is to repeat this structure at the level of experimental runs as well: the user would provide a list of runs, and the framework would (do setup/run PODs/collect output) for each run in parallel. This is already half-implemented (the "case_list" of model data specifications is processed as a list; we just currently only use its first element); the new element would be adding a "fan-in" stage at the end where output is collected from the different runs.
The second part of this proposal involves changing how the framework deals with the PODs' output. Currently this is a free-form black box: the POD supplies plots and a web page template, and as long as all of the links work the framework is happy. In order to compare results across runs, I think it's necessary to require more structure:
1. The basic unit of POD output would now be the plot, not the HTML page as a whole.
2. We would restrict the POD's HTML output to take the form of a 2-d table where the "rows" are the different plots/analyses done by the POD, and the "columns" are different model runs (potentially including the experimental/reference data). I'm proposing this because the HTML pages for all current PODs follow this pattern, except for MJO_teleconnection (ESM4 sample output, for reference).
3. PODs could still put arbitrary introductory and reference information into the "header" or "footer" of this table, but all the POD's output that depends on the model input would have to fit the one-column paradigm.
4. This would give the framework more responsibility in processing the HTML template, which would now take place at the "fan-in" step after all experiments in the case_list have been analyzed: it would add a column to the table for each experiment that was run.
5. As a side benefit, moving more of the responsibility for HTML generation into the framework would let us use web 2.0 controls such as thumbnail galleries or carousels to make comparing plots easier for the user (rather than the current method, where each plot is opened as an image file in a new tab).
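To make the table layout concrete, here is a minimal sketch of how the framework might assemble it at the fan-in step: rows are plots, columns are runs. The function name and the input layout (run name mapped to plot name mapped to image path) are assumptions for illustration only.

```python
from html import escape
from typing import Dict, List


def comparison_table(plots: List[str], runs: Dict[str, Dict[str, str]]) -> str:
    """Build an HTML table with one row per plot and one column per run.

    `runs` maps a run name to a {plot name: image path} dict.
    """
    header = "".join(f"<th>{escape(run)}</th>" for run in runs)
    lines = ["<table>", f"<tr><th>Plot</th>{header}</tr>"]
    for plot in plots:
        cells = []
        for images in runs.values():
            path = images.get(plot)
            if path:
                cells.append(f'<td><a href="{escape(path)}">plot</a></td>')
            else:
                # This run did not produce the plot.
                cells.append("<td>n/a</td>")
        lines.append(f"<tr><th>{escape(plot)}</th>{''.join(cells)}</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Adding another experiment then only means adding another entry to `runs`; the POD's header and footer text would be wrapped around the returned string.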
This doesn't address the important use case of doing calculations that compare POD results across different model runs (for example, computing the average of a POD's analysis across all members of an ensemble). To accomplish this, I propose splitting POD code into two pieces: one which does all comparison of a single model with obs data, and a second which does all the comparisons between different models. This is necessary because the two pieces would execute at different times (see flowchart below), with the input to the second piece being netCDF files of intermediate results saved from each run of the first piece (on each of the runs being analyzed).
This is a bit clunky, but the other ways I've come up with are clunkier. In particular, I think we need to avoid any scheme that involves persisting POD output across different runs of the package itself: that would make developing, testing and debugging much harder.

Below is an ASCII flowchart describing how execution of two PODs (A and B) on two model runs (1 and 2) would look under this proposal. Every task (piece of text) can execute independently of all others, provided its upstream dependencies have finished.

          ┌────► run ─────┐
          │     POD A     ▼
         init    on 1   output
  ┌───► case 1          case 1 ─────┐    ┌───► POD A' ──────┐
  │       │      run      ▲         ▼    │    compares      ▼
  │       └───► POD B ────┘         collect   cases 1,2   Final assembly
Fmwk             on 1                case                 of HTML output
init                                outputs    POD B'       ▲
  │       ┌────► run ─────┐         ▲    │    compares      │
  │       │     POD A     ▼         │    └───►cases 1,2─────┘
  │      init    on 2   output      │
  └───► case 2          case 2 ─────┘
          │      run      ▲
          └───► POD B ────┘
                 on 2

This is the simplest way I can come up with to accomplish everything we want to do with regards to multiple runs, but as you can see it still adds a lot of complexity to the package's execution. For this reason, I'd advocate first adapting the package to use a third-party embeddable data pipeline tool, such as luigi -- see related remarks here.
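For illustration only, the two-stage flow can be sketched with the standard library's `concurrent.futures` rather than luigi. Here the per-case piece of each POD and its comparison piece (A and A' in the flowchart) are plain callables, and in-memory dicts stand in for the intermediate netCDF files; all names are hypothetical, and every comparison POD is assumed to have a per-case counterpart with the same key.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

PerCase = Callable[[str], dict]                # one case vs. obs
CrossCase = Callable[[Dict[str, dict]], dict]  # compares several cases


def run_pipeline(
    cases: List[str],
    per_case_pods: Dict[str, PerCase],
    cross_case_pods: Dict[str, CrossCase],
) -> dict:
    with ThreadPoolExecutor() as pool:
        # Fan-out: every (POD, case) pair is an independent task.
        futures = {
            (pod, case): pool.submit(fn, case)
            for case in cases
            for pod, fn in per_case_pods.items()
        }
        # Fan-in: collect each POD's intermediate results by case.
        collected: Dict[str, Dict[str, dict]] = {pod: {} for pod in per_case_pods}
        for (pod, case), future in futures.items():
            collected[pod][case] = future.result()
        # Second stage: comparison pieces run once all cases are collected.
        pending = {
            pod: pool.submit(fn, collected[pod])
            for pod, fn in cross_case_pods.items()
        }
        compared = {pod: future.result() for pod, future in pending.items()}
    return {"per_case": collected, "comparison": compared}
```

Nothing persists between invocations of `run_pipeline`, which matches the constraint above that POD output should not be carried across separate runs of the package.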

3 replies

jkrasting Mar 23, 2021
Maintainer

@tsjackson-noaa - cool stuff. I think that this type of workflow could be relevant for PODs that produce a product that can be used later. e.g. POD A produces a list of cyclone tracks and POD A' plots them. Another example is the radiative kernel POD we discussed a few weeks ago.

andrewgettelman Mar 24, 2021
Collaborator

I like @tsjackson-noaa 's concept of two streams or methods for the PODs working. Good to think of this as two tracks: one with pods which can handle multiple runs: they can use raw output, or the processed output that @jkrasting mentions above, and then internally loop over runs to produce plots.

For single run v. obs plots, it seems like wrapping them (or modifying their MDTF interface) to collect plots is a good way to go. As @tsjackson-noaa notes, this might require the framework taking over the web pages for these pods. That seems like a bit of work, so we should be careful to try to make this work well.

jkrasting Mar 24, 2021
Maintainer

We definitely need to consider what role the framework will play in the webpages going forward, @andrewgettelman

tsjackson-noaa · 2021-03-29T18:41:45Z

tsjackson-noaa
Mar 29, 2021
Author

Comments made during the mini-leads meeting on 29 March:

Need to be able to designate one (or multiple?) runs as "baseline" or "reference" runs, e.g. for the parameter perturbation or different warming scenario use cases.
- The simplest way to do this would be to let the user attach a dict of arbitrary metadata to each model run when defining the caselist, and leave the interpretation of that metadata up to the individual PODs. We could designate particular keywords for the use cases mentioned above, to ensure that different PODs do compatible things.
"Legacy mode" for existing PODs: want to be able to run existing PODs without needing to modify their code.
- Could simply run the POD on the first model run in the caselist.
- More usefully, could run the POD on every model run and put an entry for each in the output index.html file.
- Will need a flag (and perhaps other settings) in the PODs' settings.jsonc to designate that the POD supports multiple runs (non-legacy mode), but can make the absence of that flag mean that the POD is legacy-only.
Should we require that all PODs be able to run on a single experimental run, i.e. require that all multi-run functionality be strictly optional?
- Compatible with the use cases we've thought up so far.
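A sketch of how these points might fit together, assuming a hypothetical `multi_run` flag in the POD's settings and a hypothetical `metadata` dict with a `role` keyword on each caselist entry. None of these key names are settled; they only illustrate the idea.

```python
from typing import Any, Dict, List

Case = Dict[str, Any]


def plan_pod_runs(pod_settings: Dict[str, Any], case_list: List[Case]) -> List[List[Case]]:
    """Return the groups of cases a POD is invoked on.

    A POD that declares multi-run support gets all cases in one invocation.
    A missing flag means legacy mode: one invocation per case, each with
    its own entry in the output index.
    """
    if pod_settings.get("multi_run", False):
        return [list(case_list)]
    return [[case] for case in case_list]


def reference_cases(case_list: List[Case]) -> List[str]:
    """Names of the cases whose metadata marks them as reference runs."""
    return [
        case["CASENAME"]
        for case in case_list
        if case.get("metadata", {}).get("role") == "reference"
    ]
```

Because the metadata is an arbitrary dict, a POD that does not know the `role` keyword can ignore it without breaking.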

0 replies

andrewgettelman · 2021-03-29T20:56:51Z

andrewgettelman
Mar 29, 2021
Collaborator

Sorry I missed the call today. One other thing we have done is that the 'baseline' or reference run could also be labeled 'observations' and that holds a collection of observations for differencing (we did this for example with ERA-Interim reanalysis data).

0 replies

andrewgettelman · 2021-10-13T17:37:54Z

andrewgettelman
Oct 13, 2021
Collaborator

Discussion at the mini-leads meeting, Oct 13:

Allison Wing

storing intermediate output seems like a way to help support this capability
in my POD (not in the framework yet), I run the main analysis over each model run separately, and then the different runs are brought together in a secondary analysis & plotting stage (i.e., multi-model means and anomalies from them, putting all the models on one plot rather than separate plots). Intermediate output files are created from the first main analysis stage that then are used in the second stage for combined analysis and plotting

Paul Ullrich:

There's a hierarchy here: (1) PODs that only compare model to obs, (2) PODs that can run datasets in an embarrassingly parallel manner, (3) PODs that internally manage delegation

John Krasting

I support getting requirements from POD developers, but that means we really need them to get cracking early on their PODs. We can't wait until the last year of the phase for them to complete their PODs and say "oh, by the way, we need the framework to do X, Y, and Z ..."

Paul Ullrich
or the POD needs to provide a flag indicating that they don't support comparison, in which case MDTF would loop over all cases and then tabulate the outputs

John Krasting
Can I pitch an idea... Suggest a test case with a super simple diagnostic: CMIP6 for one variable (surface temperature). Multi-model AND CMIP6

Andrew Gettelman:
John's idea is a good one (general support for it)

1 reply

wrongkindofdoctor Oct 13, 2021
Maintainer

@ahmedfiaz also noted that we will likely need to parallelize the analyses in some fashion, as some of the existing PODs are already computationally expensive to run.

I will work with the task force in the coming months to gather requirements and draft program designs that will best meet the needs for multi-model/ensemble-type analyses.

andrewgettelman · 2021-10-13T20:36:18Z

andrewgettelman
Oct 13, 2021
Collaborator

Thanks @wrongkindofdoctor. Happy to help with design: I agree with @jkrasting 's idea about a simple test case.

The test case should not be convolved right now with more complex ideas for existing PODs, let's start with getting the framework to do something very (even trivially) simple. The CMIP6 example is a good one: just plot surface temperature (could even be zonal means: all on one plot) from a few CMIP6 models and don't break anything else. My 2 cents.

1 reply

dneelin Oct 13, 2021

Thanks @wrongkindofdoctor, noting a couple of extra things from today's discussion:
-Fiaz' summary of how his POD runs on CMIP6 is roughly consistent with Allison's above and he's up for helping with design;
-on the task force telecon we will aim to identify a couple of members who would be willing to interact in thinking through likely cases/requirements;
-David + Andrew reminded us of the extra data for PPE experiments and how to include; noting a comment from Tom March 29 above on a suggested solution

dneelin · 2021-11-10T16:55:35Z

dneelin
Nov 10, 2021

Summarizing a few items from above:
From Tom, March 29, regarding the need to designate a baseline or reference run:

attach a dict of arbitrary metadata to each model run when defining the caselist, and leave the interpretation of that metadata up to the individual PODs. We could designate particular keywords for the use cases
Leaving interpretation up to the POD seems important: PODs may do different things for a warming run (e.g., comparison to CC scaling) that they would not do for differencing from a reference run or obs

Example from Allison, consistent with Fiaz' POD running on CMIP6

in my POD (not in the framework yet), I run the main analysis over each model run separately, and then the different runs are brought together in a secondary analysis & plotting stage (i.e., multi-model means and anomalies from them, putting all the models on one plot rather than separate plots). Intermediate output files are created from the first main analysis stage that then are used in the second stage for combined analysis and plotting
Similar to Tom's flowchart of March 23, except that the control happens inside the POD

From Paul mini-Leads October 13, hierarchy:

(1) PODs that only compare model to obs, (2) PODs that can run datasets in an embarrassingly parallel manner, (3) PODs that internally manage delegation

0 replies

aradhakrishnanGFDL · 2021-12-06T16:51:00Z

aradhakrishnanGFDL
Dec 6, 2021
Maintainer

For info, based on the group input here and in meetings--

Here is the model we will work on for the prototype. Feedback will be requested after basic prototyping is complete.

Model 2 provides the capability for the MDTF framework to pass along input specifications that may span more than one input source (e.g. multiple ensemble members of an experiment) to a POD directly from the framework. In this case, a POD internally handles the processing of diagnostics from multiple experiments in its own fashion (serial or parallel). The POD handles the compilation and generation of output to a web page. At the end of the results generation phase, the MDTF framework presents the final webpage with links to each POD’s individual webpage.

The input settings will use the existing structure with the addition of new case descriptions in the case_list
Output variables will be a nested dictionary, with the first case maintaining backward compatible environment variables.
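One possible reading of this layout, as a sketch only: the first case's settings stay at the top level (what a single-case POD reads today), and every case is also available under a nested `cases` dictionary. The function name and the `cases` and `CASENAME` keys are assumptions for illustration, not the prototype's actual interface.

```python
from typing import Any, Dict, List


def build_pod_variables(case_list: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine per-case settings into one nested dictionary."""
    if not case_list:
        raise ValueError("case_list must contain at least one case")
    # Flat copy of the first case keeps single-case PODs working unchanged.
    variables: Dict[str, Any] = dict(case_list[0])
    # Every case, including the first, is reachable by name.
    variables["cases"] = {case["CASENAME"]: dict(case) for case in case_list}
    return variables
```

A multi-case POD would iterate over `variables["cases"]`, while an existing POD keeps reading the flat top-level values.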

More info and other use-cases to be discussed after the prototype.

0 replies
