OKR O.3.2.2: A flexible diagnostic module for ClimaAtmos #2043
Comments
This looks good. Thank you. A couple comments/requests:
|
We will add the |
You can use the If you want to add the CMIP names anyway, why not do it right away, as you are adding the variables? It will make comparisons with other models, which should start soon, easier. |
Based on an offline discussion, here is what we are going to do for names: I will update the SDI. |
Based on (another) offline discussion, we will use CMIP names for both |
#2064 contains a fully working implementation of the infrastructure underpinning this SDI. The diagnostic variables and the defaults are not yet populated. I am leaving some comments here on rough edges that will likely not be fixed at this point because they depend on other work:
|
#2064 is being merged, implementing the majority of this SDI. My next step is to work on the remapping, so that we can produce lat-long-z files for generic configurations.
|
2064: Add diagnostic module r=Sbozzolo a=Sbozzolo This PR adds a new diagnostic module that roughly follows what is described in #2043. Co-authored-by: Gabriele Bozzola <gbozzola@caltech.edu> Co-authored-by: LenkaNovak <lenka@caltech.edu> Co-authored-by: Zhaoyi Shen <11598433+szy21@users.noreply.github.com>
The above-mentioned distributed remapping is being implemented in CliMA/ClimaCore.jl#1475 |
#2179 implements the |
CliMA/ClimaCore.jl#1475 was merged. The two(/three) main outstanding items are:
|
The Climate Modeling Alliance
Software Design Issue 📜
Purpose
This SDI proposes to add a flexible module to compute arbitrary diagnostics from the simulation.
We want to be able to:
Our goals are:
`ClimaAtmos` (e.g., adding a new reduction operation),
Cost/Benefits/Risks
Diagnostics are currently hard-coded, so this is an important step towards a general and usable `ClimaAtmos`.
A proof-of-concept implementation is already available, and some of the challenges have been addressed.
A possible performance problem with the design outlined below is that diagnostics cannot trivially use information computed in other diagnostics.
People and Personnel
This design was discussed with @simonbyrne
Components
Diagnostics are implemented as callbacks in the integrator. At fixed intervals of (integration) time, the diagnostics are computed from the state and output to disk. In a nutshell, this SDI discusses a module to produce a list of callbacks.
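To make the callback idea concrete, here is a minimal, self-contained sketch. It does not use the actual integrator or callback types from `ClimaAtmos`; `ToyIntegrator`, `ToyCallback`, `every_n_iterations`, and `run_with_callbacks!` are hypothetical names invented for illustration, and the "timestep" is a placeholder.

```julia
# Toy stand-in for the integrator; all names in this block are hypothetical.
mutable struct ToyIntegrator
    u::Vector{Float64}   # state
    t::Float64           # current (integration) time
    dt::Float64          # timestep
    step::Int            # iteration counter
end

# A callback pairs a condition ("is it time?") with an action ("compute and write").
struct ToyCallback{C, A}
    condition::C
    action::A
end

# Build a callback that fires every `every` iterations.
every_n_iterations(action, every::Int) =
    ToyCallback(integrator -> integrator.step % every == 0, action)

function run_with_callbacks!(integrator, callbacks, nsteps)
    for _ in 1:nsteps
        integrator.u .*= 0.5              # placeholder for the real timestep
        integrator.t += integrator.dt
        integrator.step += 1
        for callback in callbacks
            callback.condition(integrator) && callback.action(integrator)
        end
    end
    return integrator
end

output = Tuple{Float64, Float64}[]        # stands in for a file on disk
write_max = every_n_iterations(2) do integrator
    push!(output, (integrator.t, maximum(integrator.u)))
end

run_with_callbacks!(ToyIntegrator([8.0, 4.0], 0.0, 1.0, 0), [write_max], 4)
```

In this picture, the diagnostic module only has to produce the list passed as `callbacks`; the time-stepping loop itself does not need to know which diagnostics exist.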
We will initially focus on point-wise operations and HDF5 files. Online remapping and producing NetCDF files can be implemented as a different `output_writer` (see below) and will be tackled after the main infrastructure is put in place.
The low-level details
(Snippets of code below are to be considered pseudo-code.)
DiagnosticVariable
We represent a diagnostic variable as `struct`s that look like (roughly following `ClimateMachine`)
Fundamentally, a `DiagnosticVariable` is a recipe on how to compute a given diagnostic variable. Arguably, most of this `struct` is not really needed. The key field is `compute_from_integrator`, which provides the recipe on how to obtain the value of the diagnostic variable from the integrator.
`long_name`, `units`, and `description` are provided for documentation. In the future, we can put in place a simple script to produce a table to add to the documentation to list what diagnostics can be computed (as in https://clima.github.io/ClimateMachine.jl/latest/DevDocs/DiagnosticVariableList/). We add these fields also to encourage good practices in documenting the diagnostic variables.
The `short_name` is primarily the variable name in the output files, and is unique. The `long_name` is a descriptive name. We will follow the CMIP table wherever available. We don't need a `standard_name` as we are already following CMIP for short names.
`compute_from_integrator` has to be a function that takes two arguments: the `integrator` object, and an optional pre-allocated `output` space. If `output` is not `nothing`, the diagnostic is computed in-place, otherwise new memory is allocated. An example of `compute_from_integrator` to compute air temperature might look like
Supporting this syntax (with the optional `out`) requires adding a new method to `ClimaCore`.
The `DiagnosticVariable` `struct` is also an (optional) public interface. Users/developers that want to add more diagnostic variables can define their own.
`ClimaAtmos` will provide a collection of `DiagnosticVariable`s in a dictionary `all_diagnostics`. Developers can make new diagnostics available by adding new `DiagnosticVariable`s. The integrator contains the atmospheric model, so developers can dispatch model-specific calculations on that.
ScheduledDiagnostic
`DiagnosticVariable`s are the ones that we know how to compute. The ones we are actually computing in a given simulation are described by `ScheduledDiagnostic` objects, which are
A `ScheduledDiagnostic` is a variable that is computed and output. The `struct` contains:
`variable` we want to compute (and internally, this gives us information about how to compute the diagnostic and its name)
`reduction_time_func`. When a function is passed to `reduction_time_func`, we allocate an accumulator for this specific `ScheduledDiagnostic` and we repeatedly apply `reduction_time_func` at the end of every iteration until we reach `period`. If `reduction_time_func` is `nothing`, no time reduction is performed; instead, the `variable` is output as is every `period_iterations` iterations.
`reduction_space_func`. This will be called before writing the diagnostic. This will not be implemented at this stage and only point-wise diagnostics are considered.
`output_writer` is expected to take three arguments: the value that has to be written, the `DiagnosticVariable`, and the `integrator`.
Details about this `struct` might change as more complexity is added (e.g., we might want to add a `skip_initial` field).
We will provide factories to produce `output_writer`s for standard use-cases. For example, to write to HDF5 files given their path. Having a rich `output_writer` function allows us to support complex behaviors (such as creating new files, or appending to existing ones, or all sorts of combinations).
We work with iterations instead of time because it is well-defined and unambiguous. We will provide a second constructor for `ScheduledDiagnostic` that is more intuitive and that enforces constraints. For example,
We allow multiple `ScheduledDiagnostic`s for a given `DiagnosticVariable` (for example, if we want to have mean daily and yearly temperature).
Note that this is also an (optional) public interface. Users/developers that want to add or change diagnostics can define their own.
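The pieces described above can be sketched together in plain Julia. This is an illustration under stated assumptions, not the implementation in `ClimaAtmos`: the field names follow the text, but `schedule_every`, `reduction_identity`, `ToyAtmosIntegrator`, `compute_ta`, and `run_with_diagnostic!` are hypothetical, the state is a plain `Vector` rather than a `ClimaCore` field, and the accumulator is reset to a hard-coded identity element after every write.

```julia
# Recipe for one diagnostic; field names follow the description above.
struct DiagnosticVariable{F}
    short_name::String
    long_name::String
    units::String
    description::String
    compute_from_integrator::F       # (integrator, out) -> values
end

# A diagnostic that is actually computed and written during a simulation.
struct ScheduledDiagnostic{V, R, W}
    variable::V
    period_iterations::Int           # write every this many iterations
    reduction_time_func::R           # e.g. `max`, `+`, or `nothing`
    output_writer::W                 # called as (value, variable, integrator)
end

# Hypothetical convenience constructor: takes a period in seconds and
# enforces that it is a whole number of timesteps.
function schedule_every(variable; period_seconds, dt, reduction = nothing, writer)
    n = period_seconds / dt
    isinteger(n) || error("period must be an integer multiple of dt")
    return ScheduledDiagnostic(variable, Int(n), reduction, writer)
end

# Identity elements are hard-coded for each supported reduction.
reduction_identity(::typeof(+), ::Type{T}) where {T} = zero(T)
reduction_identity(::typeof(max), ::Type{T}) where {T} = typemin(T)
reduction_identity(::typeof(min), ::Type{T}) where {T} = typemax(T)

# Toy integrator holding only what this sketch needs.
mutable struct ToyAtmosIntegrator
    temperature::Vector{Float64}
    step::Int
end

# Compute in place when `out` is given, otherwise allocate.
function compute_ta(integrator, out)
    isnothing(out) && return copy(integrator.temperature)
    out .= integrator.temperature
    return out
end

function run_with_diagnostic!(integrator, diag, nsteps)
    reduce_f = diag.reduction_time_func
    compute = diag.variable.compute_from_integrator
    scratch = compute(integrator, nothing)            # allocate once up front
    acc = if isnothing(reduce_f)
        nothing
    else
        fill(reduction_identity(reduce_f, eltype(scratch)), size(scratch))
    end
    for _ in 1:nsteps
        integrator.temperature .+= 1.0                # placeholder timestep
        integrator.step += 1
        compute(integrator, scratch)
        isnothing(acc) || (acc .= reduce_f.(acc, scratch))
        if integrator.step % diag.period_iterations == 0
            value = isnothing(acc) ? scratch : acc
            diag.output_writer(copy(value), diag.variable, integrator)
            isnothing(acc) || fill!(acc, reduction_identity(reduce_f, eltype(acc)))
        end
    end
    return integrator
end

written = Vector{Float64}[]                           # stands in for a file
ta = DiagnosticVariable("ta", "Air Temperature", "K", "Toy air temperature", compute_ta)
diag = schedule_every(ta; period_seconds = 3.0, dt = 1.0, reduction = max,
                      writer = (value, variable, integrator) -> push!(written, value))
run_with_diagnostic!(ToyAtmosIntegrator([280.0, 290.0], 0), diag, 6)
```

Scheduling the same `ta` variable twice with different periods or reductions only requires building a second `ScheduledDiagnostic`; each one would own its accumulator.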
To run a simulation, we collect all the `ScheduledDiagnostic`s we want to run into a `DiagnosticTable` (which is just an iterable; we will not define a new type for this). Upon initialization of the simulation, the `DiagnosticTable` is parsed to pre-allocate all the accumulators and counters and prepare all the callbacks that are going to compute and output the diagnostics.
A technical note here is that we will have to restrict the space of allowed reductions in time to the subset of operations for which we know the identity of the group (e.g., for the arithmetic average of numbers, the value `0` is the identity of the group; we will have to hard-code this).
The higher level interfaces
The interfaces described in the previous section are available to users and developers, but are too detailed for running most simulations (we still expect developers to use them to extend `ClimaAtmos`). Therefore, we will also provide higher level functions for common operations and model-dependent defaults.
`get_default_diagnostics(AtmosModel)` will return the list of default diagnostics for the various components in `AtmosModel`. This will be done by recursively asking the various submodules for their defaults, so that users can obtain the defaults of only specific submodules if they want to. With this function, we expect that only one line of code will be needed for users that want to output the default diagnostics for their specific model.
Examples of other convenience functions that we will provide are (given a `DiagnosticTable`):
`add_diagnostics!(diagnostic_table::List, variables::List[DiagnosticVariable], output_file::String)`
`add_daily_averages!(diagnostic_table::List, variables::List[DiagnosticVariable], output_file::String)`
`add_precipitation_diagnostics!(diagnostic_table::List, variable_names::List[String], output_file::String)`
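As a rough illustration of how such convenience functions could sit on top of the low-level interface, the sketch below fills a plain `Vector` used as the diagnostic table. The stand-in types `ToyVariable` and `ToyScheduledDiagnostic`, the `period_iterations` and `dt` keyword arguments, and the choice of recording the output file path instead of a real writer are all assumptions made to keep the block self-contained.

```julia
# Minimal stand-ins so the block is self-contained; the structs described
# earlier carry more fields. All names here are hypothetical.
struct ToyVariable
    short_name::String
end

struct ToyScheduledDiagnostic{R}
    variable::ToyVariable
    period_iterations::Int
    reduction_time_func::R
    output_file::String
end

# Schedule each variable to be written as is every `period_iterations` iterations.
function add_diagnostics!(table::Vector, variables, output_file::String;
                          period_iterations::Int = 1)
    for variable in variables
        push!(table, ToyScheduledDiagnostic(variable, period_iterations, nothing, output_file))
    end
    return table
end

# Schedule daily sums (reduction `+`); turning the sum into a mean by dividing
# by the number of samples is left out of this sketch.
function add_daily_averages!(table::Vector, variables, output_file::String; dt::Real)
    seconds_per_day = 86_400
    iterations_per_day = seconds_per_day / dt
    isinteger(iterations_per_day) || error("dt must divide one day evenly")
    for variable in variables
        push!(table, ToyScheduledDiagnostic(variable, Int(iterations_per_day), +, output_file))
    end
    return table
end

diagnostic_table = Any[]
add_diagnostics!(diagnostic_table, [ToyVariable("ta")], "instantaneous.h5")
add_daily_averages!(diagnostic_table, [ToyVariable("ta"), ToyVariable("ua")], "daily.h5"; dt = 600)
```

Because the table is just an iterable, these helpers reduce to `push!` calls, and the same variable can appear several times with different schedules.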
This interface can be used in a script. Alternatively, users can specify the diagnostics they want to have in a YAML file that looks like:
Internally, this is parsed and evaluated with the constructor for `ScheduledDiagnostic`, and then a `DiagnosticTable` is compiled, so that we are brought back to the low-level case.
Results and Deliverables
The diagnostic module is an important part of `ClimaAtmos`. We will target outstanding levels of documentation. We will verify that the overhead due to the abstractions we put in place does not degrade performance significantly with respect to the main integration loop.
Task Breakdown And Schedule
We have a first proof-of-concept implementation: https://github.com/CliMA/ClimaAtmos.jl/tree/gb/diagnostics
This first implementation already does everything we want, but with several hard-coded values and some workarounds. It is currently being built on top of an experimental interface that bypasses the driver (see upcoming SDI). We will add the changes to the driver once the implementation stabilizes.
Rough timeline:
SDI Revision Log
CC
@tapios @simonbyrne @cmbengue
EDITS: Typos