O2.3.11 Add land/sea mask to ClimaCore domain #488

kmdeck · 2024-02-08T19:52:58Z

Purpose

For global runs, we currently set up a spherical shell domain (extruded FD) and create fields that are defined everywhere on the domain. The issues arise in three areas:

Interpolation: land parameters are defined over land only. When we interpolate these to the model grid, we need to be aware of the mask so as not to get unphysical land parameters over the ocean. For example, the van genuchten n parameter must be > 1. If over the ocean, this is zero, a cell with a large fraction of ocean will have an unphysical value. This same type of issue arises when we interpolate/remap the state to a lat/lon grid for plotting.
Solving the equations over the ocean: we need to set up initial conditions in regions where there is no land that wont break anything, and we evaluate tendencies over fake land in the ocean, which is inefficient, and these cant break anything
Other: there are sometimes multiple masks (coming from different native resolutions of parameter files, e.g.).

A related issue is that ultimately we would like to support lateral flow in the soil. This requires boundary conditions in the horizontal.

The benefits to what we are doing now is that (I think) regridding in the coupler is easy because land and atmos have the exact same space at the surface

Priorities

In the short term (i.e. done by this fall, for our paper), treating the land as a collection of independent columns is fine. We would not evaluate the model over ocean and not have to worry about setting values of parameters/IC over the ocean. That is, it would be nice to have a custom ClimaCore domain which is a set of independent columns which does not span the entire globe. Then we alleviate basically all of the issues above.
In the long term (end of next year?) supporting boundary conditions and lateral flow is a goal, but this is not essential and requires a lot more work. Note that there are two levels to this:
- regional simulations (hybrid box - no mask required, but boundary conditions in the horizontal are required). Here we can learn from Parflow, e.g.,
- global simulations (mask + horizontal BC) - can we learn from the ocean model/do what they do?

Proposed Work Summer 2024

ClimaLand:

short term: (what is in place now as of Aug 2024): set ocean values to mean over land, Ignore NaNs over the ocean that result as a function of the simulation. (on CPU, we might get breaking errors, while on GPU, running with NaNs does not break the sim)

ClimaCore:

Define a new type of domain which is a set of independent columns not defined over the entire globe
no DSS should be required and this should make the land model more efficient
Interpolation to the model grid needs to be mask aware
How would the flux exchange need to adapt if the land model had a very different grid?

Proposed Work 2025

Enable boundary conditions for the horizontal using a mask. These BC would be applied where the mask transitions from 1->0 (land -> ocean). The prognostic variable fields would still be defined globally.

Notes from a while ago - need to revisit
When we have lateral flow we want to make use of the current topology/connectivity of points and how this is set up in ClimaCore. This is because it is easier to use a mask in boundary condition setting in the horizontal than to create a totally unstructured mesh.

I checked with Andre and he confirmed that this is what the ocean model does. there are prognostic variables over land but the tendencies are never evaluated. Only indices which are in the mask = 1 region are loaded onto the GPU.

on the ClimaLand side:
This solution means that we need to use the mask on the land side because we will still have fields defined over the ocean.

We will need to:

ingest the land/sea mask and store somewhere where it is accessible to the tendency
initial state vector = 0 everywhere or we can set to NaN everywhere (?).
Set initial conditions using the mask (NaN over ocean)
Parameter data should be masked already (NaN over the ocean). When regridding, we'll get back Nans in the ocean for the field. Interpolating to model grid needs to be mask aware
Write a macro which can be used in any broadcasted function when mask = 1, like:

#=
This assumes that `mask` is in local scope
=#
macro masked(expr)
    :(@. ifelse(mask == 0, 0, expr))
end

function tendency_func!(Y, p, t)
    (; mask) = p
    (; x, z) = p # fields
    (; ϕ, ψ) = Y # fields
   # @. ϕ = some_tendency(ψ, x, z) # current
  #  @. ϕ = ifelse(mask == 0, 0, some_tendency(ψ, x, z)) # what we want
    @masked ϕ = some_tendency(ψ, x, z) # cleaner way to implement
end

Repeat for all broadcasted expressions in the tendency, cache, etc.
use the mask in plotting (maybe not if NaN in ocean)

is memory an issue since we are stored fields everywhere, when land is only 30% of the Earth's surface? We should be computed limited on GPU and not memory limited (this is true for the ocean model).

@sriharshakandala @juliasloan25 @Sbozzolo

Tasks

Give feedback

Use ETOPO 2022 global relief model data to generate element-level land-sea mask
Add element mask to ClimaCore spaces
Block FD/SE/tendency operations for masked elements in ClimaCore
Options

The text was updated successfully, but these errors were encountered:

kmdeck · 2024-02-08T19:55:22Z

@charleskawczynski
no immediate rush to answer, but this is something we'd like to figure out in the next month or two

kmdeck · 2024-02-13T22:48:52Z

@juliasloan25 has found that this also affects any run (including coupled runs) using the Bucket model, because the land albedo is not defined in the Artic.

charleskawczynski · 2024-02-14T02:48:33Z

My first thought was to bite the bullet and develop a special ClimaCore MaskedField, but the more that I think of it, the more I’m thinking that it’s easier to handle tendency case-by-case.
For example:

@. field_tendency = ifelse(mask==0, field_tendency, foo(Y.c))

We might be able to write a macro to write this more compactly, but I think this pretty effectively reuses our existing infrastructure. Also, this should still work with foo(Y.c)) as a field or a function/expression. So, interpolations etc should still work. Using broadcast expressions is kind of important in terms of letting ClimaCore handle metric terms properly, so I think it’s important to not overlook this key feature.

I’m happy to chat more about this if I’ve missed something or if you think there may be more to the story.

kmdeck · 2024-02-14T17:29:12Z

Hi Charlie,

thanks for thinking about this! My concern with masking the tendency is that then we also have to mask the initial conditions or set IC over the ocean (ditto with all of our spatial parameter fields), and possibly use the mask in plotting (Im not sure the value of zero for parameters and the state will produce non-NaN tendencies, so we may not be able to set ocean-area values to zero and have it just work). it ends up getting kind of hacky.

we can discuss more! I can also show you our code and we can look at how complex it would be to implement the above^^

Our tendency functions have field vector arguments, so it didnt seem like they would broadcast as we want.

charleskawczynski · 2024-02-14T17:34:23Z

we can discuss more! I can also show you our code and we can look at how complex it would be to implement the above^^

I think that would be really helpful

kmdeck · 2024-02-23T19:43:23Z

@simonbyrne, Tapio also suggested asking your opinion on this. He had the impression from you that restricting the grid to only include continents, while maintaining information about connectivity (to support tendencies in the horizontal, and, eventually, non-periodic boundary conditions), was not that difficult.

If that is the case, it seems like the load-balancing problem would not arise, nor would the land model need to worry about about only evaluating tendencies over land, because our fields would not be defined over the ocean.

thank you!

simone-silvestri · 2024-02-26T15:38:48Z

Hi @kmdeck.

In Oceananigans we have a structured mesh, so we maintain the structure but launch an unstructured kernel, i.e. we (1) map out the active cells in a linear map with map[t] = (i, j, k), (2) launch a kernel with a number of threads equal to the total number of active cells and finally (3) map the unique thread index to the three-dimensional grid index inside the function.

In distributed computations, to load balance it is enough to calculate the total number of active cells in the global grid and divide them by processor. We do keep the memory associated with immersed cells though. In your case, it is probably better to go fully unstructured since you are using climacore that gives you this ability

kmdeck · 2024-02-27T17:34:30Z

@dennisYatunin, if we had NaNs as our field values over the ocean, and then tried to step Y.soil.variable implicitly, would we always use the max number of iterations over the ocean, since NaN < tolerance is always false?

dennisYatunin · 2024-02-27T20:12:57Z

Depends on how you have things set up. If you're using the convergence checker specified here, then yes. You can also pass a custom norm to the convergence checker, though. The default is LinearAlgebra.norm, but you can define an alternative version that masks every field before reducing it with the standard L2-norm.

You can also just hardcode the number of Newton iterations, and avoid using a convergence checker altogether. That's what we do in ClimaAtmos.

Sbozzolo · 2024-09-06T01:48:46Z

It would be good if we could also skip the evaluation of external forcings (ie, read from file) on masked values too.

This is not as critical because it probably won't affect performance that much and it not causing issues NaNs.

kmdeck added the SDI Software Design Issue label Feb 8, 2024

kmdeck self-assigned this Feb 8, 2024

kmdeck assigned charleskawczynski Feb 21, 2024

juliasloan25 changed the title ~~Handling land/sea masks~~ O2.3.11 Add land/sea mask to ClimaCore domain Apr 26, 2024

juliasloan25 mentioned this issue Apr 26, 2024

Land/Sea mask/ Loop over columns #48

Closed

kmdeck added this to the 2.3.11 Add land/sea mask milestone Apr 26, 2024

kmdeck assigned kmdeck, Sbozzolo, juliasloan25 and sriharshakandala and unassigned charleskawczynski and kmdeck Jun 5, 2024

juliasloan25 mentioned this issue Jun 6, 2024

O2.3.10 Improve performance of data handling software (data reading, regridding, etc), given quantity of reanalysis data CliMA/ClimaUtilities.jl#49

Closed

kmdeck assigned AlexisRenchon and braghiere Aug 28, 2024

Sbozzolo removed their assignment Sep 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

O2.3.11 Add land/sea mask to ClimaCore domain #488

O2.3.11 Add land/sea mask to ClimaCore domain #488

kmdeck commented Feb 8, 2024 •

edited by sriharshakandala

Loading

Tasks

kmdeck commented Feb 8, 2024

kmdeck commented Feb 13, 2024

charleskawczynski commented Feb 14, 2024

kmdeck commented Feb 14, 2024

charleskawczynski commented Feb 14, 2024

kmdeck commented Feb 23, 2024

simone-silvestri commented Feb 26, 2024

kmdeck commented Feb 27, 2024

dennisYatunin commented Feb 27, 2024

Sbozzolo commented Sep 6, 2024

O2.3.11 Add land/sea mask to ClimaCore domain #488

O2.3.11 Add land/sea mask to ClimaCore domain #488

Comments

kmdeck commented Feb 8, 2024 • edited by sriharshakandala Loading

Purpose

Priorities

Proposed Work Summer 2024

Proposed Work 2025

Tasks

kmdeck commented Feb 8, 2024

kmdeck commented Feb 13, 2024

charleskawczynski commented Feb 14, 2024

kmdeck commented Feb 14, 2024

charleskawczynski commented Feb 14, 2024

kmdeck commented Feb 23, 2024

simone-silvestri commented Feb 26, 2024

kmdeck commented Feb 27, 2024

dennisYatunin commented Feb 27, 2024

Sbozzolo commented Sep 6, 2024

kmdeck commented Feb 8, 2024 •

edited by sriharshakandala

Loading