Implicit Solver Interface Updates #1230
Comments
If we imagine a system with n prognostic variables, and denote the tendency due to the boundary term as Two questions
|
Can we:
|
This describes the current issues well and clearly. The description of the solution could use more detail (though the draft PR helps). For us to be able to track progress, could you please add milestones with expected completion dates (roughly weekly milestones)? It will also be important for @sriharshakandala, @charleskawczynski, and @simonbyrne to review, especially with regard to the GPU-friendliness of the proposed solutions. Wherever this refactoring lands, it will be important to keep the current Schur complement formulation around (e.g., for convection-resolving simulations). It may be nice to allow this as an option that lives within this refactoring, rather than outside of it. |
Also, please do not leave the simple array-based implementation test as an afterthought. It will be helpful for evaluating the design before it is fully implemented, and it will be too late to help reviewers judge the quality of the design if it is the very last thing that is added. cc @dennisYatunin |
@kmdeck @juliasloan25 After discussing the matter with @simonbyrne, it looks like we will be able to set the derivatives of operator boundary conditions without adding any new operators (and without any assumptions like Here are some concrete examples:
|
@tapios @charleskawczynski I have made the following updates to the SDI based on your comments:
After a lengthy discussion with Simon and Sriharsha today, it has become clear that the "permuted band matrix solve" algorithm will require a fair bit of work to ensure that it is performant and GPU-compatible. However, all of the tasks that come before the implementation of this new algorithm (up to testing the interface in ClimaAtmos) should be good as-is. Per the revised time estimates, these tasks should take 3--4 weeks, which should be enough time for us to settle on a decent implementation of the new algorithm. If I start later this week, this means that the interface will be tested in ClimaAtmos by around this time in June. |
@simonbyrne @sriharshakandala Here is a list of all the matrix sparsity patterns that
For ClimaLSM, we only need to support the sparsity pattern of the matrix specified in the comment above, In the future, ClimaLSM will also include the prognostic variable For the ClimaAtmos dycore, the sparsity pattern of However, instead of using the exact value of For AMIP, we only need to support When we add EDMF to ClimaAtmos, we end up with a significantly more complicated Jacobian. We are not entirely certain of how sparse we can afford to make our approximation of Since we plan to approximate the bottom-left matrix block as 0, our strategy for solving the linear problem This means that our implicit solver will first compute changes at the sub-grid-scale ( In the near future, we only plan to use EDMF with |
I appreciate your thoroughness, @dennisYatunin! One important simplification, though: In the dycore, we can treat all tracers fully explicitly, so T=0 is fine, or, if you want to include moisture, T=1 (but even that is not strictly necessary). We only need to treat the fast waves (acoustic and gravity) implicitly, and tracers play no role in them. |
@tapios Thanks! We actually stopped making that assumption in ClimaAtmos sometime last year. I think the intent was to ensure that all forms of moisture get treated roughly the same way as energy in the implicit solve, but neither Daniel nor Zhaoyi nor I remember how significant the resulting timestep increase was. Should we revert to making that approximation, or add a flag to toggle between the two options? In either case, the "Schur complement solve" scales very well with |
@dennisYatunin It's ok to treat the moisture tracers (q_t, q_l, q_i) implicitly and in the same way as energy. But no need to do the same for other tracers. |
@dennisYatunin : |
@sriharshakandala We are indeed storing the matrices in a traditional band storage format, except we are using "row-major" storage instead of "column-major" storage (the link you sent uses "column-major" storage). We are storing the matrices row-by-row in order to encode them as |
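For readers unfamiliar with the distinction, here is a minimal sketch of row-by-row band storage in plain Julia. It is only an illustration: the function name `band_matvec`, the tuple-per-row layout, and the example values are assumptions made for this sketch, not ClimaCore's actual data structures.

```julia
# Row-major band storage (illustrative): one tuple of band entries per matrix row.
# rows[i][k] holds A[i, i + k - 1 - l], where l is the number of sub-diagonals.
function band_matvec(rows::Vector{NTuple{N, T}}, l::Int, x::Vector{T}) where {N, T}
    n = length(rows)
    y = zeros(T, n)
    for i in 1:n, k in 1:N
        j = i + k - 1 - l  # column index of the k-th stored entry in row i
        if 1 <= j <= n     # entries that fall outside the matrix are padding
            y[i] += rows[i][k] * x[j]
        end
    end
    return y
end

# 4x4 tridiagonal matrix: sub-diagonal 1, diagonal 2, super-diagonal 3.
# The first and last rows carry one padding entry each.
rows = [(0.0, 2.0, 3.0), (1.0, 2.0, 3.0), (1.0, 2.0, 3.0), (1.0, 2.0, 0.0)]
x = [1.0, 1.0, 1.0, 1.0]
y = band_matvec(rows, 1, x)
```

With this layout, all of the entries needed to process one row sit next to each other, which is the property the row-by-row choice is meant to provide.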
Thanks, @dennisYatunin. For the quad-diagonal and penta-diagonal matrices, I am assuming that we either lose them in the Jacobian matrix approximation or that the bands are located contiguously! |
@tapios Time estimates have been updated! |
Thanks, @dennisYatunin. Looking good! |
1326: Add new MatrixFields module, along with unit tests and performance tests r=dennisYatunin a=dennisYatunin

# Purpose

This PR encompasses the first two tasks of the implicit interface updates [SDI](#1230). That is, it refactors `StencilCoefs` into `BandMatrixRow` and `ApplyStencil`/`ComposeStencils` into `MultiplyColumnwiseBandMatrixField`, and it also adds a comprehensive set of tests. In the process of refactoring, the code for low-level matrix operations has been simplified and useful new functionality has been added. In order to avoid conflicts with pre-existing code, the new data structures and functions have been placed in a separate module called `MatrixFields`; this module can be removed once we are ready to replace all of the old code.

## Content

### API improvements

- Matrix field rows now have simple constructors (e.g., `DiagonalMatrixRow` and `TridiagonalMatrixRow`) that can be used to construct matrix fields on the fly in broadcast expressions.
- Matrix fields are now automatically promoted to common types for basic arithmetic operations. This involves promoting the entries in the rows, as well as padding rows with zeros when rows have unequal numbers of entries. In addition, `LinearAlgebra.I` (or, rather, `LinearAlgebra.UniformScaling`) is automatically promoted to a `DiagonalMatrixRow` when used with `BandMatrixRow`s. So, for example, it is now possible to evaluate `DiagonalMatrixRow(1) - 2 * TridiagonalMatrixRow(3, 4, 5) - I / 6`.
- Matrix fields are now fully integrated with `RecursiveApply`, which allows a single matrix field to store multiple matrices by using `Tuple`s or `NamedTuple`s as entries instead of scalars. This is achieved by using the new `rzero` function instead of `zero`, and also by making `+`, `-`, `*`, and `/` call `radd`, `rsub`, `rmul`, and `rdiv` when used on `BandMatrixRow`s. This change will make it possible to, e.g., store the derivatives of all tracer tendencies with respect to vertical velocity as a single matrix field in `ClimaAtmos`, which will lead to simpler code and better performance.
- All matrix-matrix and matrix-vector multiplications can now be handled using the `⋅` operator, which is an alias for `MultiplyColumnwiseBandMatrixField()`. The implementation of this operator is much simpler than the implementation of `ApplyStencil`/`ComposeStencils`, and it is effectively localized to the single function `multiply_matrix_at_index`. In order to avoid inference failures for complex matrix field broadcast expressions, the recursion limit for this function needed to be disabled. Unfortunately, just as with the pre-existing code, inlining this function with `@propagate_inbounds` drastically increased compilation time, and it also caused inference failures for particularly complicated matrix broadcast expressions. Although not using `@propagate_inbounds` causes a 3--4x slowdown in the evaluation of matrix broadcast expressions, it avoids a 10--100x slowdown in their compilation.
- The matrix of the divergence operator can now be properly represented as a matrix of covectors, rather than a matrix of scalars. Since the divergence operator turns vectors into scalars, multiplying its matrix by a matrix/vector of vectors should return a matrix/vector of scalars. However, this is not as simple as just multiplying a covector by a vector and returning a scalar, since the vector may need to be projected onto the dual axis of the covector before the multiplication. So, matrix-matrix and matrix-vector multiplication is now implemented using the `rmul_with_projection` function instead of `rmul`. In general, this function works just like `rmul`, but, when the first argument is a covector (an `AdjointAxisVector`) and the second is a vector or higher-order tensor (an `AxisTensor`), it uses the local geometry to project the second argument before the multiplication. In the future, we may want to generalize this to higher-order cotensors.
- The `MatrixFields` module also includes functions for converting matrix fields to more conventional arrays that are compatible with `LinearAlgebra.jl`. The function `column_field2array` converts a matrix field defined on a column space to a `BandedMatrix` from `BandedMatrices.jl`, and it also converts a regular field defined on a column space to a `Vector`. The function `field2arrays` works for fields defined on multiple columns by calling `column_field2array` on each column.
- Matrix fields now get displayed in the REPL as actual matrices. Specifically, when a matrix field contains `Number` entries, its first column is passed to `column_field2array_view`, which is identical to `column_field2array`, except that it allocates less memory by returning a view of the field's underlying data. Although `column_field2array` and `column_field2array_view` work for matrix fields with `Tuple`/`NamedTuple` entries, the `show` method for a `BandedMatrix` crashes when displaying such entries. In addition, this `show` method results in somewhat unreadable output for other struct entries, like vectors and covectors. So, all non-`Number` matrix fields still get displayed in the REPL like regular fields.

### Related ClimaCore Changes

- To ensure that promotion and conversion of `BandMatrixRow` entries works correctly for nested `Tuple`/`NamedTuple` entries, the `rpromote_type` and `rconvert` functions have been added to `RecursiveApply`. The first function uses `rmaptype` to recursively call `promote_type`, and the second function uses `rmap` and `rzero` to make a more type-stable version of Julia's built-in `convert` function.
- To support manipulating matrices of covectors (and, in the future, matrices of higher-order cotensors), all basic arithmetic operations have been defined for `AdjointAxisTensor`. This includes `+`, `-`, `*`, `/`, `\`, and `==`, as well as the `zero` function. The new methods undo the adjoint and fall back to the methods for `AxisTensor`.
- To simplify the setup of the non-scalar matrix tests, `map` has been generalized from only working for a single `Field` to working for multiple `Field`s.

### Tests

- All of the following test files check for correctness, allocations, and type instabilities.
- The file `test/MatrixFields/band_matrix_row.jl` tests low-level operations with `BandMatrixRow`s, ensuring that constructors, conversions, arithmetic operations, and combinations with `LinearAlgebra.I` work for numbers and nested `Tuple`/`NamedTuple` values.
- The file `test/MatrixFields/rmul_with_projection.jl` tests `rmul_with_projection` for all currently supported combinations of scalars, vectors, tensors, and covectors, as well as several combinations of nested `Tuple`/`NamedTuple` values.
- The file `test/MatrixFields/field2array.jl` checks that `column_field2array`, `column_field2array_view`, and `field2arrays` work as expected for very simple matrix fields.
- The file `test/MatrixFields/matrix_field_broadcasting.jl` tests a wide range of scalar and non-scalar matrix field broadcast expressions. It compares each scalar matrix field broadcast expression against an equivalent implementation using `field2arrays` and LinearAlgebra's `mul!`, making sure that the matrix field implementation is roughly as performant as the array implementation. It also compares each non-scalar matrix field broadcast expression (e.g., an expression with matrices of covectors or `NamedTuple`s) against an equivalent scalar matrix field implementation.
- I have also checked that the new implementation is about as performant as the pre-existing implementation for those cases that the pre-existing API can support. For example, to compare the two implementations for the "diagonal matrix times bi-diagonal matrix times tri-diagonal matrix times quad-diagonal matrix times vector" test case, run the following code after `test/MatrixFields/matrix_field_broadcasting.jl`:

```
using ClimaCore: MatrixFields, Operators

# Relies on helpers defined by the test file above
# (random_test_fields, call_func, @benchmark, and ⋅).
function compare_to_old_implementation()
    ᶜᶜmat, ᶜᶠmat, ᶠᶠmat, ᶠᶜmat, ᶜvec, ᶠvec = random_test_fields(Float64)

    # New implementation: a chain of ⋅ products.
    result = @. ᶜᶜmat ⋅ ᶜᶠmat ⋅ ᶠᶠmat ⋅ ᶠᶜmat ⋅ ᶜvec
    inputs = (ᶜᶜmat, ᶜᶠmat, ᶠᶠmat, ᶠᶜmat, ᶜvec)
    func!(result, ᶜᶜmat, ᶜᶠmat, ᶠᶠmat, ᶠᶜmat, ᶜvec) =
        @. result = ᶜᶜmat ⋅ ᶜᶠmat ⋅ ᶠᶠmat ⋅ ᶠᶜmat ⋅ ᶜvec
    time, allocs = @benchmark call_func(func!, result, inputs)

    # Old implementation: convert each row to StencilCoefs, then compose and apply.
    to_old_version(row::BMR) where {BMR <: MatrixFields.BandMatrixRow} =
        Operators.StencilCoefs{MatrixFields.outer_diagonals(BMR)...}(row.entries)
    apply = Operators.ApplyStencil()
    compose = Operators.ComposeStencils()
    ᶜᶜmat_old = to_old_version.(ᶜᶜmat)
    ᶜᶠmat_old = to_old_version.(ᶜᶠmat)
    ᶠᶠmat_old = to_old_version.(ᶠᶠmat)
    ᶠᶜmat_old = to_old_version.(ᶠᶜmat)
    result_old = @. apply(compose(compose(compose(ᶜᶜmat_old, ᶜᶠmat_old), ᶠᶠmat_old), ᶠᶜmat_old), ᶜvec)
    inputs_old = (ᶜᶜmat_old, ᶜᶠmat_old, ᶠᶠmat_old, ᶠᶜmat_old, ᶜvec)
    func!_old(result_old, ᶜᶜmat_old, ᶜᶠmat_old, ᶠᶠmat_old, ᶠᶜmat_old, ᶜvec) =
        @. result_old = apply(compose(compose(compose(ᶜᶜmat_old, ᶜᶠmat_old), ᶠᶠmat_old), ᶠᶜmat_old), ᶜvec)
    time_old, allocs_old = @benchmark call_func(func!_old, result_old, inputs_old)

    @assert allocs == allocs_old == 0
    @info "New implementation time: $time s"
    @info "Old implementation time: $time_old s"
end

compare_to_old_implementation()
```

This code prints out the following:

```
[ Info: New implementation time: 1.5848e-5 s
[ Info: Old implementation time: 1.5283e-5 s
```

Co-authored-by: Dennis Yatunin <dyatun@gmail.com>
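To make the promotion rules described under "API improvements" concrete, here is a minimal sketch in plain Julia of how rows with different numbers of entries can be padded with zeros before subtraction, and how `LinearAlgebra.I` can be treated as a diagonal row. The names `ToyRow`, `diag_row`, `tridiag_row`, and `pad` are hypothetical stand-ins invented for this sketch; they are not the `MatrixFields` implementation.

```julia
using LinearAlgebra: I, UniformScaling

# Hypothetical stand-in for a band matrix row: `ld` is the index of the lowest
# diagonal, and `entries` holds one value per diagonal from `ld` upward.
struct ToyRow{N, T}
    ld::Int
    entries::NTuple{N, T}
end
diag_row(x) = ToyRow(0, (x,))
tridiag_row(a, b, c) = ToyRow(-1, (a, b, c))

# Pad a row with zeros so that it covers the diagonals ld:ud.
function pad(row::ToyRow{N, T}, ld::Int, ud::Int) where {N, T}
    vals = ntuple(ud - ld + 1) do k
        d = ld + k - 1      # diagonal index of slot k in the padded row
        i = d - row.ld + 1  # matching slot in the original row, if any
        1 <= i <= N ? row.entries[i] : zero(T)
    end
    return ToyRow(ld, vals)
end

# Subtraction promotes both rows to a common set of diagonals first.
function Base.:-(a::ToyRow, b::ToyRow)
    ld = min(a.ld, b.ld)
    ud = max(a.ld + length(a.entries) - 1, b.ld + length(b.entries) - 1)
    return ToyRow(ld, pad(a, ld, ud).entries .- pad(b, ld, ud).entries)
end
Base.:*(s::Number, a::ToyRow) = ToyRow(a.ld, s .* a.entries)

# A UniformScaling is treated as a diagonal row with entry λ.
Base.:-(a::ToyRow, J::UniformScaling) = a - diag_row(J.λ)

result = diag_row(1) - 2 * tridiag_row(3, 4, 5) - I / 6
```

Here `result` is a tridiagonal row whose entries are `-6`, `1 - 8 - 1/6`, and `-10`, stored as floating-point values because the `I / 6` term promotes the entry type.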
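The idea of storing several matrices in one field by using `Tuple` or `NamedTuple` entries can be sketched with a recursive multiply in plain Julia. The function `rmul_toy` and the property names in the example are hypothetical; this is only meant to show the recursion pattern, not the `RecursiveApply` implementation.

```julia
# Hypothetical recursive multiply: scalars are multiplied directly, and
# Tuples/NamedTuples are handled by recursing into each of their values.
rmul_toy(a::Number, b::Number) = a * b
rmul_toy(a::Number, b::Union{Tuple, NamedTuple}) = map(x -> rmul_toy(a, x), b)
rmul_toy(a::Union{Tuple, NamedTuple}, b::Number) = map(x -> rmul_toy(x, b), a)

# One "entry" that bundles the coefficients of several tracer matrices.
entry = (; ρq_tot = 1.5, ρq_liq = (2.0, 3.0))
scaled = rmul_toy(2, entry)
```

Because `map` preserves the shape of a `NamedTuple`, `scaled` has the same names and nesting as `entry`, with every leaf value doubled.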
1399: Add operator_matrix to MatrixFields, along with tests and docs r=dennisYatunin a=dennisYatunin

## Purpose

Second PR of #1230. Refactors what is currently in `src/operators/operator2stencil.jl`.

## Content

- Adds the function `operator_matrix(op)`, which will replace `Operator2Stencil(op)`. This function has a detailed docstring and throws descriptive errors for operators without well-defined operator matrices.
- Improves the usability of operator matrices. With this new interface, users will no longer need to specify intermediate fields to compute operator matrices. For example, with our pre-existing code, the matrix of the interpolation operator (a bidiagonal matrix whose nonzero entries are all `1/2`) needs to be computed with `@. matrix_field = interp_matrix(ones_field)`, where `ones_field` is a field filled with the number `1` that is used to infer the space and entry type of the matrix. Now, this matrix can just be computed with `@. matrix_field = interp_matrix()`. In order to make this work, `interp_matrix` is now defined as a "lazy operator". When a broadcast expression containing lazy operators is evaluated, each lazy operator is replaced with an actual operator, and it is given one or more fields as input arguments. In this case, `interp_matrix` is given the local geometry field as an input argument, and this field is used to infer the space and entry type of the operator matrix.
- This usability improvement slightly changes the computation of derivative matrices. With our pre-existing code, `@. op(func(field))` is equivalent to `@. op_matrix(ones_field) ⋅ func(field)`, and the derivative of this expression with respect to `field` can be specified as `@. op_matrix(func'(field))`, where `func'` is the derivative of the point-wise function `func`. With this new interface, `@. op(func(field))` is equivalent to `@. op_matrix() ⋅ func(field)`, and the derivative can be specified as `@. op_matrix() ⋅ DiagonalMatrixRow(func'(field))`. Similarly, the derivative of `@. op2(func2(op1(func1(field))))` with respect to `field` is `op2_matrix(func2'(op1(func1(field)))) ⋅ op1_matrix(func1'(field))` with our pre-existing code and `op2_matrix() ⋅ DiagonalMatrixRow(func2'(op1(func1(field)))) ⋅ op1_matrix() ⋅ DiagonalMatrixRow(func1'(field))` with the new interface. Although the new interface leads to longer derivative expressions, those expressions are more similar to how the chain rule is usually written out, and they can be debugged/analyzed more incrementally.
- Adds support for computing operator matrices of multi-argument operators. For example, if `op` is the `Upwind3rdOrderBiasedProductC2F` operator, then `@. op(velocity_field, tracer_field)` is equivalent to `@. op_matrix(velocity_field) ⋅ tracer_field`. The implementation is similar to that of single-argument operators, except that it does not require the use of "lazy operators" (since there is already a field being passed to the operator matrix, the local geometry field can be obtained from that field during the evaluation of `Base.Broadcast.broadcasted`).
- Adds support for computing operator matrices of operators with `Extrapolate` boundary conditions. These boundary conditions cause the matrices to have larger bandwidths than other boundary conditions.
- Tests the `operator_matrix` function with every valid combination of finite difference operators and boundary conditions. The tests check for correctness, type stability, and lack of allocations. The tests are run on both CPUs and GPUs. In addition, the tests print out how the performance of `@. op_matrix() ⋅ field` compares to the performance of `@. op(field)`; the two expressions are similarly fast on GPUs (between `-70%` and `+40%` relative change in speed), though the operator matrix expressions tend to be slower on CPUs.
- Tests a few more complicated broadcast expressions involving products and linear combinations of operator matrices. These tests indicate that operator matrices are similarly performant to regular matrix fields.
- Modifies `test_field_broadcast` so that it also tests whether `get_result` generates the same result as `set_result!`.
- Modifies the `*` method for `BandMatrixRow` so that matrix fields can be scaled by vectors/covectors in addition to numbers. This simplifies a few of the complicated broadcast tests.
- Fixes a typo (`Geometery`) in the method of `stencil_left_boundary` for `GradientF2C`.
- Adds a missing method to `rpromote_type` that was preventing empty matrix rows from being constructed.
- Modifies the `Base.Broadcast.broadcasted` method for `FiniteDifferenceOperator` and `SpectralElementOperator` so that lazy operators can work correctly (the original versions of these methods would always overwrite the `LazyOperatorStyle` with `StencilStyle` or `SpectralStyle`, respectively).
- Unfortunately, this PR also adds 2 method invalidations. These are due to a new definition of `broadcasted` for lazy operators:

```
Base.Broadcast.broadcasted(::LazyOperatorStyle, f::F, args...) where {F} =
    LazyOperatorBroadcasted(f, args)
```

  As explained in an accompanying code comment, removing this method requires modifying several other method definitions, and one of these modifications adds 11 invalidations. So, there doesn't seem to be a good way to avoid these 2 invalidations.

Co-authored-by: Dennis Yatunin <dyatun@gmail.com>
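The chain-rule form used by the new interface (an operator matrix times a diagonal matrix of point-wise derivatives) can be checked numerically with ordinary dense arrays. The sketch below assumes a dense averaging matrix as a stand-in for an operator matrix and a cubic point-wise function; `interp_matrix_dense`, `func`, `dfunc`, and `finite_difference_jacobian` are names invented for this illustration and are not part of ClimaCore.

```julia
using LinearAlgebra: Diagonal

# Dense stand-in for a bidiagonal operator matrix that averages neighboring values.
function interp_matrix_dense(n)
    M = zeros(n - 1, n)
    for i in 1:(n - 1)
        M[i, i] = 0.5
        M[i, i + 1] = 0.5
    end
    return M
end

func(x) = x^3        # point-wise function
dfunc(x) = 3 * x^2   # its derivative

n = 5
field = collect(1.0:n)
op = interp_matrix_dense(n)
tendency(f) = op * func.(f)

# Chain rule: d(op * func.(field)) / d(field) = op * Diagonal(func'(field)).
jacobian = op * Diagonal(dfunc.(field))

# Central finite differences, used only to check the analytic Jacobian.
function finite_difference_jacobian(g, x; ϵ = 1e-6)
    J = zeros(length(g(x)), length(x))
    for j in eachindex(x)
        δ = zeros(length(x))
        δ[j] = ϵ
        J[:, j] = (g(x .+ δ) .- g(x .- δ)) ./ (2ϵ)
    end
    return J
end

max_error = maximum(abs.(finite_difference_jacobian(tendency, field) .- jacobian))
```

Each factor in the product can be inspected on its own, which is the incremental debugging benefit mentioned in the PR description.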
1436: Add FieldMatrix and linear solvers r=dennisYatunin a=dennisYatunin

## Purpose

Third PR of #1230. Adds an interface for specifying block matrices of matrix fields and solving linear systems with these matrices. This will replace what is currently in `ClimaAtmos.jl/src/prognostic_equations/implicit/schur_complement_W.jl`, generalizing it for implicit diffusion and implicit EDMF.

## Content

### Main Changes

- Add the `FieldName` struct, which is a singleton type that represents a chain of `getproperty` calls.
  - Add the `@name` macro for constructing `FieldName`s, which checks whether its input expression is a syntactically valid chain of `getproperty` calls before calling the default constructor.
  - A `name` can be used to access a property or sub-property of an object `x` by calling `get_field(x, name)`.
  - An `internal_name` can be appended onto another `name` in order to access a property or sub-property of `get_field(x, name)`.
- Add the `FieldNameDict` struct, which maps each key in a set of `FieldVectorKeys` or `FieldMatrixKeys` (see below) to a `Field` or some other object.
  - There are currently four subtypes of `FieldNameDict`:
    - `FieldMatrix` (the only user-facing subtype), which maps `NTuple{2, FieldName}`s to `ColumnwiseBandMatrixField`s or multiples of `LinearAlgebra.I`
    - `FieldVectorView`, which maps `FieldName`s to `Field`s; this is used to wrap a `FieldVector` so that it can be used in conjunction with a `FieldMatrix`
    - `FieldVectorViewBroadcasted` and `FieldMatrixBroadcasted`, each of which can store unevaluated `Base.AbstractBroadcasted` objects, in addition to what `FieldVectorView` and `FieldMatrix` can already store
  - Supports standard `AbstractDict` functions like `keys` and `pairs`.
  - An individual block of a `FieldNameDict` can be accessed by calling `dict[key]`, and a range of blocks can be accessed by calling `dict[set]`, where `set` is a `FieldNameSet`.
  - Given a `FieldMatrix` `A`, a similar matrix that only contains identity matrix blocks can be constructed with `one(A)`.
  - `FieldNameDict`s can be used in broadcast expressions, which support the following operations:
    - `+`, `-`, or `*`, where each input is either a `FieldNameDict` or a `FieldVector`
    - `inv`, where the input is a diagonal `FieldMatrix`
  - The new methods for `Base.Broadcast.broadcasted` construct chains of `Field` broadcast expressions from `FieldNameDict` broadcast expressions on the fly, somewhat similarly to how broadcasting works for ClimaCore operators.
- Add the `FieldMatrixSolver` struct, which solves an equation of the form `A * x = b`, where `A` is a `FieldMatrix` and where `x` and `b` are `FieldVector`s.
- Add the `field_matrix_solve!` function, which works just like `ldiv!(x, A, b)`, except that it also takes a `FieldMatrixSolver` as its first argument.
- Add four `FieldMatrixSolverAlgorithm`s, which can be nested inside of each other to build up more specialized algorithms:
  - `BlockDiagonalSolve`, which runs a "single field solver" for each block of the block diagonal matrix `A`; the single field solver can handle four types of blocks:
    - Multiples of `LinearAlgebra.I`
    - Diagonal `ColumnwiseBandMatrixField`s
    - Tri-diagonal `ColumnwiseBandMatrixField`s (implementation of the Thomas algorithm)
    - Penta-diagonal `ColumnwiseBandMatrixField`s (implementation of the PTRANS-I algorithm)
  - `BlockLowerTriangularSolve`, which uses forward substitution to solve the equation for a block lower triangular matrix `A`
  - `SchurComplementSolve`, which generalizes what is currently in ClimaAtmos's `schur_complement_W.jl` file to any block matrix `A` with a diagonal block in the top-left corner
  - `ApproximateFactorizationSolve`, which lets us use "operator splitting" to approximately solve the equation for a diagonally dominant block matrix `A`
- Add documentation for how to specify a `FieldMatrix` and use it in a linear solver, along with internal documentation for the new `FieldName`-based infrastructure.
- Add unit tests for correctness, type stability, and allocations, and run them on both CPUs and GPUs through CI.
  - Test each single field solver on both a cell-center and a cell-face field.
  - Test each `FieldMatrixSolverAlgorithm` on block diagonal, block lower triangular, and block dense matrices.
  - Test solvers with identical structures to what we will use in ClimaAtmos for the following examples:
    - Dry dycore with implicit acoustic waves
    - Dry dycore with implicit acoustic waves and diffusion
    - Dry dycore + prognostic EDMF with implicit acoustic waves and SGS fluxes
    - Moist dycore + prognostic EDMF + tracers with implicit acoustic waves and SGS fluxes

### Internal Changes

- Add a collection of "unrolled functions", whose return values can be inferred during compilation if their input values are all singleton types.
  - These are all implemented as combinations of `unrolled_zip`, `unrolled_map`, and `unrolled_foldl`.
  - Several of these need to have their recursion limits disabled for the unit tests to be type stable.
- Add the `FieldNameTree` struct, which stores every `FieldName` that can be used to access `x` with `get_field(x, name)`.
  - A `name` can be checked for validity by calling `has_subtree_at_name(tree, name)`.
  - The children of `name` (the `FieldName`s that can be used to access the properties of `get_field(x, name)`) can be obtained by calling `child_names(name, tree)`.
- Add the `FieldNameSet` struct, which stores a set of `FieldVectorKeys` (each of which is a `FieldName`) or a set of `FieldMatrixKeys` (each of which is an `NTuple{2, FieldName}`).
  - Roughly equivalent to the built-in `KeySet` for `AbstractDict`s, but specialized for `FieldNameDict`s.
  - Supports standard `AbstractSet` functions like `union` and `setdiff`, as well as custom functions like `set_complement` and `matrix_product_keys`.
  - Handles overlaps between `FieldName`s (that is, situations where one property of `x` lies inside another property of `x`) by storing a `FieldNameTree` that contains all available `FieldName`s.
- Disable the recursion limits for several functions used to manipulate `FieldName`s, `FieldNameTree`s, and `FieldNameSet`s, as this is necessary in order for the unit tests to be type stable.
- Remove the methods for `RecursiveApply.rmul` that specialize on `Number`, which is also necessary in order for the unit tests to be type stable.
  - These methods are no longer required, now that #1454 has been merged in.
- Add support for calling `inv` on `BandMatrixRow`s.
- Qualify the use of `CUDA.@allowscalar`, per Charlie's suggestion.
- Fix some type instabilities in `matrix_field_test_utils.jl`.
- Remove an unused variable name in `operator_matrices.jl`.

Co-authored-by: Dennis Yatunin <dyatun@gmail.com>
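
To make the `FieldName` idea above more concrete (a singleton type that stands for a chain of `getproperty` calls, with one name appendable onto another), here is a minimal plain-Julia sketch. It does not use ClimaCore: `NameChain`, `get_property_chain`, and `append_chain` are hypothetical names invented for illustration, and the real `FieldName`, `@name`, and `get_field` may be implemented quite differently.

```julia
# Hypothetical stand-in for the FieldName idea; not ClimaCore's actual API.
# The chain of property names lives in the type parameter, so every
# instance is a singleton with no runtime data.
struct NameChain{chain} end

# Build a chain from symbols, e.g. NameChain(:c, :uh) stands for `x.c.uh`.
NameChain(symbols::Symbol...) = NameChain{symbols}()

# Apply the chain of `getproperty` calls to `x`, from left to right.
get_property_chain(x, ::NameChain{chain}) where {chain} =
    foldl(getproperty, chain; init = x)

# Append an internal chain onto an outer chain.
append_chain(::NameChain{outer}, ::NameChain{inner}) where {outer, inner} =
    NameChain{(outer..., inner...)}()

# A nested NamedTuple standing in for a state vector with sub-properties.
state = (; c = (; rho = 1.0, uh = (; u = 2.0, v = 3.0)), f = (; w = 4.0))

uh_name = NameChain(:c, :uh)                   # stands for state.c.uh
v_name = append_chain(uh_name, NameChain(:v))  # stands for state.c.uh.v

uh_value = get_property_chain(state, uh_name)
v_value = get_property_chain(state, v_name)
```

Because the chain is encoded in the type, two names with the same chain are the same singleton, which is what lets such names serve as dictionary keys that the compiler can reason about.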
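
The description above says that tri-diagonal blocks are solved with the Thomas algorithm. As a rough illustration of that algorithm on its own, here is a sketch for a single column stored in plain vectors. The function name `thomas_solve` and the three-vector storage are assumptions made for this example; the ClimaCore solver operates on `ColumnwiseBandMatrixField`s and is not reproduced here.

```julia
# Hypothetical helper, not ClimaCore code: solve a tridiagonal system with the
# Thomas algorithm. `lower` and `upper` have length n - 1, `main_diag` and `b`
# have length n. No pivoting is done, so the matrix is assumed to be
# well-conditioned (for example, diagonally dominant).
function thomas_solve(lower, main_diag, upper, b)
    n = length(main_diag)
    n == 1 && return [b[1] / main_diag[1]]
    c = zeros(n - 1)  # modified upper diagonal
    d = zeros(n)      # modified right-hand side
    c[1] = upper[1] / main_diag[1]
    d[1] = b[1] / main_diag[1]
    # Forward sweep: eliminate the lower diagonal row by row.
    for i in 2:n
        denom = main_diag[i] - lower[i - 1] * c[i - 1]
        if i < n
            c[i] = upper[i] / denom
        end
        d[i] = (b[i] - lower[i - 1] * d[i - 1]) / denom
    end
    # Back substitution: recover x from the last row upward.
    x = zeros(n)
    x[n] = d[n]
    for i in (n - 1):-1:1
        x[i] = d[i] - c[i] * x[i + 1]
    end
    return x
end

# A 3x3 system whose exact solution is [1, 2, 3].
lower = [1.0, 1.0]
main_diag = [4.0, 4.0, 4.0]
upper = [1.0, 1.0]
b = [6.0, 12.0, 14.0]
x = thomas_solve(lower, main_diag, upper, b)
```

The forward sweep and the back substitution each touch every row once, so the cost is linear in the number of rows in the column.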
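
The forward substitution mentioned above for block lower triangular matrices can be sketched with dense blocks as follows. This is only an illustration under stated assumptions: `block_forward_substitution` is a hypothetical name, the blocks are ordinary `Matrix{Float64}` values rather than matrix fields, and each diagonal block is solved with Julia's dense `\` instead of a nested solver algorithm.

```julia
using LinearAlgebra

# Hypothetical helper, not ClimaCore code. `blocks[i][j]` holds block (i, j)
# of a block lower triangular matrix for j <= i, and `bs[i]` holds the i-th
# block of the right-hand side.
function block_forward_substitution(blocks, bs)
    xs = Vector{Vector{Float64}}()
    for i in eachindex(bs)
        rhs = bs[i]
        # Subtract the contributions of the blocks that are already solved.
        for j in 1:(i - 1)
            rhs = rhs - blocks[i][j] * xs[j]
        end
        # Solve with the diagonal block for the i-th block of the solution.
        push!(xs, blocks[i][i] \ rhs)
    end
    return xs
end

A11 = [2.0 0.0; 0.0 3.0]
A21 = [1.0 1.0; 0.0 1.0]
A22 = [4.0 1.0; 1.0 4.0]
blocks = [[A11], [A21, A22]]

# Build a right-hand side from a known solution so the result can be checked.
x_true = [[1.0, 2.0], [3.0, 4.0]]
bs = [A11 * x_true[1], A21 * x_true[1] + A22 * x_true[2]]
xs = block_forward_substitution(blocks, bs)
```

Only the diagonal blocks are ever inverted, which is why this structure combines naturally with a per-block solver for each diagonal block.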
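
For the Schur complement approach described above (a block matrix `A` with a diagonal block in the top-left corner), the underlying linear algebra for a 2x2 block system can be sketched as below. The helper `schur_complement_solve` and the dense blocks are assumptions made for this example; it shows the general technique, not the `SchurComplementSolve` implementation or the contents of `schur_complement_W.jl`.

```julia
using LinearAlgebra

# Hypothetical helper, not ClimaCore code. Solves
#   [A11 A12; A21 A22] * [x1; x2] = [b1; b2]
# when A11 is diagonal, so that inverting it is cheap.
function schur_complement_solve(A11::Diagonal, A12, A21, A22, b1, b2)
    A11_inv = inv(A11)
    # Schur complement of A11 in the full block matrix.
    S = A22 - A21 * A11_inv * A12
    # Reduced system for x2, followed by back substitution for x1.
    x2 = S \ (b2 - A21 * (A11_inv * b1))
    x1 = A11_inv * (b1 - A12 * x2)
    return x1, x2
end

A11 = Diagonal([2.0, 4.0])
A12 = [1.0 0.0; 0.0 1.0]
A21 = [1.0 2.0; 0.0 1.0]
A22 = [5.0 1.0; 1.0 6.0]

# Build a right-hand side from a known solution so the result can be checked.
x1_true = [1.0, 2.0]
x2_true = [3.0, 4.0]
b1 = A11 * x1_true + A12 * x2_true
b2 = A21 * x1_true + A22 * x2_true

x1, x2 = schur_complement_solve(A11, A12, A21, A22, b1, b2)

# Reference solution from the assembled dense matrix.
A_dense = [Matrix(A11) A12; A21 A22]
x_dense = A_dense \ [b1; b2]
```

The only non-trivial solve is the one with `S`, which has the size of the bottom-right block; everything involving `A11` reduces to elementwise scaling.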
1436: Add FieldMatrix and linear solvers r=dennisYatunin a=dennisYatunin

## Purpose

Third PR of #1230. Adds an interface for specifying block matrices of matrix fields and solving linear systems with these matrices. This will replace what is currently in `ClimaAtmos.jl/src/prognostic_equations/implicit/schur_complement_W.jl`, generalizing it for implicit diffusion and implicit EDMF.

## Content

### Main Changes

- Add the `FieldName` struct, which is a singleton type that represents a chain of `getproperty` calls.
  - Add the `@name` macro for constructing `FieldName`s, which checks whether its input expression is a syntactically valid chain of `getproperty` calls before calling the default constructor.
  - A `name` can be used to access a property or sub-property of an object `x` by calling `get_field(x, name)`.
  - An `internal_name` can be appended onto another `name` in order to access a property or sub-property of `get_field(x, name)`.
- Add the `FieldNameDict` struct, which maps each key in a set of `FieldVectorKeys` or `FieldMatrixKeys` (see below) to a `Field` or some other object.
  - There are currently four subtypes of `FieldNameDict`:
    - `FieldMatrix` (the only user-facing subtype), which maps `NTuple{2, FieldName}`s to `ColumnwiseBandMatrixField`s or multiples of `LinearAlgebra.I`
    - `FieldVectorView`, which maps `FieldName`s to `Field`s; this is used to wrap a `FieldVector` so that it can be used in conjunction with a `FieldMatrix`
    - `FieldVectorViewBroadcasted` and `FieldMatrixBroadcasted`, each of which can store unevaluated `Base.AbstractBroadcasted` objects, in addition to what `FieldVectorView` and `FieldMatrix` can already store
  - Supports standard `AbstractDict` functions like `keys` and `pairs`.
  - An individual block of a `FieldNameDict` can be accessed by calling `dict[key]`, and a range of blocks can be accessed by calling `dict[set]`, where `set` is a `FieldNameSet`.
  - Given a `FieldMatrix` `A`, a similar matrix that only contains identity matrix blocks can be constructed with `one(A)`.
  - `FieldNameDict`s can be used in broadcast expressions, which support the following operations:
    - `+`, `-`, or `*`, where each input is either a `FieldNameDict` or a `FieldVector`
    - `inv`, where the input is a diagonal `FieldMatrix`
  - The new methods for `Base.Broadcast.broadcasted` construct chains of `Field` broadcast expressions from `FieldNameDict` broadcast expressions on the fly, somewhat similarly to how broadcasting works for ClimaCore operators.
- Add the `FieldMatrixSolver` struct, which solves an equation of the form `A * x = b`, where `A` is a `FieldMatrix` and where `x` and `b` are `FieldVector`s.
  - Add the `field_matrix_solve!` function, which works just like `ldiv!(x, A, b)`, except that it also takes a `FieldMatrixSolver` as its first argument.
  - Add four `FieldMatrixSolverAlgorithm`s, which can be nested inside of each other to build up more specialized algorithms:
    - `BlockDiagonalSolve`, which runs a "single field solver" for each block of the block diagonal matrix `A`; the single field solver can handle the four types of blocks:
      - Multiples of `LinearAlgebra.I`
      - Diagonal `ColumnwiseBandMatrixField`s
      - Tri-diagonal `ColumnwiseBandMatrixField`s (implementation of the Thomas algorithm)
      - Penta-diagonal `ColumnwiseBandMatrixField`s (implementation of the PTRANS-I algorithm)
    - `BlockLowerTriangularSolve`, which uses forward substitution to solve the equation for a block lower triangular matrix `A`
    - `SchurComplementSolve`, which generalizes what is currently in ClimaAtmos's `schur_complement_W.jl` file to any block matrix `A` with a diagonal block in the top-left corner
    - `ApproximateFactorizationSolve`, which lets us use "operator splitting" to approximately solve the equation for a diagonally dominant block matrix `A`
- Add documentation for how to specify a `FieldMatrix` and use it in a linear solver, along with internal documentation for the new `FieldName`-based infrastructure.
- Add unit tests for correctness, type stability, and allocations, and run them on both CPUs and GPUs through CI.
  - Test each single field solver on both a cell-center and a cell-face field.
  - Test each `FieldMatrixSolverAlgorithm` on block diagonal, block lower triangular, and block dense matrices.
  - Test solvers with identical structures to what we will use in ClimaAtmos for the following examples:
    - Dry dycore with implicit acoustic waves
    - Dry dycore with implicit acoustic waves and diffusion
    - Dry dycore + prognostic EDMF with implicit acoustic waves and SGS fluxes
    - Moist dycore + prognostic EDMF + tracers with implicit acoustic waves and SGS fluxes

### Internal Changes

- Add a collection of "unrolled functions", whose return values can be inferred during compilation if their input values are all singleton types.
  - These are all implemented as combinations of `unrolled_zip`, `unrolled_map`, and `unrolled_foldl`.
  - Several of these need to have their recursion limits disabled for the unit tests to be type stable.
- Add the `FieldNameTree` struct, which stores every `FieldName` that can be used to access `x` with `get_field(x, name)`.
  - A `name` can be checked for validity by calling `has_subtree_at_name(tree, name)`.
  - The children of `name` (the `FieldName`s that can be used to access the properties of `get_field(x, name)`) can be obtained by calling `child_names(name, tree)`.
- Add the `FieldNameSet` struct, which stores a set of `FieldVectorKeys` (each of which is a `FieldName`) or a set of `FieldMatrixKeys` (each of which is an `NTuple{2, FieldName}`).
  - Roughly equivalent to the built-in `KeySet` for `AbstractDict`s, but specialized for `FieldNameDict`s.
  - Supports standard `AbstractSet` functions like `union` and `setdiff`, as well as custom functions like `set_complement` and `matrix_product_keys`.
  - Handles overlaps between `FieldName`s (that is, situations where one property of `x` lies inside another property of `x`) by storing a `FieldNameTree` that contains all available `FieldName`s.
- Disable the recursion limits for several functions used to manipulate `FieldName`s, `FieldNameTree`s, and `FieldNameSet`s, as this is necessary in order for the unit tests to be type stable.
- Remove the methods for `RecursiveApply.rmul` that specialize on `Number`, which is also necessary in order for the unit tests to be type stable.
  - These methods are no longer required, now that #1454 has been merged in.
- Add support for calling `inv` on `BandMatrixRow`s.
- Qualify the use of `CUDA.@allowscalar`, per Charlie's suggestion.
- Fix some type instabilities in `matrix_field_test_utils.jl`.
- Remove an unused variable name in `operator_matrices.jl`.

Co-authored-by: Dennis Yatunin <dyatun@gmail.com>
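The commit message above describes the tridiagonal single field solver as an implementation of the Thomas algorithm. For readers unfamiliar with it, here is a minimal sketch of that algorithm for a single column. It operates on plain Julia vectors rather than ClimaCore fields, and `thomas_solve` and its argument names are hypothetical, not the ClimaCore function.

```julia
using LinearAlgebra

# Solve a tridiagonal system A * x = b for one column (assumes n >= 2).
# `sub`, `main`, and `sup` hold the sub-, main, and super-diagonals of A,
# with lengths n - 1, n, and n - 1. No pivoting is performed, so this sketch
# assumes that A is well-behaved (e.g., diagonally dominant).
function thomas_solve(sub, main, sup, b)
    n = length(main)
    c = zeros(n - 1)   # modified super-diagonal
    d = zeros(n)       # modified right-hand side
    c[1] = sup[1] / main[1]
    d[1] = b[1] / main[1]
    for i in 2:n       # forward elimination
        denom = main[i] - sub[i - 1] * c[i - 1]
        i < n && (c[i] = sup[i] / denom)
        d[i] = (b[i] - sub[i - 1] * d[i - 1]) / denom
    end
    x = similar(d)
    x[n] = d[n]
    for i in (n - 1):-1:1   # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    end
    return x
end

sub, main, sup = [1.0, 1.0, 1.0], [4.0, 4.0, 4.0, 4.0], [1.0, 1.0, 1.0]
b = [1.0, 2.0, 3.0, 4.0]
x = thomas_solve(sub, main, sup, b)
@assert Tridiagonal(sub, main, sup) * x ≈ b
```

The cost is linear in the number of cells, which is why a per-column band solve is attractive compared to a dense solve.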
Purpose
The interface we use to specify matrices and solve linear systems of equations for the implicit solver needs to be refactored and extended. This interface, which is currently spread across several files in ClimaCore and ClimaAtmos (`ClimaCore.jl/src/Operators/stencilcoefs.jl`, `ClimaCore.jl/src/Operators/pointwisestencil.jl`, `ClimaCore.jl/src/Operators/operator2stencil.jl`, `ClimaAtmos.jl/src/tendencies/implicit/wfact.jl`, and `ClimaAtmos.jl/src/tendencies/implicit/schur_complement_W.jl`), is unnecessarily convoluted and involves a large number of hardcoded assumptions, which makes it extremely challenging for new users to experiment with the implicit solver and to add new implicit tendencies. In particular, the next stage of EDMF development will involve adding many implicit tendencies to the atmosphere model, and our approximation of the total implicit tendency's Jacobian matrix will end up with a highly non-trivial sparsity pattern. In order for more than a single developer to be able to understand and implement the implicit solver for the dycore+EDMF, we will first need to make the changes outlined in this SDI. In addition, the land modeling team has been experimenting with their own implicit solver, and these changes will speed up their development and add the new functionality they require.
There is currently a draft PR that contains a detailed sketch of all the proposed changes: #1190
Cost/benefits/risks
The cost/risk is development time. The benefit will be a significantly reduced complexity of implicit solver implementations, both in ClimaAtmos and in ClimaLSM. There will be a simple, well-documented interface to all of the numerical algorithms required by implicit solvers, which will allow user-facing code to be relatively short and easily extensible.
Producers
@dennisYatunin
Components
Inputs
BandMatrixRow and MultiplyColumnwiseBandMatrixField
The type we currently use to represent an element of a band matrix field is called `StencilCoefs`, and it is extremely confusing and poorly designed. Upon refactoring, this will become `BandMatrixRow`, which will have the following improvements:
- Simple constructors, e.g., `DiagonalMatrixRow(1)` and `TridiagonalMatrixRow(1, 2, 3)`
- Arithmetic that combines `BandMatrixRow` and `LinearAlgebra.UniformScaling`, so that different types of matrices can be mixed together; e.g., `LinearAlgebra.I / 2 - TridiagonalMatrixRow(1, 2, 3) + 2 * PentadiagonalMatrixRow(1, 2, 3, 4, 5) == PentadiagonalMatrixRow(2, 3, 4.5, 5, 10)`
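To see why the mixed expression above evaluates to the stated pentadiagonal row, the arithmetic can be checked with a toy stand-in. This is only a sketch: it represents a row as a dictionary from diagonal offset to entry, and `toy_row`, `scale`, and `add` are hypothetical helpers rather than the proposed `BandMatrixRow` API.

```julia
# Toy stand-in for a band matrix row: maps each diagonal offset to its entry.
# An odd number of entries is assumed, centered on the main diagonal (offset 0).
function toy_row(entries...)
    h = length(entries) ÷ 2   # half-bandwidth: 0, 1, or 2 for 1, 3, or 5 entries
    return Dict(d => float(x) for (d, x) in zip(-h:h, entries))
end

# Multiply every entry of a row by a scalar.
scale(c, row) = Dict(d => c * x for (d, x) in row)

# Add rows of possibly different bandwidths; missing offsets count as zero.
add(rows...) = mergewith(+, rows...)

identity_row = toy_row(1)   # plays the role of LinearAlgebra.I

result = add(
    scale(0.5, identity_row),              # I / 2
    scale(-1, toy_row(1, 2, 3)),           # - TridiagonalMatrixRow(1, 2, 3)
    scale(2, toy_row(1, 2, 3, 4, 5)),      # + 2 * PentadiagonalMatrixRow(1, 2, 3, 4, 5)
)
@assert result == toy_row(2, 3, 4.5, 5, 10)
```

The narrower rows are effectively padded with zeros on their outer diagonals, which is the promotion behavior the second improvement describes.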
These improvements will also be reflected in fields of `BandMatrixRow`s, which will be aliased as `ColumnwiseBandMatrixField`s for dispatch. (The alias name is meant to indicate that every column has its own set of `BandMatrixRow`s, which, when taken together, can be interpreted as a band matrix.) So, for example, users will be able to write `(@. LinearAlgebra.I / 2 - tridiagonal_matrix_field + 2 * pentadiagonal_matrix_field) == (@. PentadiagonalMatrixRow(field1, field2, field3, field4, field5))`.
The operators we currently use for matrix-matrix and matrix-vector multiplication are `Operators.ComposeStencils()` and `Operators.ApplyStencil()`, respectively. Again, these are confusing and implemented in a rather roundabout way. Upon refactoring, these will both become `⋅`, which is an alias for `Operators.MultiplyColumnwiseBandMatrixField()`. This will allow users to write something like `@. matrix_field1 ⋅ matrix_field2 ⋅ field`, instead of needing to write `@. apply(compose(matrix_field1, matrix_field2), field)`. In addition, the amount of code used to implement matrix multiplication can be reduced roughly by a factor of 3 (as shown in the sketch), and this simplified code will be easier to update for GPUs in the near future.
FiniteDifferenceOperatorTermsMatrix
When a finite difference operator is applied to a field (`@. op(field)`), the result is equivalent to multiplying some matrix by that field (`@. matrix_field ⋅ field`). The operator we currently use to generate this matrix is `Operators.Operator2Stencil(op)`; in order to clarify what this operator is doing, it will be renamed to `Operators.FiniteDifferenceOperatorTermsMatrix(op)`. If `op_matrix = Operators.FiniteDifferenceOperatorTermsMatrix(op)` and `ones_field = ones(axes(field))`, users will be able to confirm that `(@. op(field)) == (@. op_matrix(ones_field) ⋅ field)`. As a quirk of our implementation, it is also the case that `(@. op_matrix(ones_field) ⋅ field) == (@. op_matrix(field) ⋅ ones_field)`, which allows us to somewhat simplify expressions involving products with operator matrices.
Aside from the name change, there are two new features that we need to add to `FiniteDifferenceOperatorTermsMatrix`. First, for EDMF development, we need to add support for multi-argument operators, so that `FiniteDifferenceOperatorTermsMatrix(op)` will always generate the matrix that corresponds to the last argument of `op`. For example, given a two-argument operator `op` (such as `WeightedInterpolateF2C` or `Upwind3rdOrderBiasedProductC2F`), users will be able to define `op_matrix` and confirm that `(@. op(field1, field2)) == (@. op_matrix(field1, ones_field2) ⋅ field2) == (@. op_matrix(field1, field2) ⋅ ones_field2)`.
Second, for land model development, we need to add support for specifying the corner elements of matrices by adding special boundary conditions for `FiniteDifferenceOperatorTermsMatrix`. The simple expression presented earlier, `(@. op(field)) == (@. op_matrix(ones_field) ⋅ field)`, is only true when `op` has trivial boundary conditions; i.e., when `op` is a center-to-face operator with boundary conditions that cause it to return 0 on the top and bottom faces, or when `op` is a face-to-center operator without any boundary conditions (which means that it uses the values at the top and bottom faces as-is). In general, operators are affine transformations at the boundaries, not linear transformations. This means that, for every `op` and `field`, there is some `boundary_field` that is zero everywhere except at the boundaries, and `(@. op(field)) == (@. op_matrix(ones_field) ⋅ field + boundary_field)`. However, we only use matrices to represent derivatives of operators with respect to their inputs, and, up until now, it has always been the case that `boundary_field` is a constant that does not depend on `field`. More specifically, if `boundary_matrix_field` represents `∂(boundary_field)/∂(field)`, then `∂(@. op(field))/∂(field)` can be expressed as `@. op_matrix(ones_field) + boundary_matrix_field`. So, if `boundary_field` is a constant, then `boundary_matrix_field` is the zero matrix and can be ignored in our computations. This amounts to assuming that the corner elements of our matrices are always zero. Due to new requirements from the land model, we will no longer be able to make this assumption, so we will need to add support for the `Operators.SetValue` boundary condition for `FiniteDifferenceOperatorTermsMatrix`, which will allow users to specify nonzero elements from `boundary_matrix_field` that should be added to `@. op_matrix(ones_field)`.
Additionally, it would be good to refactor how finite difference operators are implemented so that the code for every operator does not need to be duplicated in order to implement the `FiniteDifferenceOperatorTermsMatrix` for that operator. Currently, every operator implements a method for `stencil_interior`, `stencil_left_boundary`, and `stencil_right_boundary`, and each of these methods returns some value (the value of the operator's result at a particular point in space). The `FiniteDifferenceOperatorTermsMatrix` for every operator also implements the same three methods, but each of its methods returns a `BandMatrixRow` (or, rather, a "`StencilCoefs`") whose entries add up to the value returned by the operator's corresponding method. There are currently almost 700 lines of code required to implement these duplicated methods, and more will need to be added in order to support multi-argument operators. It should be fairly straightforward to refactor things so that:
- The methods for `stencil_interior`, `stencil_left_boundary`, and `stencil_right_boundary` that get implemented for every operator return a `BandMatrixRow`.
- When the operator is wrapped in a `FiniteDifferenceOperatorTermsMatrix`, this `BandMatrixRow` gets returned as-is. If matrix boundary conditions are specified, the first or last value of this `BandMatrixRow` may need to be modified.
- When the operator is not wrapped in a `FiniteDifferenceOperatorTermsMatrix`, the entries of this `BandMatrixRow` get added together before being returned.
However, this last change would only improve things "under the hood", without any immediate benefit to users. In order to avoid unnecessarily delaying EDMF and land model development, this change will be the last step of this SDI.
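The affine relationship `op(field) == op_matrix ⋅ field + boundary_field` discussed in this section, and the way a field-dependent boundary value produces a nonzero corner element, can be illustrated on a single column with plain arrays. This is a sketch under stated assumptions: `toy_op` is a hypothetical second-difference operator with ghost values, not a ClimaCore operator, and the matrices are ordinary `LinearAlgebra` matrices rather than `ColumnwiseBandMatrixField`s.

```julia
using LinearAlgebra

# Toy second-difference operator on a column, using ghost values `bottom`
# and `top` just outside of the column (hypothetical, for illustration only).
function toy_op(field, bottom, top)
    padded = [bottom; field; top]
    return [padded[i] - 2 * padded[i + 1] + padded[i + 2] for i in eachindex(field)]
end

field = [1.0, 4.0, 9.0, 16.0]
n = length(field)
op_matrix = Tridiagonal(ones(n - 1), fill(-2.0, n), ones(n - 1))

# Constant boundary values: the operator is affine, and boundary_field does
# not depend on field, so its derivative with respect to field is zero.
bottom, top = 0.5, 25.0
boundary_field = [bottom; zeros(n - 2); top]
@assert toy_op(field, bottom, top) ≈ op_matrix * field + boundary_field

# Field-dependent bottom value (bottom = α * field[1]): the derivative of
# boundary_field with respect to field now has one nonzero corner element.
α = 0.25
boundary_matrix = zeros(n, n)
boundary_matrix[1, 1] = α
jacobian = op_matrix + boundary_matrix
@assert toy_op(field, α * field[1], top) ≈ jacobian * field + [zeros(n - 1); top]
```

In the first case the Jacobian is just `op_matrix`; in the second, ignoring `boundary_matrix` would give the wrong derivative in the first row, which is the situation the land model requirement addresses.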
ColumnwiseBlockMatrix
The type we currently use to represent block matrices is the `SchurComplementW` object, which is hardcoded to only work for the dycore in `ClimaAtmos`. This can be generalized to a `ColumnwiseBlockMatrix`, which will be a simple dictionary that maps pairs of field names (one name for the row and another name for the column) to their corresponding matrix blocks, each of which can be a `ColumnwiseBandMatrixField` or a `LinearAlgebra.UniformScaling`. The `ColumnwiseBlockMatrix` will be used as follows:
The only challenge in implementing the `ColumnwiseBlockMatrix` is ensuring type stability, particularly when it gets used in a `BlockMatrixSystemSolver` (see the next section); without type stability, we will have long compilation times and unnecessary allocations. Fortunately, this has already been worked out and tested in the sketch. In particular, the `@block_name` macro will return a type-stable generalization of what we currently call a `property_chain` for both the row and column names, and the corresponding row and column fields will be accessed by using a type-stable generalization of `Fields.single_field`.
BlockMatrixSystemSolver
We are currently solving the linear system of equations specified by the `SchurComplementW` object by reducing it to a smaller tridiagonal system of equations, solving the reduced problem, and using the reduced problem's solution to compute the original problem's solution. However, this strategy will not work when we add the new implicit EDMF tendencies because the new sparsity pattern of the matrix will not allow us to specify the reduced problem without performing a computationally expensive dense matrix inversion. So, we will need to implement a new algorithm for solving sparse block matrix systems. Per @simonbyrne's advice, this algorithm will work as follows:
- Permute the `ColumnwiseBlockMatrix` so that, instead of blocks that correspond to pairs of variables, we end up with blocks that correspond to pairs of cells.
To illustrate how this permutation will work, consider a simplified example in which we permute the Jacobian of a `FieldVector` $Y$ with total implicit tendency $Y_t$ that is defined on a single column with two cells, with a field $c$ defined on cell centers and a field $f$ defined on cell faces. With two cells, $c$ has two values while $f$ has three (at the faces $0.5$, $1.5$, and $2.5$), so grouping the variables by cell leaves one face value unpaired.
As this example illustrates, the permutation requires us to drop all of the matrix elements that correspond to the top or bottom cell face from the linear solve, which we can do as long as the only nonzero matrix element for that cell face is the "identity element". In ClimaAtmos, we know that this will always be the case for the top cell face; in the example above, the "identity element" for this cell face is $\partial Y_t.f[2.5]/\partial Y.f[2.5]$. In our code, the value of $\partial Y_t.f[2.5]/\partial Y.f[2.5]$ (or, rather, the value of `-1 + Δtγ * ∂(Yₜ.f.w.components.data.:1[2.5])/∂(Y.f.w.components.data.:1[2.5])`, and the corresponding EDMF updraft velocity terms) will typically be $-1$, so, using the somewhat hand-wavy notation from above, we will typically have that $\partial Y_t.\square/\partial Y.f[2.5] = \partial Y_t.f[2.5]/\partial Y.\square = -\delta_{\square, f[2.5]}$. In ClimaLSM, there are currently no prognostic variables defined on cell faces, but, if there were, this would be the case for the bottom cell face, rather than the top cell face. In general, as long as we deal with the nonzero identity element separately, we can drop all of the matrix elements related to one cell face from the linear solve. There are also two other types of variables whose corresponding matrix elements we will be able to drop from the linear solve (because the only nonzero elements will be the identity elements): variables that do not have a nonzero implicit tendency, like `Y.c.uₕ` in ClimaAtmos, and variables that do not lie on cell centers or on cell faces, like those related to the river model in ClimaLSM.
After the matrix is permuted, the new linear system will be solved using a band matrix solver. Both the dycore and the land model will require a tridiagonal solver, and EDMF may also require a pentadiagonal solver. If $N$ denotes the number of cells and $V$ denotes the number of variables, then the band matrix solver will be applied to a matrix of $N \times N$ blocks, where each block is itself a $V \times V$ matrix. To simplify our initial implementation, we can treat each block as a dense matrix, which we will represent using a `StaticArrays.SMatrix`. Although this means that we will need to perform dense matrix inversions, these shouldn't be too expensive as long as $V$ is relatively small. If we were to instead modify the current "Schur complement solve" algorithm for EDMF, we would end up needing to perform a dense matrix inversion on a large block matrix, where each block is one of the original $N \times N$ blocks that represent pairs of variables. Since it will generally be the case that $N > V$, this means that the new "permuted band matrix solve" algorithm will be significantly more performant for EDMF than the current algorithm.
Unfortunately, it will not be possible to eliminate dense matrix inversions from the new algorithm altogether by specializing on the sparsity structure of the $V \times V$ blocks. Even though each $V \times V$ block in ClimaAtmos will be an arrowhead matrix (both for the dycore and for EDMF), the band matrix solver will need to evaluate linear combinations of products of the blocks and their inverses, which will not have any particularly nice sparsity structure. Specifically, the inverse of an arrowhead matrix is a diagonal-plus-rank-one (DPR1) matrix, and all of the following matrix-matrix products are neither arrowhead nor DPR1 matrices: arrowhead times arrowhead, DPR1 times DPR1, and arrowhead times DPR1 (unless they are inverses of each other). This means that the new algorithm is likely to be slower than the current one for the dycore, since the current one does not involve any dense matrix inversions. So, in addition to implementing the new algorithm for `BlockMatrixSystemSolver`, we will also need to port over the algorithm currently implemented in `ClimaAtmos.jl/src/tendencies/implicit/schur_complement_W.jl`. This will require updating the current algorithm to use the new interface for `BandMatrixRow` and `MultiplyColumnwiseBandMatrixField`, and it will also require allowing the `BlockMatrixSystemSolver` to determine which of the two algorithms it should use based on the type of the `ColumnwiseBlockMatrix` given to it.
Task breakdown
PR: Add new MatrixFields module, along with unit tests and performance tests #1326
Estimated Completion Date: July 10th
- Implement `BandMatrixRow` and `MultiplyColumnwiseBandMatrixField`. Ensure that these objects have good docstrings.
- Add unit tests for `BandMatrixRow` and `MultiplyColumnwiseBandMatrixField` in CI. These tests should ensure correctness, GPU compatibility, and type stability (i.e., no unexpected allocations). The tests should also be helpful for performance analysis.
- Implement `FiniteDifferenceOperatorTermsMatrix`. Add a helpful docstring, and add more unit tests to the test suite.
PR: Add operator_matrix to MatrixFields, along with tests and docs #1399
Estimated Completion Date: July 31st
- Implement `ColumnwiseBlockMatrix` and `BlockMatrixSystemSolver`, but only with the current "Schur complement solve" algorithm from ClimaAtmos. Add docstrings and unit tests.
PR: Add FieldMatrix and linear solvers #1436
Estimated Completion Date: July 25th
PR: Update to new implicit solver interface ClimaAtmos.jl#2044
Estimated Completion Date: August 4th
- Add the new "permuted band matrix solve" algorithm to `BlockMatrixSystemSolver`. Expand the docstring and add unit tests.
Skipping this in favor of JFNK with clever preconditioning.
Skipping this in favor of the `SchurComplementReductionSolve` algorithm in conjunction with a `BlockArrowheadSchurComplementPreconditioner`.
.PR: Add matrix-free iterative solver and fix bugs in FieldNameSet #1551
Estimated Completion Date: TBD
Estimated Completion Date: TBD
Reviewers