Use new rst formater

ecmwf · May 26, 2024 · fd8aa56
1 parent 40d6fd4
commit fd8aa56

Showing 53 changed files with 897 additions and 1,096 deletions.
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -57,11 +57,12 @@ repos:
   hooks:
     - id: sphinx-lint
 
-# For now, we use it. But it does not support a lot of sphinx features
-# - repo: https://github.com/dzhu/rstfmt
-#   rev: v0.0.14
-#   hooks:
-#     - id: rstfmt
+- repo: https://github.com/LilSpazJoekp/docstrfmt
+  rev: v1.6.1
+  hooks:
+    - id: docstrfmt
+      language_version: python3
+      types_or: [rst] # Don't touch python docstrings.
 
 - repo: https://github.com/b8raoult/pre-commit-docconvert
   rev: "0.1.4"

diff --git a/docs/building/filters.rst b/docs/building/filters.rst
@@ -1,22 +1,20 @@
 .. _filters:
 
-#########
- Filters
-#########
+Filters
+=======
 
 .. warning::
 
-   This is still a work in progress. Some of the filters may be renamed
-   later.
+    This is still a work in progress. Some of the filters may be renamed later.
 
 Filters are used to modify the data or metadata in a dataset.
 
 .. toctree::
-   :maxdepth: 1
+    :maxdepth: 1
 
-   filters/select
-   filters/rename
-   filters/rotate_winds
-   filters/unrotate_winds
-   filters/noop
-   filters/empty
+    filters/select
+    filters/rename
+    filters/rotate_winds
+    filters/unrotate_winds
+    filters/noop
+    filters/empty
diff --git a/docs/building/filters/empty.rst b/docs/building/filters/empty.rst
@@ -1,6 +1,5 @@
-#######
- empty
-#######
+empty
+=====
 
-The ``empty`` filter is for debugging purposes. It always returns an
-empty set of fields.
+The ``empty`` filter is for debugging purposes. It always returns an empty set of
+fields.
diff --git a/docs/building/filters/noop.rst b/docs/building/filters/noop.rst
@@ -1,6 +1,4 @@
-######
- noop
-######
+noop
+====
 
-The ``noop`` filter is for debugging purposes. It returns its input
-unchanged.
+The ``noop`` filter is for debugging purposes. It returns its input unchanged.
diff --git a/docs/building/filters/rename.rst b/docs/building/filters/rename.rst
@@ -1,3 +1,2 @@
-########
- rename
-########
+rename
+======
diff --git a/docs/building/filters/rotate_winds.rst b/docs/building/filters/rotate_winds.rst
@@ -1,3 +1,2 @@
-##############
- rotate_winds
-##############
+rotate_winds
+============
diff --git a/docs/building/filters/select.rst b/docs/building/filters/select.rst
@@ -1,3 +1,2 @@
-########
- select
-########
+select
+======
diff --git a/docs/building/filters/unrotate_winds.rst b/docs/building/filters/unrotate_winds.rst
@@ -1,3 +1,2 @@
-###############
- unrotate_wind
-###############
+unrotate_wind
+=============
diff --git a/docs/building/handling-missing-dates.rst b/docs/building/handling-missing-dates.rst
@@ -1,24 +1,22 @@
-########################
- Handling missing dates
-########################
+Handling missing dates
+======================
 
 By default, the package will raise an error if there are missing dates.
 
-Missing dates can be handled by specifying a list of dates in the
-configuration file. The dates should be in the same format as the dates
-in the time series. The missing dates will be filled ``np.nan`` values.
+Missing dates can be handled by specifying a list of dates in the configuration file.
+The dates should be in the same format as the dates in the time series. The missing
+dates will be filled ``np.nan`` values.
 
 .. literalinclude:: yaml/missing_dates.yaml
-   :language: yaml
+    :language: yaml
 
-*Anemoi* will ignore the missing dates when computing the
-:ref:`statistics <gathering_statistics>`.
+*Anemoi* will ignore the missing dates when computing the :ref:`statistics
+<gathering_statistics>`.
 
-You can retrieve the list indices corresponding to the missing dates by
-accessing the ``missing`` attribute of the dataset object.
+You can retrieve the list indices corresponding to the missing dates by accessing the
+``missing`` attribute of the dataset object.
 
 .. literalinclude:: ../using/code/missing_.py
-   :language: python
+    :language: python
 
-If you access a missing index, the dataset will throw a
-``MissingDateError``.
+If you access a missing index, the dataset will throw a ``MissingDateError``.
diff --git a/docs/building/handling-missing-values.rst b/docs/building/handling-missing-values.rst
@@ -1,17 +1,14 @@
-#########################
- Handling missing values
-#########################
+Handling missing values
+=======================
 
-When handling data for machine learning models, missing values (NaNs)
-can pose a challenge, as models require complete data to operate
-effectively and may crash otherwise. Ideally, we anticipate having
-complete data in all fields. However, there are scenarios where NaNs
-naturally occur, such as with variables only relevant on land or at sea
-(such as sea surface temperature (`sst`), for example). In such cases,
-the default behavior is to reject data with NaNs as invalid. To
-accommodate NaNs and accurately compute statistics based on them, you
-can include the `allow_nans` key in the configuration. Here's an example
-of how to implement it:
+When handling data for machine learning models, missing values (NaNs) can pose a
+challenge, as models require complete data to operate effectively and may crash
+otherwise. Ideally, we anticipate having complete data in all fields. However, there are
+scenarios where NaNs naturally occur, such as with variables only relevant on land or at
+sea (such as sea surface temperature (`sst`), for example). In such cases, the default
+behavior is to reject data with NaNs as invalid. To accommodate NaNs and accurately
+compute statistics based on them, you can include the `allow_nans` key in the
+configuration. Here's an example of how to implement it:
 
 .. literalinclude:: yaml/nan.yaml
-   :language: yaml
+    :language: yaml
diff --git a/docs/building/introduction.rst b/docs/building/introduction.rst
@@ -1,154 +1,138 @@
 .. _building-introduction:
 
-##############
- Introduction
-##############
-
-The `anemoi-datasets` package allows you to create datasets for training
-data-driven weather models. The datasets are built using a `recipe`
-file, which is a YAML file that describes sources of meteorological
-fields as well as the operations to perform on them, before they are
-written to a zarr file. The input of the process is a range of dates and
-some options to control the layout of the output. Statistics will be
-computed as the dataset is build, and stored in the metadata, with other
-information such as the the locations of the grid points, the list of
-variables, etc.
+Introduction
+============
+
+The `anemoi-datasets` package allows you to create datasets for training data-driven
+weather models. The datasets are built using a `recipe` file, which is a YAML file that
+describes sources of meteorological fields as well as the operations to perform on them,
+before they are written to a zarr file. The input of the process is a range of dates and
+some options to control the layout of the output. Statistics will be computed as the
+dataset is build, and stored in the metadata, with other information such as the the
+locations of the grid points, the list of variables, etc.
 
 .. figure:: ../schemas/recipe.png
-   :alt: Building datasets
-   :align: center
+    :alt: Building datasets
+    :align: center
 
-**********
- Concepts
-**********
+Concepts
+--------
 
 date
-   Throughout this document, the term `date` refers to a date and time,
-   not just a date. A training dataset is covers a continuous range of
-   dates with a given frequency. Missing dates are still part of the
-   dataset, but the data are missing and marked as such using NaNs.
-   Dates are always in UTC, and refer to date at which the data is
-   valid. For accumulations and fluxes, that would be the end of the
-   accumulation period.
+    Throughout this document, the term `date` refers to a date and time, not just a
+    date. A training dataset is covers a continuous range of dates with a given
+    frequency. Missing dates are still part of the dataset, but the data are missing and
+    marked as such using NaNs. Dates are always in UTC, and refer to date at which the
+    data is valid. For accumulations and fluxes, that would be the end of the
+    accumulation period.
 
 variable
-   A `variable` is meteorological parameter, such as temperature, wind,
-   etc. Multilevel parameters are treated as separate variables, one for
-   each level. For example, temperature at 850 hPa and temperature at
-   500 hPa will be treated as two separate variables (`t_850` and
-   `t_500`).
+    A `variable` is meteorological parameter, such as temperature, wind, etc. Multilevel
+    parameters are treated as separate variables, one for each level. For example,
+    temperature at 850 hPa and temperature at 500 hPa will be treated as two separate
+    variables (`t_850` and `t_500`).
 
 field
-   A `field` is a variable at a given date. It is represented by a array
-   of values at each grid point.
+    A `field` is a variable at a given date. It is represented by a array of values at
+    each grid point.
 
 source
-   The `source` is a software component that given a list of dates and
-   variables will return the corresponding fields. A example of source
-   is ECMWF's MARS archive, a collection of GRIB or NetCDF files, a
-   database, etc. See :ref:`sources` for more information.
+    The `source` is a software component that given a list of dates and variables will
+    return the corresponding fields. A example of source is ECMWF's MARS archive, a
+    collection of GRIB or NetCDF files, a database, etc. See :ref:`sources` for more
+    information.
 
 filter
-   A `filter` is a software component that takes as input the output of
-   a source or the output of another filter can modify the fields and/or
-   their metadata. For example, typical filters are interpolations,
-   renaming of variables, etc. See :ref:`filters` for more information.
+    A `filter` is a software component that takes as input the output of a source or the
+    output of another filter can modify the fields and/or their metadata. For example,
+    typical filters are interpolations, renaming of variables, etc. See :ref:`filters`
+    for more information.
 
-************
- Operations
-************
+Operations
+----------
 
-In order to build a training dataset, sources and filters are combined
-using the following operations:
+In order to build a training dataset, sources and filters are combined using the
+following operations:
 
 join
-   The join is the process of combining several sources data. Each
-   source is expected to provide different variables at the same dates.
+    The join is the process of combining several sources data. Each source is expected
+    to provide different variables at the same dates.
 
 pipe
-   The pipe is the process of transforming fields using filters. The
-   first step of a pipe is typically a source, a join or another pipe.
-   The following steps are filters.
+    The pipe is the process of transforming fields using filters. The first step of a
+    pipe is typically a source, a join or another pipe. The following steps are filters.
 
 concat
-   The concatenation is the process of combining different sets of
-   operation that handle different dates. This is typically used to
-   build a dataset that spans several years, when the several sources
-   are involved, each providing a different period.
+    The concatenation is the process of combining different sets of operation that
+    handle different dates. This is typically used to build a dataset that spans several
+    years, when the several sources are involved, each providing a different period.
 
-Each operation is considered as a :ref:`source <sources>`, therefore
-operations can be combined to build complex datasets.
+Each operation is considered as a :ref:`source <sources>`, therefore operations can be
+combined to build complex datasets.
 
-*****************
- Getting started
-*****************
+Getting started
+---------------
 
 First recipe
-============
+~~~~~~~~~~~~
 
-The simplest `recipe` file must contain a ``dates`` section and an
-``input`` section. The latter must contain a `source` In that case, the
-source is ``mars``
+The simplest `recipe` file must contain a ``dates`` section and an ``input`` section.
+The latter must contain a `source` In that case, the source is ``mars``
 
 .. literalinclude:: yaml/building1.yaml
-   :language: yaml
+    :language: yaml
 
 To create the dataset, run the following command:
 
-.. code:: console
+.. code-block:: console
 
-   $ anemoi-datasets create recipe.yaml dataset.zarr
+    $ anemoi-datasets create recipe.yaml dataset.zarr
 
-Once the build is complete, you can inspect the dataset using the
-following command:
+Once the build is complete, you can inspect the dataset using the following command:
 
-.. code:: console
+.. code-block:: console
 
-   $ anemoi-datasets inspect dataset.zarr
+    $ anemoi-datasets inspect dataset.zarr
 
 .. literalinclude:: yaml/building1.txt
-   :language: console
+    :language: console
 
 Adding a second source
-======================
+~~~~~~~~~~~~~~~~~~~~~~
 
-To add a second source, you need to use the ``join`` operation. In that
-example, we add pressure level variables to the previous example:
+To add a second source, you need to use the ``join`` operation. In that example, we add
+pressure level variables to the previous example:
 
 .. literalinclude:: yaml/building2.yaml
-   :language: yaml
+    :language: yaml
 
 This will build the following dataset:
 
 .. literalinclude:: yaml/building2.txt
-   :language: console
+    :language: console
 
 .. note::
 
-   Please note that the pressure levels parameters are named
-   `param_level`. This is the default behaviour. See
-   :ref:`remapping_option` for more information.
+    Please note that the pressure levels parameters are named `param_level`. This is the
+    default behaviour. See :ref:`remapping_option` for more information.
 
 Adding some forcing variables
-=============================
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-When training a data-driven models, some forcing variables may be
-required such as the solar radiation, the time of day, the day in the
-year, etc.
+When training a data-driven models, some forcing variables may be required such as the
+solar radiation, the time of day, the day in the year, etc.
 
-These are provided by the ``forcings`` source. In that example, we add a
-few of them. The `template` option is used to point to another source,
-in that case the first instance of ``mars``. This source is used to get
-information about the grid points, as some of the forcing variables are
-grid dependent.
+These are provided by the ``forcings`` source. In that example, we add a few of them.
+The `template` option is used to point to another source, in that case the first
+instance of ``mars``. This source is used to get information about the grid points, as
+some of the forcing variables are grid dependent.
 
 .. literalinclude:: yaml/building3.yaml
-   :language: yaml
+    :language: yaml
 
 This will build the following dataset:
 
 .. literalinclude:: yaml/building3.txt
-   :language: console
+    :language: console
 
-See :ref:`forcing_variables` for more information about forcing
-variables.
+See :ref:`forcing_variables` for more information about forcing variables.
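
The ``join``, ``pipe`` and ``concat`` operations described in the ``docs/building/introduction.rst`` hunk above can be sketched with a toy model in plain Python. This is not the anemoi-datasets API. The helper names (``make_source``, ``join``, ``pipe``, ``concat``, ``rename``) and the representation of a source as a function from a list of dates to a dictionary of fields are assumptions made purely for illustration.

```python
# Toy model of the operations described in docs/building/introduction.rst.
# All helpers are hypothetical; this is NOT the anemoi-datasets implementation.


def make_source(variables):
    """Return a source: a callable mapping a list of dates to fields."""

    def source(dates):
        # A field is a variable at a given date; the value is a placeholder string.
        return {(v, d): f"{v}@{d}" for d in dates for v in variables}

    return source


def join(*sources):
    """Combine sources that provide different variables at the same dates."""

    def joined(dates):
        fields = {}
        for s in sources:
            fields.update(s(dates))
        return fields

    return joined


def pipe(first, *filters):
    """Start from a source (or join, or pipe), then apply filters in order."""

    def piped(dates):
        fields = first(dates)
        for f in filters:
            fields = f(fields)
        return fields

    return piped


def concat(*parts):
    """Combine (accepted_dates, source) pairs, each handling a different period."""

    def concatenated(dates):
        fields = {}
        for accepted, s in parts:
            fields.update(s([d for d in dates if d in accepted]))
        return fields

    return concatenated


def rename(mapping):
    """A filter that renames variables and leaves the values untouched."""

    def apply(fields):
        return {(mapping.get(v, v), d): x for (v, d), x in fields.items()}

    return apply


# Placeholder variable names; pressure-level ones follow the param_level pattern.
surface = make_source(["2t", "10u"])
pressure = make_source(["t_850", "t_500"])

# Each operation returns something that behaves like a source, so they nest.
recipe = pipe(join(surface, pressure), rename({"2t": "t2m"}))
fields = recipe(["2020-01-01T00", "2020-01-01T06"])

by_period = concat((["a"], surface), (["b"], pressure))
split = by_period(["a", "b"])
```

Because every helper returns a callable with the same shape as a source, the result of a ``join`` can be the first step of a ``pipe``, which mirrors the statement in the documentation that each operation is itself considered a source.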