diff --git a/docs/building/handling-missing-values.rst b/docs/building/handling-missing-values.rst
index adc07cba..7ee2f127 100644
--- a/docs/building/handling-missing-values.rst
+++ b/docs/building/handling-missing-values.rst
@@ -2,16 +2,19 @@
  Handling missing values
 #########################
 
-When handling data for machine learning models, missing values (NaNs)
+When handling data for machine learning models, missing values (`NaNs`)
 can pose a challenge, as models require complete data to operate
 effectively and may crash otherwise. Ideally, we anticipate having
-complete data in all fields. However, there are scenarios where NaNs
-naturally occur, such as with variables only relevant on land or at sea
-(such as sea surface temperature (`sst`), for example). In such cases,
-the default behavior is to reject data with NaNs as invalid. To
-accommodate NaNs and accurately compute statistics based on them, you
-can include the `allow_nans` key in the configuration. Here's an example
-of how to implement it:
+complete data in all fields.
+
+However, there are scenarios where `NaNs` naturally occur, such as with
+variables only relevant on land or at sea. This happens for sea surface
+temperature (`sst`), for example. In such cases, the default behavior is
+to reject data with `NaNs` as invalid. To accommodate `NaNs` and
+accurately compute statistics based on them, you can include the
+``allow_nans`` key in the configuration.
+
+Here's an example of how to implement it:
 
 .. literalinclude:: yaml/nan.yaml
    :language: yaml
diff --git a/docs/building/sources.rst b/docs/building/sources.rst
index 99fb8415..3e5e8aee 100644
--- a/docs/building/sources.rst
+++ b/docs/building/sources.rst
@@ -4,6 +4,17 @@
  Sources
 #########
 
+A source is a software component that, given a list of dates and
+variables, returns the corresponding fields.
+
+A `source` is responsible for reading the underlying data and
+converting it to a set of fields. A `source` is also responsible for
+handling the metadata of the data, such as the variable names, and
+more.
+
+Examples of sources include ECMWF’s MARS archive and collections of
+GRIB or NetCDF files.
+
 The following `sources` are currently available:
 
 .. toctree::
diff --git a/docs/building/statistics.rst b/docs/building/statistics.rst
index 2d6915dd..4c561ee3 100644
--- a/docs/building/statistics.rst
+++ b/docs/building/statistics.rst
@@ -8,17 +8,20 @@
 it is created. These statistics are intended to be used to normalise the
 data during training.
 
-The statistics are stored in the `statistics` attribute of the dataset.
-The computed statistics include:
+The statistics are stored in the :doc:`statistics attribute
+<../using/statistics>` of the dataset. The computed statistics include
+the minimum, maximum, mean, and standard deviation.
 
--  Minimum
--  Maximum
--  Mean
--  Standard deviation
+***********************
+ Statistics date range
+***********************
 
 By defaults, the statistics are not computed on the whole dataset, but
-on a subset of dates. The subset is defined using the following
-algorithm:
+on a subset of dates. This is usually done to avoid any data leakage
+from the validation and test sets to the training set.
+
+The subset of dates used to compute the statistics is defined using the
+following algorithm:
 
    -  If the dataset covers 20 years or more, the last 3 years are
       excluded.
@@ -51,3 +54,12 @@ Example configuration gathering statistics using only 2020 data :
    statistics:
        start: 2020
        end: 2020
+
+**************************
+ Data with missing values
+**************************
+
+If the dataset contains missing values (known as `NaNs`), an error will
+be raised when trying to compute the statistics. To allow `NaNs` in the
+dataset, you can set the ``allow_nans`` key as described :doc:`here
+</building/handling-missing-values>`.
diff --git a/docs/cli/compare.rst b/docs/cli/compare.rst
index be4d0252..f8604293 100644
--- a/docs/cli/compare.rst
+++ b/docs/cli/compare.rst
@@ -1,7 +1,15 @@
 compare
 =======
 
-Use this command to compatre two datasets:
+Use this command to compare two datasets.
+
+The command will run a quick comparison of the two datasets and output a summary of the differences.
+
+.. warning::
+
+    This command will not compare the data in the datasets, only some of the metadata.
+    Subsequent versions of this command may include more detailed comparisons.
+
 
 .. argparse::
     :module: anemoi.datasets.__main__
diff --git a/docs/cli/copy.rst b/docs/cli/copy.rst
index 67394267..9413c1ec 100644
--- a/docs/cli/copy.rst
+++ b/docs/cli/copy.rst
@@ -16,7 +16,7 @@ The chunk pattern for the source dataset has been defined for good reasons, and
 
 .. warning::
 
-    When resuming the copying process (using ``--resume``), calling the script with the same arguments for --block-size and --rechunk is recommended.
+    When resuming the copying process (using ``--resume``), calling the script with the same arguments for ``--block-size`` and ``--rechunk`` is recommended.
     Using different values for these arguments to resume copying the same dataset may lead to unexpected behavior.
 
 
diff --git a/docs/cli/inspect.rst b/docs/cli/inspect.rst
index d07ad201..1c8876fb 100644
--- a/docs/cli/inspect.rst
+++ b/docs/cli/inspect.rst
@@ -2,9 +2,24 @@ inspect
 =======
 
 
+Anemoi datasets are stored in the Zarr format and can be located on a local file system or on a remote server.
+The `inspect` command is used to inspect the contents of a dataset.
+This command will output the metadata of the dataset, including the variables, dimensions, and attributes.
+
+.. code:: console
+
+   $ anemoi-datasets inspect dataset.zarr
+
+
+This will output something like the following. The output should be self-explanatory.
+
 .. literalinclude:: ../building/yaml/building1.txt
    :language: console
 
+*********************
+ Command line usage
+*********************
+
 .. argparse::
     :module: anemoi.datasets.__main__
     :func: create_parser
diff --git a/pyproject.toml b/pyproject.toml
index 034579ec..e3d15ec4 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -108,7 +108,6 @@ optional-dependencies.remote = [
 urls.Documentation = "https://anemoi-datasets.readthedocs.io/"
 urls.Homepage = "https://github.com/ecmwf/anemoi-datasets/"
 urls.Issues = "https://github.com/ecmwf/anemoi-datasets/issues"
-
 # Changelog = "https://github.com/ecmwf/anemoi-datasets/CHANGELOG.md"
 urls.Repository = "https://github.com/ecmwf/anemoi-datasets/"
 scripts.anemoi-datasets = "anemoi.datasets.__main__:main"