ecmwf · JPXKQX · Nov 25, 2024 · Nov 25, 2024 · Nov 25, 2024 · Nov 26, 2024
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -10,7 +10,9 @@ Keep it human-readable, your future self will thank you!
 
 ## [Unreleased](https://github.com/ecmwf/anemoi-graphs/compare/0.4.1...HEAD)
 
-# Changed
+### Changed
+
+- docs: Documentation structure (#84)
 - fix: faster edge builder for tri icosahedron. (#92)
 
 ## [0.4.1 - ICON graphs, multiple edge builders and post processors](https://github.com/ecmwf/anemoi-graphs/compare/0.4.0...0.4.1) - 2024-11-26
@@ -25,7 +27,7 @@ Keep it human-readable, your future self will thank you!
 - feat: Support for providing lon/lat coordinates from a text file (loaded with numpy loadtxt method) to build the graph `TextNodes` (#93)
 - feat: Build 2D graphs with `Voronoi` in case `SphericalVoronoi` does not work well/is an overkill (LAM). Set `flat=true` in the nodes attributes to compute area weight using Voronoi with a qhull options preventing the empty region creation (#93)
 
-# Changed
+### Changed
 
 - fix: bug when computing area weights with scipy.Voronoi. (#79)
 

diff --git a/docs/cli/create.rst b/docs/cli/create.rst
diff --git a/docs/cli/describe.rst b/docs/cli/describe.rst
diff --git a/docs/cli/inspect.rst b/docs/cli/inspect.rst
diff --git a/docs/cli/introduction.rst b/docs/cli/introduction.rst
@@ -1,29 +1,78 @@
 .. _cli-introduction:
 
-=============
-Introduction
-=============
+=================
+Command line tool
+=================
 
-When you install the `anemoi-graphs` package, this will also install command line tool
-called ``anemoi-graphs`` which can be used to design and inspect weather graphs.
+When you install the `anemoi-graphs` package, a command line tool called ``anemoi-graphs``
+is also installed. It can be used to build graphs from YAML recipe files and to inspect
+existing graphs.
 
 The tool can provide help with the ``--help`` options:
 
 .. code-block:: bash
 
     % anemoi-graphs --help
 
-The commands are:
+To **create** a graph, use the ``create`` command:
 
-.. toctree::
-    :maxdepth: 1
+.. code:: console
 
-    create
-    describe
-    inspect
+   $ anemoi-graphs create recipe.yaml graph.pt
 
+The ``.yaml`` recipe file consists of high-level specifications for generating the graphs at each
+layer. An example of a simple recipe file is given in :ref:`the following section <usage-getting-started>`.
+
+The ``create`` command reads the specifications in the ``recipe.yaml`` recipe file and writes the
+resulting graph to a PyTorch ``.pt`` file.
+
+To **describe** an existing graph stored as a ``.pt`` file, use the ``describe`` command:
+
+.. code:: console
+
+   $ anemoi-graphs describe graph.pt
+
+This will generate a text summary of the graph, including the number of nodes and edges
+at each layer, the geographic boundaries, and statistics about the edge lengths:
+
+.. literalinclude:: ../usage/yaml/global_wo-proc.txt
+   :language: console
+
+Finally, to **inspect** the design of an existing graph visually, use the
+``inspect`` command. It generates a set of interactive and static
+visualisations of the graph.
+
+The command takes the graph file and an output folder as arguments:
+
+.. code:: console
+
+   $ anemoi-graphs inspect graph.pt output_folder/
+
+===================
+Command line usage
+===================
+
+Create Command
+--------------
+
+.. argparse::
+    :module: anemoi.graphs.__main__
+    :func: create_parser
+    :prog: anemoi-graphs
+    :path: create
+
+Describe Command
+----------------
+.. argparse::
+    :module: anemoi.graphs.__main__
+    :func: create_parser
+    :prog: anemoi-graphs
+    :path: describe
+
+Inspect Command
+---------------
 .. argparse::
     :module: anemoi.graphs.__main__
     :func: create_parser
     :prog: anemoi-graphs
-    :nosubcommands:
+    :path: inspect
diff --git a/docs/graphs/edge_attributes.rst b/docs/graphs/edge_attributes.rst
@@ -4,8 +4,8 @@
  Edges - Attributes
 ####################
 
-There are 2 main edge attributes implemented in the `anemoi-graphs`
-package:
+There are two main edge attributes implemented in the
+:ref:`anemoi-graphs <anemoi-graphs:index-page>` package:
 
 *************
  Edge length

diff --git a/docs/graphs/edges/cutoff.rst b/docs/graphs/edges/cutoff.rst
@@ -1,9 +1,11 @@
+.. _cutoff_radius:
+
 ################
  Cut-off radius
 ################
 
-The cut-off method is a method for establishing connections between two
-sets of nodes. Given two sets of nodes, (`source`, `target`), the
+The *cut-off method* is a method for establishing connections between
+two sets of nodes. Given two sets of nodes, (`source`, `target`), the
 cut-off method connects all source nodes, :math:`V_{source}`, in a
 neighbourhood of the target nodes, :math:`V_{target}`.
 
@@ -16,20 +18,20 @@ as,
 
 .. math::
 
-   cutoff\_radius = cuttoff\_factor \times nodes\_reference\_dist
+   \text{cutoff_radius} = \text{cutoff_factor} \times \text{nodes_reference_dist}
 
-where :math:`nodes\_reference\_dist` is the maximum distance between a
-target node and its nearest source node.
+where :math:`\text{nodes_reference_dist}` is the maximum distance
+between a target node and its nearest source node.
 
 .. math::
 
-   nodes\_reference\_dist = \max_{x \in V_{target}} \left\{  \min_{y \in V_{source}, y \neq x} \left\{ d(x, y) \right\} \right\}
+   \text{nodes_reference_dist} = \max_{x \in V_{target}} \left\{  \min_{y \in V_{source}, y \neq x} \left\{ d(x, y) \right\} \right\}
 
 where :math:`d(x, y)` is the `Haversine distance
 <https://en.wikipedia.org/wiki/Haversine_formula>`_ between nodes
-:math:`x` and :math:`y`. The ``cutoff_factor`` is a parameter that can
-be adjusted to increase or decrease the size of the neighbourhood, and
-consequently the number of connections in the graph.
+:math:`x` and :math:`y`. The :math:`\text{cutoff_factor}` is a parameter
+that can be adjusted to increase or decrease the size of the
+neighbourhood, and consequently the number of connections in the graph.
 
 To use this method to create your connections, you can use the following
 YAML configuration:
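Note on the `cutoff.rst` hunk above: the two formulas can be checked numerically. The following is a minimal NumPy sketch, not the `CutOffEdges` implementation. The function names (`haversine`, `cutoff_radius`, `cutoff_edges`), the use of radians on a unit sphere, the reading of `y ≠ x` as "skip coincident nodes", and the `d <= radius` connection rule are all assumptions made for illustration.

```python
import numpy as np


def haversine(a, b):
    """Pairwise Haversine distances on the unit sphere.

    a: (n, 2) array of (lat, lon) in radians; b: (m, 2) likewise.
    Returns an (n, m) array of great-circle distances in radians.
    """
    lat1, lon1 = a[:, None, 0], a[:, None, 1]
    lat2, lon2 = b[None, :, 0], b[None, :, 1]
    h = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))


def cutoff_radius(source, target, cutoff_factor):
    """cutoff_radius = cutoff_factor * nodes_reference_dist."""
    d = haversine(target, source)  # shape (n_target, n_source)
    # Assumption: "y != x" is read as ignoring coincident nodes (distance 0).
    d = np.where(d > 0, d, np.inf)
    # Max over targets of the distance to the nearest source node.
    nodes_reference_dist = d.min(axis=1).max()
    return cutoff_factor * nodes_reference_dist


def cutoff_edges(source, target, cutoff_factor):
    """Return a (2, n_edges) array of (source index, target index) pairs."""
    radius = cutoff_radius(source, target, cutoff_factor)
    # Assumption: a source is connected when it lies within the radius.
    tgt_idx, src_idx = np.nonzero(haversine(target, source) <= radius)
    return np.stack([src_idx, tgt_idx])


# Hypothetical nodes on the equator, given as (lat, lon) in degrees.
source = np.deg2rad(np.array([[0.0, 10.0], [0.0, 60.0]]))
target = np.deg2rad(np.array([[0.0, 0.0], [0.0, 90.0]]))
print(cutoff_radius(source, target, cutoff_factor=1.5))
print(cutoff_edges(source, target, cutoff_factor=1.5))
```

Increasing `cutoff_factor` in this sketch enlarges the radius and therefore can only add edges, which matches the behaviour described in the hunk.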

diff --git a/docs/graphs/edges/knn.rst b/docs/graphs/edges/knn.rst
@@ -1,11 +1,13 @@
+.. _knn:
+
 ######################
  K-Nearest Neighbours
 ######################
 
-The knn method is a method for establishing connections between two sets
-of nodes. Given two sets of nodes, (`source`, `target`), the knn method
-connects all target nodes, to their ``num_nearest_neighbours`` nearest
-source nodes.
+The k-nearest neighbours (KNN) method is a method for establishing
+connections between two sets of nodes. Given two sets of nodes,
+(`source`, `target`), the KNN method connects all target nodes to their
+``num_nearest_neighbours`` nearest source nodes.
 
 To use this method to build your connections, you can use the following
 YAML configuration:
@@ -21,5 +23,5 @@ YAML configuration:
 
 .. note::
 
-   The KNNEdges method is recommended for the decoder edges, to connect
-   all target nodes with the surrounding source nodes.
+   The ``KNNEdges`` method is recommended for the decoder edges, to
+   connect all target nodes with the surrounding source nodes.
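Note on the `knn.rst` hunk above: the connection rule it describes (each target node linked to its ``num_nearest_neighbours`` nearest source nodes) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the `KNNEdges` implementation. The function names, the brute-force distance matrix, and the choice of the Haversine distance (borrowed from the cut-off page) are assumptions.

```python
import numpy as np


def haversine(a, b):
    """Pairwise Haversine distances (radians, unit sphere).

    a: (n, 2) array of (lat, lon) in radians; b: (m, 2) likewise.
    """
    lat1, lon1 = a[:, None, 0], a[:, None, 1]
    lat2, lon2 = b[None, :, 0], b[None, :, 1]
    h = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))


def knn_edges(source, target, num_nearest_neighbours):
    """Connect every target node to its k nearest source nodes.

    Returns a (2, n_target * k) array of (source index, target index) pairs.
    """
    d = haversine(target, source)  # shape (n_target, n_source)
    # For each target row, indices of the k closest source nodes.
    nearest = np.argsort(d, axis=1)[:, :num_nearest_neighbours]
    tgt_idx = np.repeat(np.arange(target.shape[0]), nearest.shape[1])
    return np.stack([nearest.ravel(), tgt_idx])


# Hypothetical nodes on the equator, given as (lat, lon) in degrees.
source = np.deg2rad(np.array([[0.0, 0.0], [0.0, 10.0], [0.0, 50.0], [0.0, 90.0]]))
target = np.deg2rad(np.array([[0.0, 2.0], [0.0, 80.0]]))
print(knn_edges(source, target, num_nearest_neighbours=2))
```

Unlike the cut-off rule, every target node in this sketch receives exactly the same number of incoming edges, which is consistent with the note recommending the method for decoder edges.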
diff --git a/docs/graphs/edges/multi_scale.rst b/docs/graphs/edges/multi_scale.rst
@@ -1,3 +1,5 @@
+.. _multi_scale:
+
 ################################################
  Multi-scale connections at refined icosahedron
 ################################################

diff --git a/docs/graphs/introduction.rst b/docs/graphs/introduction.rst
@@ -4,64 +4,41 @@
  Introduction
 ##############
 
-The `anemoi-graphs` package allows you to design custom graphs for
-training data-driven weather models. The graphs are built using a
-`recipe`, which is a YAML file that specifies the nodes and edges of the
-graph.
-
-**********
- Concepts
-**********
-
-nodes
-   A `node` represents a location (2D) on the earth's surface which may
-   contain additional `attributes`.
-
-data nodes
-   A set of nodes representing one or multiple datasets. The `data
-   nodes` may correspond to the input/output of our data-driven model.
-   They can be defined from Zarr datasets and this method supports all
-   :ref:`anemoi-datasets <anemoi-datasets:index-page>` operations such
-   as `cutout` or `thinning`.
-
-hidden nodes
-   The `hidden nodes` capture intermediate representations of the model,
-   which are used to learn the dynamics of the system considered
-   (atmosphere, ocean, etc, ...). These nodes can be generated from
-   existing locations (Zarr datasets or NPZ files) or algorithmically
-   from iterative refinements of polygons over the globe.
-
-isolated nodes
-   A set of nodes that are not connected to any other nodes in the
-   graph. These nodes can be used to store additional information that
-   is not directly used in the training process.
-
-edges
-   An `edge` represents a connection between two nodes. The `edges` can
-   be used to define the flow of information between the nodes. Edges
-   may also contain `attributes` related to their length, direction or
-   other properties.
-
-*****************
- Data structures
-*****************
-
-The nodes :math:`V` correspond to locations on the earth's surface, and
-they can be classified into 2 categories:
-
--  **Data nodes**: The `data nodes` represent the input/output of the
-   data-driven model, i.e. they are linked to existing datasets.
--  **Hidden nodes**: These `hidden nodes` represent the latent space,
-   where the internal dynamics are learned.
+In `anemoi-graphs`, graphs are built using a `recipe`, which is a YAML
+file that specifies the nodes, edges, and attributes of the graph. The
+recipe file is used to create the graphs using the :ref:`command-line
+tool <cli-introduction>`.
 
-Several methods are currently supported to create your nodes. You can
-use indistinctly any of these to create your `data` or `hidden` nodes.
+The main components of the recipe file are the definitions of the
+`nodes` and the `edges`, each of which can optionally include
+`attributes`. Here we give an overview of these specifications; more
+details on the available options for each are given in the following
+pages.
+
+*******
+ Nodes
+*******
 
 The `nodes` are defined in the ``nodes`` section of the recipe file. The
 keys are the names of the sets of `nodes` that will later be used to
 build the connections. Each `nodes` configuration must include a
-``node_builder`` section describing how to define the `nodes`. The
-following classes define different behaviour:
+``node_builder`` section describing how to define the `nodes`, and can
+optionally include node `attributes`.
+
+A simple example of the nodes definition, for two sets of nodes, looks
+like this:
+
+.. literalinclude:: ../usage/yaml/nodes.yaml
+   :language: yaml
+
+In this example, two sets of nodes are defined, which have been named
+``data`` and ``hidden`` (these names are later used to define edges).
+The ``data`` nodes have been defined based on coordinates specified in a
+:ref:`zarr file <zarr-file>` and the hidden nodes are specified based on
+a :ref:`triangular refined icosahedron <trinodes>` algorithm.
+
+Several methods are currently supported to create your nodes. Any of
+them can be used to create either your `data` or your `hidden` nodes:
 
 -  :doc:`node_coordinates/zarr_dataset`
 -  :doc:`node_coordinates/npz_file`
@@ -71,9 +48,52 @@ following classes define different behaviour:
 -  :doc:`node_coordinates/healpix`
 
 In addition to the ``node_builder`` section, the `nodes` configuration
-can contain an optional ``attributes`` section to define additional node
-attributes (weights, mask, ...). For example, the weights can be used to
-define the importance of each node in the loss function, or the masks
-can be used to build connections only between subsets of nodes.
-
--  :doc:`node_attributes/weights`
+can contain an optional section to define additional node attributes
+(weights, mask, ...). For example, the weights can be used to define the
+importance of each node in the loss function, or the masks can be used
+to build connections only between subsets of nodes. See the
+:ref:`attributes <graphs-node_attributes>` page for more details.
+
+*******
+ Edges
+*******
+
+The ``edges`` section in the recipe file defines the edges between the
+nodes through which information will flow. These connections are defined
+between pairs of `nodes` sets (source and target, specified by
+`source_name` and `target_name`). There are several methods to build
+these edges, including cutoff (`CutOffEdges`) or nearest neighbours
+(`KNNEdges`).
+
+For an encoder-processor-decoder graph you will need to build two sets
+of `edges`. The first set of edges will connect the `data` nodes with
+the `hidden` nodes to encode the input data into the latent space,
+normally referred to as the `encoder edges` and represented here by the
+first element of the ``edges`` section. The second set of `edges` will
+connect the `hidden` nodes with the `data` nodes to decode the latent
+space into the output data, normally referred to as `decoder edges` and
+represented here by the second element of the ``edges`` section.
+
+Graphically, the encoder-processor-decoder setup looks like this:
+
+.. figure:: ../usage/schemas/global_wo-proc.png
+   :alt: Schema of global graph (without processor connections)
+   :align: center
+   :width: 250
+
+The corresponding recipe file chunk is as follows:
+
+.. literalinclude:: ../usage/yaml/global_wo-proc.yaml
+   :language: yaml
+
+In this example, the encoder edges are defined based on the
+:ref:`cut-off radius <cutoff_radius>`, whereas the decoder edges use a
+:ref:`k-nearest neighbours <knn>` algorithm. Available methods for
+defining edges are:
+
+-  :ref:`Cut-off radius <cutoff_radius>`
+-  :ref:`K-nearest neighbours <knn>`
+-  :ref:`Multi-scale connections <multi_scale>`
+
+As with the nodes, the edges can have additional attributes; see the
+:ref:`attributes <edge-attributes>` page for more details.