feat: support ReducedGaussianGridNodes #54

Open
wants to merge 12 commits into
base: develop
from
5 changes: 5 additions & 0 deletions graphs/docs/graphs/introduction.rst
@@ -65,7 +65,10 @@ following classes define different behaviour:

- :doc:`node_coordinates/zarr_dataset`
- :doc:`node_coordinates/npz_file`
- :doc:`node_coordinates/reduced_gaussian`
- :doc:`node_coordinates/icon_mesh`
- :doc:`node_coordinates/text_file`
- :doc:`node_coordinates/latlon_arrays`
- :doc:`node_coordinates/tri_refined_icosahedron`
- :doc:`node_coordinates/hex_refined_icosahedron`
- :doc:`node_coordinates/healpix`
@@ -77,3 +80,5 @@ define the importance of each node in the loss function, or the masks
can be used to build connections only between subsets of nodes.

- :doc:`node_attributes/weights`
- :doc:`node_attributes/zarr_dataset`
- :doc:`node_attributes/boolean_operations`
1 change: 1 addition & 0 deletions graphs/docs/graphs/node_coordinates.rst
@@ -24,6 +24,7 @@ a file:

node_coordinates/zarr_dataset
node_coordinates/npz_file
node_coordinates/reduced_gaussian
node_coordinates/icon_mesh
node_coordinates/text_file
node_coordinates/latlon_arrays
22 changes: 6 additions & 16 deletions graphs/docs/graphs/node_coordinates/npz_file.rst
@@ -11,20 +11,10 @@ following YAML configuration:
data: # name of the nodes
node_builder:
_target_: anemoi.graphs.nodes.NPZFileNodes
grid_definition_path: /path/to/folder/with/grids/
resolution: o48
npz_file: /path/to/folder/with/grids/my_grid.npz
lat_key: latitudes
lon_key: longitudes

where `grid_definition_path` is the path to the folder containing the
grid definition files and `resolution` is the resolution of the grid to
be used.

By default, the grid files are supposed to be in the `grids` folder in
the same directory as the recipe file. The grid definition files are
expected to be name `"grid_{resolution}.npz"`.

.. note::

The NPZ file should contain the following keys:

- `longitudes`: The longitudes of the grid.
- `latitudes`: The latitudes of the grid.
where `npz_file` is the path to the NPZ file and `lat_key` and `lon_key`
are optional arguments with the key names of the latitude and longitude
arrays.
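A minimal sketch of how a compatible NPZ file could be produced and read back with custom key names. The coordinate values, the temporary path, and the key names `lat` and `lon` are illustrative assumptions, not something this PR ships.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical coordinates in degrees; any two 1D arrays of equal length work.
lats = np.array([-45.0, 0.0, 45.0])
lons = np.array([0.0, 120.0, 240.0])

npz_path = Path(tempfile.mkdtemp()) / "my_grid.npz"
# The keyword names passed to savez become the keys inside the archive.
np.savez(npz_path, lat=lats, lon=lons)

# Reading with explicit key names mirrors the lat_key / lon_key arguments.
lat_key, lon_key = "lat", "lon"
with np.load(npz_path) as grid_data:
    loaded_lats = grid_data[lat_key]
    loaded_lons = grid_data[lon_key]
```

With a file written this way, the recipe would presumably set `lat_key: lat` and `lon_key: lon`; when the archive already uses `latitudes` and `longitudes`, both arguments can be left out.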
45 changes: 45 additions & 0 deletions graphs/docs/graphs/node_coordinates/reduced_gaussian.rst
@@ -0,0 +1,45 @@
#######################
Reduced Gaussian grid
#######################

A Gaussian grid is a latitude/longitude grid where the spacing of the
latitudes is not regular but is symmetrical about the Equator. The grid
is identified by a grid code, which specifies the type (`n/N` for the
original ECMWF reduced Gaussian grid or `o/O` for the octahedral ECMWF
reduced Gaussian grid) and the resolution. The resolution is defined by
the number of latitude lines (`XXX`) between the pole and the Equator.

To enable retrieval of these grids, include the following lines in your
`.config/anemoi/settings.toml` file:

.. code:: toml

   [graphs.named]
   grids = "https://get.ecmwf.int/repository/anemoi/grids"

To define `node coordinates` based on a reduced Gaussian grid, you can
use the following YAML configuration:

.. code:: yaml

   nodes:
     data: # name of the nodes
       node_builder:
         _target_: anemoi.graphs.nodes.ReducedGaussianGridNodes
         grid: o48

Here, `grid` specifies the type and resolution of the reduced Gaussian
grid in the format `[o|n]XXX`. For example, `o48` represents an
octahedral Gaussian grid with 48 latitude lines between the pole and the
Equator.

.. note::

   The reduced Gaussian grids are stored in NPZ files with the keys
   latitudes and longitudes. These files are downloaded and cached in a
   local directory. Initially, only a subset of grids is available. If
   you require a new Gaussian grid to be added, please contact the
   administrators.

   Currently available reduced Gaussian grids:

   - o16
   - o32
   - o48
   - o96
   - o160
   - o256
   - o320
   - n320
   - o1280
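The grid code format described in this file can be sketched in a few lines. The helper names `parse_grid_code`, `grid_file_name` and `octahedral_num_points` are hypothetical and not part of anemoi. The point count assumes the usual octahedral layout of 20 points on the latitude nearest each pole and 4 more on every line towards the Equator; the original (`n`) grids have no comparable closed form and are left out.

```python
from __future__ import annotations

import re

# Same pattern the builder validates against: a type letter followed by digits.
GRID_PATTERN = re.compile(r"([oOnN])(\d+)")


def parse_grid_code(grid: str) -> tuple[str, int]:
    """Split a code such as 'O48' into its lower-cased type and its resolution."""
    match = GRID_PATTERN.fullmatch(grid)
    if match is None:
        raise ValueError(f"Grid code {grid!r} does not match the format [o|n]XXX.")
    return match.group(1).lower(), int(match.group(2))


def grid_file_name(grid: str) -> str:
    """File name used for the cached grid, following the `grid-{code}.npz` scheme."""
    kind, resolution = parse_grid_code(grid)
    return f"grid-{kind}{resolution}.npz"


def octahedral_num_points(resolution: int) -> int:
    """Total number of points of an octahedral grid (assumption: 4 * i + 16 points on line i)."""
    per_hemisphere = sum(4 * i + 16 for i in range(1, resolution + 1))
    return 2 * per_hemisphere


kind, resolution = parse_grid_code("O48")
num_points = octahedral_num_points(resolution)
```

Under that assumption, `o48` yields 10944 nodes, which gives a quick sanity check on the number of coordinates read from the downloaded file.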
54 changes: 29 additions & 25 deletions graphs/src/anemoi/graphs/nodes/builders/from_file.py
@@ -78,7 +78,7 @@ class TextNodes(BaseNodeBuilder):

def __init__(self, dataset, name: str, idx_lon: int = 0, idx_lat: int = 1) -> None:
LOGGER.info("Reading the dataset from %s.", dataset)
self.dataset = np.loadtxt(dataset)
self.dataset = dataset
self.idx_lon = idx_lon
self.idx_lat = idx_lat
super().__init__(name)
@@ -91,20 +91,21 @@ def get_coordinates(self) -> torch.Tensor:
torch.Tensor of shape (num_nodes, 2)
A 2D tensor with the coordinates, in radians.
"""
return self.reshape_coords(self.dataset[self.idx_lat, :], self.dataset[self.idx_lon, :])
dataset = np.loadtxt(self.dataset)
return self.reshape_coords(dataset[self.idx_lat, :], dataset[self.idx_lon, :])


class NPZFileNodes(BaseNodeBuilder):
"""Nodes from NPZ defined grids.

Attributes
----------
resolution : str
The resolution of the grid.
grid_definition_path : str
Path to the folder containing the grid definition files.
grid_definition : dict[str, np.ndarray]
The grid definition.
npz_file : str
Path to the file.
lat_key : str
Name of the key of the latitude arrays.
lon_key : str
Name of the key of the longitude arrays.

Methods
-------
@@ -118,21 +119,25 @@ class NPZFileNodes(BaseNodeBuilder):
Update the graph with new nodes and attributes.
"""

def __init__(self, resolution: str, grid_definition_path: str, name: str) -> None:
def __init__(self, npz_file: str, name: str, lat_key: str = "latitudes", lon_key: str = "longitudes") -> None:
"""Initialize the NPZFileNodes builder.

The builder suppose the grids are stored in files with the name `grid-{resolution}.npz`.

Parameters
----------
resolution : str
The resolution of the grid.
grid_definition_path : str
Path to the folder containing the grid definition files.
npz_file : str
The path to the file.
name : str
Name of the nodes to be added.
lat_key : str, optional
Name of the key of the latitude arrays. Defaults to "latitudes".
lon_key : str, optional
Name of the key of the longitude arrays. Defaults to "longitudes".
"""
self.resolution = resolution
self.grid_definition_path = grid_definition_path
self.grid_definition = np.load(Path(self.grid_definition_path) / f"grid-{self.resolution}.npz")
self.npz_file = Path(npz_file)
self.lat_key = lat_key
self.lon_key = lon_key
super().__init__(name)

def get_coordinates(self) -> torch.Tensor:
@@ -143,7 +148,9 @@ def get_coordinates(self) -> torch.Tensor:
torch.Tensor of shape (num_nodes, 2)
A 2D tensor with the coordinates, in radians.
"""
coords = self.reshape_coords(self.grid_definition["latitudes"], self.grid_definition["longitudes"])
assert self.npz_file.exists(), f"{self.__class__.__name__}.npz_file does not exist: {self.npz_file}"
grid_data = np.load(self.npz_file)
coords = self.reshape_coords(grid_data[self.lat_key], grid_data[self.lon_key])
return coords


@@ -152,17 +159,17 @@ class LimitedAreaNPZFileNodes(NPZFileNodes):

def __init__(
self,
resolution: str,
grid_definition_path: str,
npz_file: str,
reference_node_name: str,
name: str,
lat_key: str = "latitudes",
lon_key: str = "longitudes",
mask_attr_name: str | None = None,
margin_radius_km: float = 100.0,
) -> None:

self.area_mask_builder = KNNAreaMaskBuilder(reference_node_name, margin_radius_km, mask_attr_name)

super().__init__(resolution, grid_definition_path, name)
super().__init__(npz_file, name, lat_key, lon_key)

def register_nodes(self, graph: HeteroData) -> None:
self.area_mask_builder.fit(graph)
@@ -177,10 +184,7 @@ def get_coordinates(self) -> np.ndarray:
)
area_mask = self.area_mask_builder.get_mask(coords)

LOGGER.info(
"Dropping %d nodes from the processor mesh.",
len(area_mask) - area_mask.sum(),
)
LOGGER.info("Dropping %d nodes from the processor mesh.", len(area_mask) - area_mask.sum())
coords = coords[area_mask]

return coords
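The changes in this file move file reading out of `__init__` and into `get_coordinates`. A minimal sketch of that deferred-loading pattern follows, using only the standard library. `LazyTextCoordinates` is a hypothetical stand-in and not the anemoi class, and the explicit `FileNotFoundError` is a suggested alternative to the `assert`, since assertions are skipped under `python -O`.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


class LazyTextCoordinates:
    """Hypothetical stand-in for a file-backed node builder (not part of anemoi)."""

    def __init__(self, path: str | Path) -> None:
        # Only remember the location; no file access happens here.
        self.path = Path(path)

    def get_coordinates(self) -> list[tuple[float, float]]:
        # An explicit exception is still raised when asserts are disabled.
        if not self.path.exists():
            raise FileNotFoundError(f"Coordinate file does not exist: {self.path}")
        rows = (line.split() for line in self.path.read_text().splitlines() if line.strip())
        return [(float(lat), float(lon)) for lat, lon in rows]


grid_file = Path(tempfile.mkdtemp()) / "coords.txt"
builder = LazyTextCoordinates(grid_file)  # succeeds although the file is missing
grid_file.write_text("45.0 10.0\n-45.0 190.0\n")
coords = builder.get_coordinates()  # the file is read only at this point
```

Deferring the read means a builder can be constructed before its file exists, which is what lets a subclass download the file after calling `super().__init__`.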
90 changes: 90 additions & 0 deletions graphs/src/anemoi/graphs/nodes/builders/from_reduced_gaussian.py
@@ -0,0 +1,90 @@
# (C) Copyright 2024 Anemoi contributors.
#
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
#
# In applying this licence, ECMWF does not waive the privileges and immunities
# granted to it by virtue of its status as an intergovernmental organisation
# nor does it submit to any jurisdiction.

from __future__ import annotations

import logging
import os
import re
import tempfile
from functools import cached_property

import requests

from anemoi.graphs.nodes.builders.from_file import NPZFileNodes
from anemoi.utils.config import load_config

LOGGER = logging.getLogger(__name__)


class ReducedGaussianGridNodes(NPZFileNodes):
    """Nodes from a reduced Gaussian grid.

    A Gaussian grid is a latitude/longitude grid. The spacing of the latitudes is not regular. However, the spacing of
    the lines of latitude is symmetrical about the Equator. A grid is usually referred to by its 'number' N/O, which
    is the number of lines of latitude between a Pole and the Equator. The N code refers to the original ECMWF reduced
    Gaussian grid, whereas the code O refers to the octahedral ECMWF reduced Gaussian grid.

    Attributes
    ----------
    grid : str
        The reduced Gaussian grid, in the format {n,N,o,O}XXX with XXX latitude lines between the pole and
        equator.

    Methods
    -------
    get_coordinates()
        Get the lat-lon coordinates of the nodes.
    register_nodes(graph, name)
        Register the nodes in the graph.
    register_attributes(graph, name, config)
        Register the attributes in the nodes of the graph specified.
    update_graph(graph, name, attrs_config)
        Update the graph with new nodes and attributes.
    """

    def __init__(self, grid: str, name: str) -> None:
        """Initialize the ReducedGaussianGridNodes builder."""
        assert re.fullmatch(
            r"^[oOnN]\d+$", grid
        ), f"{self.__class__.__name__}.grid must match the format [n|N|o|O]XXX with XXX latitude lines between the pole and equator."
        self.file_name = f"grid-{grid.lower()}.npz"
        super().__init__(self.local_dir + "/" + self.file_name, name, lat_key="latitudes", lon_key="longitudes")
        if not self.is_downloaded():
            LOGGER.info("File %s not found locally. Downloading...", self.file_name)
            self.download_file()

    @cached_property
    def local_dir(self) -> str:
        tmp_dir = tempfile.gettempdir().rstrip("/")
        grids_dir = tmp_dir + "/.anemoi-gaussian_grids"
        os.makedirs(grids_dir, exist_ok=True)
        return grids_dir

    @cached_property
    def download_url(self) -> str:
        config = load_config(defaults={"graphs": {"named": {}}})
        return config["graphs"]["named"]["grids"].rstrip("/")

    def is_downloaded(self) -> bool:
        """Checks if the grid file is already downloaded."""
        return os.path.exists(self.npz_file)

    def download_file(self) -> None:
        """Downloads the grid file and saves it in the local cache directory."""
        url = self.download_url + "/" + self.file_name

        LOGGER.info(f"Downloading {self.file_name} grid from: {url}")
        response = requests.get(url)
        if response.status_code == 200:
            with open(self.npz_file, "wb") as f:
                f.write(response.content)
            LOGGER.info(f"File downloaded and saved to {self.local_dir}/.")
        else:
            raise FileNotFoundError(f"Failed to download file from {url}. HTTP status code: {response.status_code}")
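One possible hardening of the download-and-cache step, sketched under assumptions: `ensure_cached` and its `fetch` callback are hypothetical names, and the callback is injected so the logic can be exercised without network access. Writing to a temporary file and renaming it keeps an interrupted download from leaving a truncated `.npz` that `is_downloaded()` would later accept.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cached(file_name: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return the cached file, calling `fetch` only when it is not present yet."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / file_name
    if target.exists():
        return target

    payload = fetch(file_name)
    # Write next to the target, then rename, so a partial file never looks complete.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(payload)
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return target


# Stand-in for the real download; it records which files were requested.
requested = []


def fake_fetch(name: str) -> bytes:
    requested.append(name)
    return b"grid-bytes"


cache_root = Path(tempfile.mkdtemp()) / ".anemoi-gaussian_grids"
first = ensure_cached("grid-o48.npz", cache_root, fake_fetch)
second = ensure_cached("grid-o48.npz", cache_root, fake_fetch)  # served from cache
```

In the builder, the callback would presumably wrap `requests.get(url, timeout=...)` followed by `response.raise_for_status()`, so that a stalled connection cannot block graph creation indefinitely.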
4 changes: 2 additions & 2 deletions graphs/tests/nodes/test_arrays.py
@@ -11,7 +11,7 @@
import torch
from torch_geometric.data import HeteroData

from anemoi.graphs.nodes.attributes import AreaWeights
from anemoi.graphs.nodes.attributes import SphericalAreaWeights
from anemoi.graphs.nodes.attributes import UniformWeights
from anemoi.graphs.nodes.builders.from_vectors import LatLonNodes

@@ -51,7 +51,7 @@ def test_register_nodes():
assert graph["test_nodes"].node_type == "LatLonNodes"


@pytest.mark.parametrize("attr_class", [UniformWeights, AreaWeights])
@pytest.mark.parametrize("attr_class", [UniformWeights, SphericalAreaWeights])
def test_register_attributes(graph_with_nodes: HeteroData, attr_class):
"""Test LatLonNodes register correctly the weights."""
node_builder = LatLonNodes(latitudes=lats, longitudes=lons, name="test_nodes")
4 changes: 2 additions & 2 deletions graphs/tests/nodes/test_cutout_nodes.py
@@ -12,7 +12,7 @@
from omegaconf import OmegaConf
from torch_geometric.data import HeteroData

from anemoi.graphs.nodes.attributes import AreaWeights
from anemoi.graphs.nodes.attributes import SphericalAreaWeights
from anemoi.graphs.nodes.attributes import UniformWeights
from anemoi.graphs.nodes.builders import from_file

@@ -44,7 +44,7 @@ def test_register_nodes(mocker, mock_zarr_dataset_cutout):
assert graph["test_nodes"].node_type == "ZarrDatasetNodes"


@pytest.mark.parametrize("attr_class", [UniformWeights, AreaWeights])
@pytest.mark.parametrize("attr_class", [UniformWeights, SphericalAreaWeights])
def test_register_attributes(mocker, mock_zarr_dataset_cutout, graph_with_nodes: HeteroData, attr_class):
"""Test ZarrDatasetNodes register correctly the weights with cutout operation."""
mocker.patch.object(from_file, "open_dataset", return_value=mock_zarr_dataset_cutout)
4 changes: 2 additions & 2 deletions graphs/tests/nodes/test_healpix.py
@@ -11,7 +11,7 @@
import torch
from torch_geometric.data import HeteroData

from anemoi.graphs.nodes.attributes import AreaWeights
from anemoi.graphs.nodes.attributes import SphericalAreaWeights
from anemoi.graphs.nodes.attributes import UniformWeights
from anemoi.graphs.nodes.builders.base import BaseNodeBuilder
from anemoi.graphs.nodes.builders.from_healpix import HEALPixNodes
@@ -46,7 +46,7 @@ def test_register_nodes(resolution: int):
assert graph["test_nodes"].node_type == "HEALPixNodes"


@pytest.mark.parametrize("attr_class", [UniformWeights, AreaWeights])
@pytest.mark.parametrize("attr_class", [UniformWeights, SphericalAreaWeights])
@pytest.mark.parametrize("resolution", [2, 5, 7])
def test_register_attributes(graph_with_nodes: HeteroData, attr_class, resolution: int):
"""Test HEALPixNodes register correctly the weights."""
10 changes: 5 additions & 5 deletions graphs/tests/nodes/test_node_attributes.py
@@ -11,7 +11,7 @@
import torch
from torch_geometric.data import HeteroData

from anemoi.graphs.nodes.attributes import AreaWeights
from anemoi.graphs.nodes.attributes import SphericalAreaWeights
from anemoi.graphs.nodes.attributes import UniformWeights


@@ -35,8 +35,8 @@ def test_uniform_weights_fail(graph_with_nodes: HeteroData, norm: str):


def test_area_weights(graph_with_nodes: HeteroData):
"""Test attribute builder for AreaWeights."""
node_attr_builder = AreaWeights()
"""Test attribute builder for SphericalAreaWeights."""
node_attr_builder = SphericalAreaWeights()
weights = node_attr_builder.compute(graph_with_nodes, "test_nodes")

assert weights is not None
@@ -46,7 +46,7 @@ def test_area_weights(graph_with_nodes: HeteroData):

@pytest.mark.parametrize("radius", [-1.0, "hello", None])
def test_area_weights_fail(graph_with_nodes: HeteroData, radius: float):
"""Test attribute builder for AreaWeights with invalid radius."""
"""Test attribute builder for SphericalAreaWeights with invalid radius."""
with pytest.raises(ValueError):
node_attr_builder = AreaWeights(radius=radius)
node_attr_builder = SphericalAreaWeights(radius=radius)
node_attr_builder.compute(graph_with_nodes, "test_nodes")