Smarter weighted aggregation #5082

@schlunma

✨ Feature Request

It would be nice to use cubes or cell_measures as weights in weighted aggregation, e.g.,

cube.collapsed(['latitude', 'longitude'], iris.analysis.SUM, weights=area_cube)

or

cube.collapsed(['latitude', 'longitude'], iris.analysis.SUM, weights='cell area')

which would automatically handle units (e.g., multiply the result's units by m2 for area-weighted sums).

Motivation

Currently, weights for aggregation need to be arrays. Because plain arrays carry no units, the units of results such as area-weighted sums are not handled correctly, and fixing them requires extra code.
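To make the bookkeeping concrete, here is a minimal NumPy-only sketch of an area-weighted sum. The field values, the unit strings, and all variable names are made up for illustration; units are tracked as plain strings here only to avoid depending on iris or cf_units.

```python
import numpy as np

# Hypothetical 2x2 field, e.g. a flux in W m-2, with units tracked by hand.
flux = np.array([[1.0, 2.0], [3.0, 4.0]])
flux_units = "W m-2"

# Cell areas passed as a bare array: the information "m2" is not attached.
cell_area = np.array([[10.0, 10.0], [20.0, 20.0]])
cell_area_units = "m2"

# Area-weighted sum: sum(flux * area).
weighted_sum = np.sum(flux * cell_area)

# The extra step that has to be done by hand: combine the units.
result_units = f"{flux_units} {cell_area_units}"
```

If the weights carried their own units, that last step could happen inside the aggregation instead of in user code.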

Additional context

If this is something that's interesting, I can open a pull request. I already have a working prototype in this branch. The main feature is a Weights class that inherits from np.ndarray; thus, changes in the code base are minimal.
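As a rough illustration of why subclassing np.ndarray keeps the changes small, the sketch below shows the general pattern in isolation. `UnitArray` is a hypothetical stand-in, not the prototype class, and it stores units as a plain string instead of a `cf_units.Unit` so that it only needs NumPy.

```python
import numpy as np


class UnitArray(np.ndarray):
    """Hypothetical ndarray subclass that carries a ``units`` attribute."""

    def __new__(cls, values, units="1"):
        # View casting: reuse the data buffer, only change the Python type.
        obj = np.asarray(values).view(cls)
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        # Called on explicit construction (obj is None), view casting and
        # new-from-template (e.g. slicing); inherit units from the parent.
        if obj is None:
            return
        self.units = getattr(obj, "units", "1")


areas = UnitArray([[1.0, 2.0], [3.0, 4.0]], units="m2")
row = areas[0]                                # slicing keeps the subclass
plain_view = np.arange(3).view(UnitArray)     # no parent units: falls back to "1"
```

Because such an object still behaves like an array, code that expects an array can accept it unchanged, while unit-aware code can read `units` from it.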

`Weights` class
import numpy as np
from cf_units import Unit

import iris.util


class Weights(np.ndarray):
    """Class for handling weights for weighted aggregation.
    Since it inherits from :numpy:`ndarray`, all common methods and properties
    of this are available.
    Details on subclassing :numpy:`ndarray` are given here:
    https://numpy.org/doc/stable/user/basics.subclassing.html
    """

    def __new__(cls, weights, cube):
        """Create class instance.
        Args:
        * weights (Weights, numpy.ndarray, Cube, string):
            If given as :class:`iris.analysis.Weights`, simply use this. If
            given as a :class:`numpy.ndarray`, use this directly (assume units
            of `1`). If given as a :class:`iris.cube.Cube`, use its data and
            units. If given as a :obj:`str`, assume this is the name of a cell
            measure from ``cube`` and use its data and units.
        * cube (Cube):
            Input cube for aggregation. If weights is given as :obj:`str`, try
            to extract a cell measure with the corresponding name from this
            cube. Otherwise, this argument is ignored.
        """
        # Weights is Weights
        if isinstance(weights, cls):
            obj = weights

        # Weights is a cube
        # Note: to avoid circular imports of Cube we use duck typing using the
        # "hasattr" syntax here
        elif hasattr(weights, "add_aux_coord"):
            obj = np.asarray(weights.data).view(cls)
            obj.units = weights.units

        # Weights is a string
        elif isinstance(weights, str):
            cell_measure = cube.cell_measure(weights)
            if cell_measure.shape != cube.shape:
                arr = iris.util.broadcast_to_shape(
                    cell_measure.data,  # fails for dask arrays
                    cube.shape,
                    cube.cell_measure_dims(cell_measure),
                )
            else:
                arr = cell_measure.data
            obj = np.asarray(arr).view(cls)
            obj.units = cell_measure.units

        # Remaining types (e.g., np.ndarray): try to convert to ndarray.
        else:
            obj = np.asarray(weights).view(cls)
            obj.units = Unit("1")

        return obj

    def __array_finalize__(self, obj):
        """See https://numpy.org/doc/stable/user/basics.subclassing.html."""
        if obj is None:
            return
        self.units = getattr(obj, "units", Unit("1"))

    @classmethod
    def update_kwargs(cls, kwargs, cube):
        """Update ``weights`` keyword argument in-place.
        Args:
        * kwargs (dict):
            Keyword arguments that will be updated in-place if a ``weights``
            keyword is present.
        * cube (Cube):
            Input cube for aggregation. If weights is given as :obj:`str`, try
            to extract a cell measure with the corresponding name from this
            cube. Otherwise, this argument is ignored.
        """
        if "weights" not in kwargs:
            return
        kwargs["weights"] = cls(kwargs["weights"], cube)
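The string branch above broadcasts a cell measure to the full cube shape before using it as weights. The following NumPy-only sketch mimics that step; `broadcast_to_cube_shape` is a hypothetical helper written for this illustration and is not the implementation of `iris.util.broadcast_to_shape`.

```python
import numpy as np


def broadcast_to_cube_shape(array, shape, dim_map):
    """Broadcast ``array`` to ``shape``.

    ``dim_map[i]`` is the axis of the target shape that axis ``i`` of
    ``array`` corresponds to. Hypothetical helper for illustration only.
    """
    array = np.asarray(array)
    if array.ndim != len(dim_map):
        raise ValueError("dim_map needs one entry per array dimension")
    # Reorder the array axes so they appear in target order.
    order = sorted(range(array.ndim), key=lambda i: dim_map[i])
    array = array.transpose(order)
    # Use length-1 axes for every target dimension the array does not cover.
    expanded_shape = [1] * len(shape)
    for axis, dim in enumerate(sorted(dim_map)):
        expanded_shape[dim] = array.shape[axis]
    return np.broadcast_to(array.reshape(expanded_shape), shape)


# A (time, lat, lon) cube shape and a (lat, lon) cell-area array.
cube_shape = (4, 2, 3)
cell_area = np.arange(6.0).reshape(2, 3)
full = broadcast_to_cube_shape(cell_area, cube_shape, (1, 2))
```

Every time step of `full` holds the same cell areas, so the weights line up element by element with the cube data.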
