
My thoughts on coordinate #48

Closed
@martindurant

Sorry for getting distracted at the end of the geo-zarr meeting we just had (for those that were there). Here is a summary of what I was getting at.

(@rabernat, yes, I know this has been discussed many times over; apologies)

There are two principal parts to the coordinates problem:

  • coordinate transform
  • parsing/reading coordinate definitions

Coordinate transform

We need a mechanism within zarr/xarray to find (each of) the coordinates of a given array position and, conversely, the (fractional) array location of a given set of coordinates. Both directions should be vectorized operations.
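
To make the two directions concrete, here is a minimal sketch for the affine case using plain NumPy. The matrix `A`, offset `b`, and the function names `index_to_coord` / `coord_to_index` are made up for illustration; nothing here is an existing zarr or xarray API.

```python
import numpy as np

# Hypothetical 2-D affine transform: coord = A @ index + b
# (values and names are illustrative assumptions, not from any library)
A = np.array([[0.5, 0.0],
              [0.0, -0.5]])
b = np.array([10.0, 50.0])

def index_to_coord(idx):
    """Map an (N, 2) array of array positions to (N, 2) coordinates."""
    idx = np.asarray(idx, dtype=float)
    return idx @ A.T + b

def coord_to_index(coords):
    """Map (N, 2) coordinates to (possibly fractional) array positions."""
    coords = np.asarray(coords, dtype=float)
    # Solve A @ index = coord - b for every row at once
    return np.linalg.solve(A, (coords - b).T).T

positions = np.array([[0, 0], [2, 4], [3, 1]])
coords = index_to_coord(positions)           # all three points in one call
fractional = coord_to_index([[10.25, 49.0]])  # lands between array cells
```

Both functions take whole arrays of points, so neither direction needs a Python-level loop.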

Currently, xarray has good support for explicit coordinate value arrays via the netCDF model (and for "flexible" indexes, whose internals I don't understand well).

  • I suggest that this should be an extension point, with each kind of transform associated with its own internal representation (e.g., affine is usually a square matrix, explicit arrays are usually one- or two-dimensional arrays with sizes determined by the data)
  • on day 1, we want to support explicit values and affine (linear transform)
  • other transforms should be pluggable, and eventually include, for instance, the large number of earth curvature models built into grib
  • whether we should have a single affine matrix across all dimensions (lon, lat, time = f(x, y, z)), or if we should split dimensions (lon, lat = f1(x, y); time = f2(z)) is a decision to be taken early.
  • the coordinates interface must support slicing and might support units.
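
As a rough sketch of what such an extension point could look like, here is one shared interface with two day-1 implementations. All class and method names (`CoordinateTransform`, `Explicit1D`, `Affine`, `sliced`) are hypothetical; the explicit case assumes monotonically increasing 1-D values and uses linear interpolation for fractional positions, which is only one possible choice.

```python
from typing import Protocol
import numpy as np

class CoordinateTransform(Protocol):
    """Hypothetical interface that every pluggable transform would satisfy."""
    def forward(self, index): ...   # array position -> coordinate
    def inverse(self, coord): ...   # coordinate -> fractional array position

class Explicit1D:
    """Explicit coordinate values along one dimension (assumed increasing)."""
    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)
        self.positions = np.arange(self.values.size, dtype=float)
    def forward(self, index):
        return np.interp(index, self.positions, self.values)
    def inverse(self, coord):
        return np.interp(coord, self.values, self.positions)
    def __getitem__(self, key):
        # Slicing the data slices the stored values the same way
        return Explicit1D(self.values[key])

class Affine:
    """coord = matrix @ index + offset, for any number of dimensions."""
    def __init__(self, matrix, offset):
        self.matrix = np.asarray(matrix, dtype=float)
        self.offset = np.asarray(offset, dtype=float)
    def forward(self, index):
        return np.asarray(index, dtype=float) @ self.matrix.T + self.offset
    def inverse(self, coord):
        rhs = (np.asarray(coord, dtype=float) - self.offset).T
        return np.linalg.solve(self.matrix, rhs).T
    def sliced(self, starts, steps):
        # Slicing by start/step per dimension only changes offset and scale
        starts = np.asarray(starts, dtype=float)
        return Affine(self.matrix @ np.diag(steps),
                      self.offset + self.matrix @ starts)

# Split form: lon, lat = f1(x, y); time = f2(z)
spatial = Affine([[0.5, 0.0], [0.0, -0.5]], [10.0, 50.0])
temporal = Affine([[3600.0]], [0.0])

# Single form: one block-diagonal matrix across all three dimensions
m = np.zeros((3, 3))
m[:2, :2] = spatial.matrix
m[2:, 2:] = temporal.matrix
combined = Affine(m, np.concatenate([spatial.offset, temporal.offset]))

idx = np.array([[2.0, 4.0, 6.0]])
split_result = np.hstack([spatial.forward(idx[:, :2]),
                          temporal.forward(idx[:, 2:])])
window = spatial.sliced(starts=[2, 4], steps=[2, 1])
times = Explicit1D([0.0, 10.0, 30.0])
```

In this sketch the split and single forms give the same answer whenever the dimensions do not mix, so the early decision is mainly about representation and about whether cross-dimension terms should be expressible at all.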

Crucially, I advocate that the transform mechanism be independent of the data domain, so that we don't treat "lon/lat" as special. This is because zarr and xarray are general-purpose libraries, and we don't want to exclude microscopy, genetics and other fields with many users.

Coordinate definitions

In the meeting, a few specific (geo) coordinate definitions were mentioned:

  • gdal coefficients
  • tiff bounding box
  • CRS text/parameters

plus, of course, netCDF explicit arrays (with or without CF). I also mentioned astro WCS as a reference point (which supports explicit, affine, and various analytic forms for arbitrary dimensionality with no geo reference; interestingly, it also applies to fields of tables).

I would suggest that it is the job of geo-zarr to build the converters between these styles of definition and the transform's internal representation, such that you can round-trip coordinate information without losing accuracy.
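
As an illustration of such a converter, here is a minimal sketch for the gdal coefficients case, assuming the internal representation is a 3x3 homogeneous affine matrix acting on (column, row, 1). The function names and the example values are hypothetical; the coefficient order follows GDAL's six-element geotransform.

```python
import numpy as np

def gdal_to_matrix(gt):
    """Six GDAL geotransform coefficients -> 3x3 homogeneous affine matrix.

    GDAL defines: x = gt[0] + col * gt[1] + row * gt[2]
                  y = gt[3] + col * gt[4] + row * gt[5]
    """
    x0, dx, rx, y0, ry, dy = gt
    return np.array([[dx, rx, x0],
                     [ry, dy, y0],
                     [0.0, 0.0, 1.0]])

def matrix_to_gdal(m):
    """Inverse converter: 3x3 homogeneous matrix -> GDAL coefficient tuple."""
    return tuple(float(v) for v in
                 (m[0, 2], m[0, 0], m[0, 1], m[1, 2], m[1, 0], m[1, 1]))

# Made-up example: 60 m pixels, north-up, no rotation
gt = (440720.0, 60.0, 0.0, 3751320.0, 0.0, -60.0)
matrix = gdal_to_matrix(gt)
round_tripped = matrix_to_gdal(matrix)

# Coordinate of (col=10, row=20) through the internal representation
x, y, _ = matrix @ np.array([10.0, 20.0, 1.0])
```

Because the conversion only rearranges the six numbers, the round trip here is exact; definitions that need arithmetic to convert (e.g. a bounding box plus array shape) would need more care to avoid losing accuracy.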
