Implement tensor object #254

Open
sandorkertesz opened this issue Nov 9, 2023 · 0 comments
sandorkertesz commented Nov 9, 2023

A tensor object would represent multidimensional labelled data and provide coordinate-based data access and slicing.

The current scope is deliberately limited: the aim is only to replicate some xarray functionality.

Proposed features:

  • convert a fieldlist into a tensor using to_tensor()
  • users have to specify the metadata keys that the tensor is formed on
  • these will define the coords
  • at the moment the coords are extended with the following additional coords:
    • latitude, longitude for regular grids in lat and lon
    • x, y for other regular grids
    • values for irregular grids
  • the tensor can only be formed if all the fields have the same grid and there is exactly one field in the fieldlist for each metadata combination (no holes allowed)
  • no concept of a variable or dimension as in xarray
  • slicing methods: bracket, sel(), isel()
  • lat-lon access: latitudes, longitudes()
  • no computation methods
  • creating a new object with copy(data=my_data) (see the notebook example)
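
The "exactly one field per metadata combination" condition can be sketched as a completeness check over the coordinate grid. This is only an illustration in plain Python, not how earthkit-data implements it: `find_holes_and_duplicates` is a hypothetical helper, and each field is represented by a plain dict of metadata.

```python
from collections import Counter
from itertools import product


def find_holes_and_duplicates(fields, keys):
    """Check that every combination of coordinate values occurs exactly once.

    `fields` is a list of dicts standing in for field metadata (an assumption
    for this sketch); `keys` are the metadata keys the tensor is formed on.
    """
    # Unique values per key, in order of first appearance, become the coords.
    coords = {k: list(dict.fromkeys(f[k] for f in fields)) for k in keys}

    # Count how many fields fall on each point of the coordinate grid.
    counts = Counter(tuple(f[k] for k in keys) for f in fields)

    grid = list(product(*coords.values()))
    holes = [c for c in grid if counts[c] == 0]
    duplicates = [c for c in grid if counts[c] > 1]
    return coords, holes, duplicates


fields = [
    {"param": p, "level": lev}
    for p in ("t", "u", "v")
    for lev in (500, 850)
]
fields.pop()  # drop ("v", 850) to create a hole

coords, holes, duplicates = find_holes_and_duplicates(fields, ["param", "level"])
print(holes)       # [('v', 850)]
print(duplicates)  # []
```

A tensor would only be formed when both lists are empty; the same-grid requirement would need a separate check.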

Questions:

  • allow attaching attributes?
  • how to use it in an easy way in computations? E.g. computing the average along a given dimension
  • in a fieldlist, the equivalent of coords is called indices
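
On the computation question: as long as the tensor has no computation methods, one option is to drop to NumPy and reduce along the axis that matches the coordinate. The following is a minimal sketch, assuming only that the axis order follows the order of the coords; the random array stands in for `t.to_numpy()` with the shape used in this issue's example, and no earthkit-data call is involved.

```python
import numpy as np

# Coordinate names in axis order (taken from the example in this issue).
dims = ["param", "level", "latitude", "longitude"]

# Stand-in for t.to_numpy(); real data would come from the tensor.
rng = np.random.default_rng(0)
data = rng.random((3, 6, 7, 12))

# Average over all pressure levels by looking up the axis by name.
axis = dims.index("level")
level_mean = data.mean(axis=axis)
print(level_mean.shape)  # (3, 7, 12)

# keepdims=True retains a length-1 "level" axis, so the array stays 4-D.
level_mean_4d = data.mean(axis=axis, keepdims=True)
print(level_mean_4d.shape)  # (3, 1, 7, 12)
```

Whether such a reduced array could be passed back through copy(data=...) depends on how the coords would be updated, which this issue leaves open.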

# 3 params on 6 pressure levels
>>> from earthkit.data import from_source
>>> ds = from_source("file", "tuv_pl.grib")
>>> t = ds.to_tensor("param", "level")
>>> t.coords.keys()
dict_keys(['param', 'level', 'latitude', 'longitude'])
>>> t.coords
Coordinates:
  param        [str] t, u, v
  level        [int] 300, 400, 500, 700, 850, 1000
  latitude     [float64] 90.0, 60.0, 30.0, 0.0, -30.0, -60.0, -90.0
  longitude    [float64] 0.0, 30.0, 60.0, 90.0, 120.0, 150.0, 180.0, 210.0, 240.0 ,..., 330.0
>>> t.shape
(3, 6, 7, 12)
>>> t.to_numpy().shape
(3, 6, 7, 12)
# slicing: an integer index keeps its dimension with length 1
>>> r = t[1:3,0]
>>> r.shape
(2, 1, 7, 12)
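
Label-based selection in the style of sel() can be thought of as translating coordinate values into positions and then slicing by position. The sketch below is plain Python under that assumption; `labels_to_indices` is a hypothetical helper, not part of earthkit-data, and the actual sel() signature is not specified in this issue.

```python
def labels_to_indices(coords, **selection):
    """Translate coordinate labels into per-dimension index lists.

    Dimensions that are not mentioned in `selection` are kept in full.
    A single label is treated like a one-element list, so the dimension
    is kept with length 1, mirroring the shapes in the example above.
    """
    indices = {}
    for name, values in coords.items():
        wanted = selection.get(name)
        if wanted is None:
            indices[name] = list(range(len(values)))
            continue
        if not isinstance(wanted, (list, tuple)):
            wanted = [wanted]
        # list.index raises ValueError for a label that is not a coordinate.
        indices[name] = [values.index(w) for w in wanted]
    return indices


coords = {
    "param": ["t", "u", "v"],
    "level": [300, 400, 500, 700, 850, 1000],
}

idx = labels_to_indices(coords, param=["u", "v"], level=300)
print(idx)  # {'param': [1, 2], 'level': [0]}
print(tuple(len(i) for i in idx.values()))  # (2, 1)
```

The resulting positions correspond to the bracket slice `t[1:3,0]` shown above.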

For more details see the example: https://earthkit-data.readthedocs.io/en/feature-tensor/examples/grib_cube.html

@sandorkertesz sandorkertesz added the enhancement New feature or request label Nov 9, 2023
@sandorkertesz sandorkertesz self-assigned this Nov 9, 2023