Lingering memory connections when extracting underlying np.arrays from datasets #8728

Closed
@ks905383

What is your issue?

I know that, in general, ds2 = ds makes both names refer to the same object in memory, so changes made through one are also visible through the other.
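For reference, a minimal sketch of that baseline behavior. The variable name `t` and the values are made up for illustration, and the sketch assumes a plain NumPy-backed data variable:

```python
import numpy as np
import xarray as xr

# Hypothetical toy dataset; "t" is an illustrative name only
ds = xr.Dataset({"t": (["x"], np.array([1.0, 2.0, 3.0]))})

# Plain assignment binds a second name to the very same object
ds2 = ds
same_object = ds2 is ds

# A deep copy gets its own data buffer, so in-place edits stay local
ds3 = ds.copy(deep=True)
ds3["t"][0] = 99.0

print(same_object)
print(float(ds["t"][0]), float(ds3["t"][0]))
```

Here only `ds3` should see the new value, since `copy(deep=True)` duplicates the underlying array.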

However, I generally assume that certain operations should break this connection, for example:

  • extracting the underlying np.array from a dataset (changing its type and destroying a lot of the xarray-specific information: index, dimensions, etc.)
  • passing the underlying np.array into a new dataset

In other words, I would expect that using ds['var'].values would be similar to copy.deepcopy(ds['var'].values).

Here's an example that illustrates how in these cases, the objects are still linked in memory:

(apologies for the somewhat hokey example)

import xarray as xr
import numpy as np

# Create a dataset
ds = xr.Dataset(coords = {'lon':(['lon'],np.array([178.2,179.2,-179.8, -178.8,-177.8,-176.8]))})
print('\nds: ')
print(ds)

# Create a new dataset that uses the values of the first dataset
ds2 = xr.Dataset({'lon1':(['lon'],ds.lon.values)},
                  coords = {'lon':(['lon'],ds.lon.values)})
print('\nds2: ')
print(ds2)

# Change ds2's 'lon1' variable 
ds2['lon1'][ds2['lon1']<0] = 360 + ds2['lon1'][ds2['lon1']<0]

# `ds2` is changed as expected
print('\nds2 (should be modified): ')
print(ds2)

# `ds` is changed, which is *not* expected
print('\nds (should not be modified): ')
print(ds)
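One way to check whether two arrays are still linked, without printing whole datasets, is `np.shares_memory`. The sketch below is not part of the original report. It uses a hypothetical data variable named `temp` rather than an index coordinate, on the assumption that a NumPy-backed data variable hands back its underlying buffer from `.values` (index coordinates go through pandas and may behave differently across versions):

```python
import numpy as np
import xarray as xr

# Hypothetical NumPy-backed data variable, for illustration only
ds = xr.Dataset({"temp": (["x"], np.array([1.0, 2.0, 3.0]))})

view = ds["temp"].values                # assumed to be the underlying buffer
independent = ds["temp"].values.copy()  # explicit NumPy copy

shares_view = np.shares_memory(view, ds["temp"].values)
shares_copy = np.shares_memory(independent, ds["temp"].values)

view[0] = -1.0         # writes through to ds
independent[1] = -2.0  # leaves ds untouched

print(shares_view, shares_copy)
print(ds["temp"].values)
```

Under these assumptions, the first check reports shared memory and the second does not, which matches the behavior seen in the example above.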

The question is: am I right (from a UX perspective) to expect these kinds of operations to disconnect the objects in memory? If so, I might try to update the docs to be a bit clearer on this. Alternatively, if these kinds of operations should disconnect the objects in memory, maybe it would be better to have .values also call .copy(deep=True).values.
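In the meantime, a possible workaround is to copy explicitly before building the second dataset. This is a sketch under the same assumptions as the example above, shortened to a single hypothetical data variable `lon1`. Either `.values.copy()` or `.copy(deep=True).values` should give an independent array:

```python
import numpy as np
import xarray as xr

# Shortened, hypothetical version of the dataset from the report
ds = xr.Dataset({"lon1": (["lon"], np.array([178.2, -179.8, -176.8]))})

# Copy the array explicitly so ds2 owns its own buffer
ds2 = xr.Dataset({"lon1": (["lon"], ds["lon1"].values.copy())})

# Same in-place wrap to 0..360 as in the report
ds2["lon1"][ds2["lon1"] < 0] = 360 + ds2["lon1"][ds2["lon1"] < 0]

still_linked = np.shares_memory(ds["lon1"].values, ds2["lon1"].values)
print(still_linked)
print(ds["lon1"].values)
print(ds2["lon1"].values)
```

With the explicit copy, only `ds2` should hold the wrapped longitudes, and `ds` keeps its negative values.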

Appreciate y'all's thoughts on this!
