Update docs on view / copies #8744

ks905383 · 2024-02-13T16:14:40Z

- Add reference to numpy docs on view / copies in the corresponding section of the xarray docs, to help clarify Lingering memory connections when extracting underlying np.arrays from datasets #8728.
- Add note that other xarray operations also return views rather than copies in the Copies vs. Views section of the docs.
- Add note that da.values() returns a view in the header for da.values().

- Add reference to numpy docs on view / copies in the corresponding section of the xarray docs, to help clarify pydata#8728.
- Add note that `da.values()` returns a view in the header for `da.values()`.

welcome · 2024-02-13T16:14:43Z

Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
If you have questions, some answers may be found in our contributing guidelines.

max-sixty · 2024-02-13T18:37:28Z

xarray/core/dataarray.py

    def values(self) -> np.ndarray:
        """
-        The array's data as a numpy.ndarray.
+        A view onto the DataArray's underlying data as a numpy.ndarray.


(I'm no expert here.)

Is this correct? I would have thought that it's returning the ndarray. And then that can get used as a view in many situations, consistent with numpy's rules.

If my sense is correct, we could just comment that it's not copied, and so mutations to the array will also affect the dataarray?

Thanks for the clarification! How's this?

"The array's underlying numpy.ndarray. Note that this array is not copied; operations on it follow numpy's rules of what generates a view vs. a copy, and changes to this array will be reflected in the DataArray as well."

Perfect I think!

(Though to confirm: I'm not confident that the initial wording is actually inaccurate, so it's fine to wait for others to comment if we prefer.)
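The behaviour described in the proposed wording can be checked with a short sketch. It assumes a numpy-backed DataArray (not dask or another duck array) and reads `values` as a property; the variable names are illustrative only.

```python
import numpy as np
import xarray as xr

# Assumption: the DataArray wraps a plain numpy array.
da = xr.DataArray(np.arange(5), dims="x")

vals = da.values         # the underlying array, not a copy
vals[0] = 99             # in-place write through the returned array
print(da[0].item())      # the DataArray sees the change

view = vals[1:3]         # basic slicing follows numpy's rules: a view
view[:] = -1             # writes through to the DataArray as well
print(da.values)

picked = vals[[3, 4]]    # integer-array indexing: a copy
picked[:] = 0            # leaves the DataArray untouched
print(da.values)

safe = da.values.copy()  # explicit copy, decoupled from the DataArray
safe[0] = 0
print(da[0].item())
```

Calling `.copy()` on the returned array is the simplest way to avoid the lingering memory connection when the extracted data needs to be modified independently.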

what we're doing is roughly

return np.asarray(self._data)

(with slightly different behavior for 0d arrays with datetime64 or timedelta64 dtype)

This means that "The array's data converted to numpy.ndarray." would be a more accurate summary, but we could definitely emphasize the forwarding towards numpy.asarray / numpy.array a bit more (the current "If the array's data is not a numpy.ndarray this will attempt to convert it naively using np.array()" is wrong: numpy.asarray is always called).

We already have a note about non-castable arrays, so I'd probably append a warning that numpy data is not copied (I don't think we can or should cover all the edge cases of numpy.asarray here).
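A rough sketch of the "always convert" behaviour described in this comment. `values_sketch` is a hypothetical stand-in, not xarray's actual implementation, and it ignores the 0d datetime64 / timedelta64 special case mentioned above.

```python
import numpy as np


def values_sketch(data):
    """Hypothetical stand-in: route every input through np.asarray."""
    return np.asarray(data)


arr = np.arange(3)
same = values_sketch(arr)              # already an ndarray: handed back without copying
converted = values_sketch([1, 2, 3])   # not an ndarray: converted to a new array

print(np.shares_memory(same, arr))
print(type(converted).__name__)
```

The conversion function runs in both cases; for data that is already a numpy array it simply has nothing to convert, which is why no copy is made.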

Great, thanks.

So let's merge after this is done...

How's this?
"The array's data converted to numpy.ndarray. Note that this array is not copied; operations on it follow numpy's rules of what generates a view vs. a copy, and changes to this array may be reflected in the DataArray as well."

If there are two line breaks between the first and second sentences, I don't have any objections (the first line is supposed to be a brief summary). Just make sure to coordinate it with the text that's already there:

If the array's data is not a numpy.ndarray this will attempt to convert it naively using np.array(), which will raise an error if the array type does not support coercion like this (e.g. cupy).
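The coercion failure mentioned in the existing note can be imitated without cupy. The class below is hypothetical: it only mimics an array type that refuses implicit conversion by raising from `__array__`, and the error message is made up for illustration.

```python
import numpy as np


class NoImplicitConversion:
    """Hypothetical array-like that refuses implicit conversion to numpy."""

    def __array__(self, dtype=None, copy=None):
        raise TypeError("implicit conversion to a numpy array is not allowed")


try:
    np.asarray(NoImplicitConversion())
except TypeError as err:
    outcome = f"conversion refused: {err}"
else:
    outcome = "converted"

print(outcome)
```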

Sounds good - from your comment above, should "convert it naively using np.array()" just be changed to "convert it naively using np.asarray()" then? Or should that clause just be struck?

I'm not sure about the difference between asarray and array (I think it's different defaults), but in any case the important part is that the function is always applied to the data, not just when it's not already a numpy array.
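On the difference in defaults: as far as the numpy documentation describes it, `np.array` copies its input by default, while `np.asarray` avoids a copy when the input is already an ndarray of a compatible dtype. A small sketch with illustrative variable names:

```python
import numpy as np

a = np.arange(4)

via_array = np.array(a)      # np.array copies by default
via_asarray = np.asarray(a)  # np.asarray reuses the existing memory here

print(np.shares_memory(a, via_array))
print(np.shares_memory(a, via_asarray))

# A dtype change forces np.asarray to allocate new memory as well.
as_float = np.asarray(a, dtype=np.float64)
print(np.shares_memory(a, as_float))
```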

Ohhh okay - makes sense. I think the tweaks in the latest push should address that

ks905383 · 2024-03-11T17:29:18Z

I think the failing check is something that came from the latest main branch update, so once that gets resolved, I'll merge this barring any other comments?

keewis

thanks, this looks good to me now.

The failing tests are flaky (not sure why only on python=3.11, though); a rerun should fix that. I'll try once the MacOS run has completed.

ks905383 · 2024-03-25T20:17:13Z

does anything else need to happen before it's cleared for merging btw?

welcome · 2024-03-25T20:35:22Z

Congratulations on completing your first pull request! Welcome to Xarray! We are proud of you, and hope to see you again!

* main: (26 commits)
  [pre-commit.ci] pre-commit autoupdate (pydata#8900)
  Bump the actions group with 1 update (pydata#8896)
  New empty whatsnew entry (pydata#8899)
  Update reference to 'Weighted quantile estimators' (pydata#8898)
  2024.03.0: Add whats-new (pydata#8891)
  Add typing to test_groupby.py (pydata#8890)
  Avoid in-place multiplication of a large value to an array with small integer dtype (pydata#8867)
  Check for aligned chunks when writing to existing variables (pydata#8459)
  Add dt.date to plottable types (pydata#8873)
  Optimize writes to existing Zarr stores. (pydata#8875)
  Allow multidimensional variable with same name as dim when constructing dataset via coords (pydata#8886)
  Don't allow overwriting indexes with region writes (pydata#8877)
  Migrate datatree.py module into xarray.core. (pydata#8789)
  warn and return bytes undecoded in case of UnicodeDecodeError in h5netcdf-backend (pydata#8874)
  groupby: Dispatch quantile to flox. (pydata#8720)
  Opt out of auto creating index variables (pydata#8711)
  Update docs on view / copies (pydata#8744)
  Handle .oindex and .vindex for the PandasMultiIndexingAdapter and PandasIndexingAdapter (pydata#8869)
  numpy 2.0 copy-keyword and trapz vs trapezoid (pydata#8865)
  upstream-dev CI: Fix interp and cumtrapz (pydata#8861)
  ...

Clarify pydata#8728 in docs

8a7016e

max-sixty reviewed Feb 13, 2024

ks905383 and others added 5 commits February 26, 2024 15:14

tweaks to the header

3476475

[pre-commit.ci] auto fixes from pre-commit.com hooks

81adbbe

for more information, see https://pre-commit.ci

Merge branch 'main' into improve_view_docs

b6dfe3b

Merge branch 'main' into improve_view_docs

caccc02

Merge branch 'main' into improve_view_docs

95187e7

keewis reviewed Mar 7, 2024

ks905383 added 2 commits March 11, 2024 13:03

flip order of new .to_values() doc header paragraphs

327228b

Merge branch 'main' into improve_view_docs

8112b84

keewis approved these changes Mar 11, 2024

dcherian changed the title ~~Clarify pydata#8728 in docs~~ Update docs on view / copies Mar 25, 2024

dcherian merged commit 2f34895 into pydata:main Mar 25, 2024

kmuehlbauer mentioned this pull request Jun 5, 2024

Lingering memory connections when extracting underlying np.arrays from datasets #8728

Closed
