Check for aligned chunks when writing to existing variables #8459
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
@max-sixty thanks so much for tackling this tricky issue. Can you help me understand whether the following edge case is covered here? For our existing "safe chunks" check, we have a special case for the last chunk, which does not have to be evenly divisible by the zarr chunk size: xarray/xarray/backends/zarr.py, lines 185 to 189 in bb8511e
(i.e. we just skip the check for the last chunk; With Is handling these sorts of scenarios in scope for this PR?
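For readers following along, here is a minimal sketch of the kind of "safe chunks" check described above, where every dask chunk along a dimension except the last must be a multiple of the zarr chunk size. This is only an illustration under that assumption, not the code at the referenced lines in `zarr.py`; the function name `chunks_are_aligned` and its parameters are hypothetical.

```python
def chunks_are_aligned(dask_chunks, zarr_chunk):
    """Hypothetical helper: check dask chunk sizes along one dimension.

    dask_chunks: tuple of ints, the dask chunk sizes along the dimension.
    zarr_chunk: int, the zarr chunk size along the same dimension.
    """
    if zarr_chunk <= 0:
        raise ValueError("zarr_chunk must be a positive integer")
    # The final dask chunk is deliberately skipped: it may be a partial
    # chunk and is not required to be a multiple of the zarr chunk size.
    return all(size % zarr_chunk == 0 for size in dask_chunks[:-1])


# Every chunk but the last is a multiple of 2, so the check passes,
# even though the last chunk (3) is not a multiple of 2.
print(chunks_are_aligned((4, 4, 3), 2))

# The first chunk (3) is not a multiple of 2, so the check fails.
print(chunks_are_aligned((3, 4, 4), 2))
```

Because the last chunk is never inspected, a check of this shape only looks at how the chunks relate to each other, not at where they land in an existing array.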
Yes! Sorry for the delay. I didn't realize that we could skip this guardrail with
This is a great observation but I think we can merge this PR for two reasons:
Of course, it'd be nice to catch this case in the future.
I think this also addresses the problem in this issue: #8876
* main: (26 commits)
  [pre-commit.ci] pre-commit autoupdate (pydata#8900)
  Bump the actions group with 1 update (pydata#8896)
  New empty whatsnew entry (pydata#8899)
  Update reference to 'Weighted quantile estimators' (pydata#8898)
  2024.03.0: Add whats-new (pydata#8891)
  Add typing to test_groupby.py (pydata#8890)
  Avoid in-place multiplication of a large value to an array with small integer dtype (pydata#8867)
  Check for aligned chunks when writing to existing variables (pydata#8459)
  Add dt.date to plottable types (pydata#8873)
  Optimize writes to existing Zarr stores. (pydata#8875)
  Allow multidimensional variable with same name as dim when constructing dataset via coords (pydata#8886)
  Don't allow overwriting indexes with region writes (pydata#8877)
  Migrate datatree.py module into xarray.core. (pydata#8789)
  warn and return bytes undecoded in case of UnicodeDecodeError in h5netcdf-backend (pydata#8874)
  groupby: Dispatch quantile to flox. (pydata#8720)
  Opt out of auto creating index variables (pydata#8711)
  Update docs on view / copies (pydata#8744)
  Handle .oindex and .vindex for the PandasMultiIndexingAdapter and PandasIndexingAdapter (pydata#8869)
  numpy 2.0 copy-keyword and trapz vs trapezoid (pydata#8865)
  upstream-dev CI: Fix interp and cumtrapz (pydata#8861)
  ...
While I'm not fully confident that this is designed to protect against any bugs, it does solve the immediate problem in #8371 by hoisting the encoding check above the code that runs only for new variables. The encoding check is somewhat implicit, so this was easy to miss previously.