I've held this view with low confidence for a while and wanted to socialize it to see whether there's something to it: should we keep attrs in operations by default?
Advantages:
- I think most of the time people want to keep attrs after operations
- Is that right? Are there cases where it wouldn't be a reasonable default? e.g. good points here for not always keeping coords around
- It's easy to remove them with a (currently unimplemented) `drop_attrs` method when people do want to remove them
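To make the `drop_attrs` idea concrete, here is a minimal sketch of the behaviour I have in mind. It does not use xarray: `Labeled` is a hypothetical stand-in for any object carrying an `attrs` dict, and the function's name and signature are assumptions, since the method is not implemented.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Labeled:
    """Hypothetical stand-in for an object that carries data plus attrs."""
    data: list
    attrs: dict = field(default_factory=dict)


def drop_attrs(obj):
    """Return a shallow copy of obj with empty attrs; obj itself is unchanged."""
    new = copy.copy(obj)
    new.attrs = {}  # rebind rather than clear, so the original dict is untouched
    return new


original = Labeled([1, 2, 3], {"units": "m"})
stripped = drop_attrs(original)
```

If attrs were kept by default, opting out would then be one explicit call at the point where the metadata stops being valid.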
Disadvantages:
- Backward-incompatible change with an expensive deprecation cycle (I think it would be impractical to raise a deprecation warning every time someone ran a function on an object with attrs, at least without adding a `once` warning filter)
- ?
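For reference, this is roughly how a `once` filter would tame the noise, using only the standard `warnings` module. The function `mean_dropping_attrs` and the warning message are hypothetical; it is a toy stand-in for an operation that currently discards attrs, not xarray code.

```python
import warnings


def mean_dropping_attrs(values, attrs):
    """Toy operation that discards attrs and warns that this may change."""
    if attrs:
        warnings.warn(
            "attrs are dropped by this operation; a future version may keep them",
            FutureWarning,
            stacklevel=2,
        )
    return sum(values) / len(values), {}


def count_warnings(action, n_calls=5):
    """Call the toy operation n_calls times under the given filter action."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, FutureWarning)
        for _ in range(n_calls):
            mean_dropping_attrs([1.0, 2.0, 3.0], {"units": "m"})
    return len(caught)


always_count = count_warnings("always")  # one warning per call
once_count = count_warnings("once")      # only the first call warns
```

The `once` action deduplicates on the message and category, so a user would see the notice a single time per process rather than on every operation.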
Here are some existing relevant discussions:
- Opening from zarr.ZipStore fails to read (store???) unicode characters #3815 (comment)
- Keep attrs & Add a 'keep_coords' argument to Dataset.apply #688
- Global option to always keep/discard attrs on operations #2482
- DataArray.quantile does not honor `keep_attrs` #3304
I think this is an easy situation to get into:
- We make an incorrect-but-insignificant design decision; e.g. some methods don't keep attrs
- We want to change that, but avoid breaking backward compatibility
- So we add kwargs and eventually a global config
- But now we have a global config that requires global context and lots of kwargs! :(
I'm up for leaning towards breaking changes if they make the library better: I think xarray will grow immensely, so the narrow immediate pain is worth the broader future benefit. Clearly, if the immediate pain stops xarray from growing, then it's not a good tradeoff.