Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unvetted DataTree methods #9585

Merged
merged 8 commits into from
Oct 7, 2024
Merged

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented Oct 6, 2024

This PR deletes the many Dataset methods that were copied onto DataTree without unit tests, along with a few that are not implemented properly yet, e.g.,

  1. Arithmetic methods were removed, because DataTree + Dataset should probably raise an error.
  2. Indexing and aggregation methods were removed, because these should allow for dimensions that are missing only on some nodes.
  3. The untested map_over_subtree_inplace and render methods were removed.
  4. A few other methods (e.g., merge and plot) that were only implemented by raising `NotImplementedError`` are entirely removed instead.

As
[discussed](https://docs.google.com/presentation/d/1zBjEsihBhK_U972jxHwaAZBbzS1-hd3aDLnO9uu2Ob4/edit#slide=id.g3087b787633_13_0)
in the last DataTree meeting, this PR deletes the many Dataset methods
that were copied onto DataTree without unit tests, along with a few that
are not implemented properly yet, e.g.,

1. Arithmetic methods were removed, because `DataTree + Dataset` should
   probably raise an error.
2. Indexing and aggregation methods were removed, because these should
   allow for dimensions that are missing only on some nodes.
3. The untested `map_over_subtree_inplace` and `render` methods were
   removed.
3. A few other methods (e.g., `merge` and `plot`) that were only
   implemented by raising `NotImplementedError`` are entirely removed
   instead.
@shoyer shoyer changed the title Remove dt methods Remove unvetted DataTree methods Oct 6, 2024
@shoyer
Copy link
Member Author

shoyer commented Oct 6, 2024

I have an implementation of isel and sel that I will put up for review after we merge this!

@shoyer
Copy link
Member Author

shoyer commented Oct 6, 2024

Reimplemented aggregations also coming soon...

@shoyer
Copy link
Member Author

shoyer commented Oct 6, 2024

I've commented out the parts of the doc that reference removed DataTree methods.

This is ready for review now.

@TomNicholas TomNicholas added the topic-DataTree Related to the implementation of a DataTree class label Oct 6, 2024
@TomNicholas
Copy link
Member

Arithmetic methods were removed, because DataTree + Dataset should probably raise an error.

xref #9365 (comment)

Indexing and aggregation methods were removed, because these should allow for dimensions that are missing only on some nodes.

xref #8949

The untested map_over_subtree_inplace and render methods were removed.

👍

A few other methods (e.g., merge and plot) that were only implemented by raising `NotImplementedError`` are entirely removed instead.

xref #9349 and #9270

Comment on lines +728 to +739

.. DataTree.assign_coords
.. DataTree.merge
.. DataTree.rename
.. DataTree.rename_vars
.. DataTree.rename_dims
.. DataTree.swap_dims
.. DataTree.expand_dims
.. DataTree.drop_vars
.. DataTree.drop_dims
.. DataTree.set_coords
.. DataTree.reset_coords
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the decision here that these methods should act locally rather than mapping over the subtree? And the distinction being between methods which manipulate structure versus methods which manipulate data - the latter map over the whole tree?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had not thought about how these methods should work, I was just removing them from the docs because they no longer have an implementation!

That said, my inclination is that mapping over the subtree is probably appropriate for all of these. But many will probably need some special handling so they don't raise errors about missing variables/dimensions.

Comment on lines +782 to +785
.. DataTree.reindex
.. DataTree.reindex_like
.. DataTree.set_index
.. DataTree.reset_index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear to me whether these methods would count as "manipulating structure" or "manipulating data"...

doc/api.rst Show resolved Hide resolved
doc/getting-started-guide/quick-overview.rst Outdated Show resolved Hide resolved
@shoyer shoyer enabled auto-merge (squash) October 7, 2024 11:36
@shoyer shoyer merged commit 1d7c0f6 into pydata:main Oct 7, 2024
28 checks passed
@shoyer shoyer deleted the remove-dt-methods branch October 7, 2024 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-DataTree Related to the implementation of a DataTree class
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Disable methods implemented by map_over_subtree and inheritance from Dataset
2 participants