Add to_pandas method for Dataset #5247

Merged: 2 commits merged into pydata:master from the dataset_to_pandas branch on May 4, 2021

gcaria
Contributor

@gcaria gcaria commented May 2, 2021

  • Closes Add Dataset.to_pandas() method #255
  • Tests added
  • Passes pre-commit run --all-files
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

@gcaria gcaria force-pushed the dataset_to_pandas branch from 2234bd3 to 88027cf Compare May 2, 2021 21:05
@max-sixty
Collaborator

Seems very reasonable to me; others' thoughts?

@shoyer
Member

shoyer commented May 2, 2021

The model we have with DataArray.to_pandas() is that it converts into a corresponding pandas object without changing the number of dimensions. to_dataframe()/to_series() will flatten multiple dimensions into a MultiIndex, but that isn't the role of to_pandas().

Thus:

  • 0D xarray.DataArray -> 0D NumPy scalar
  • 1D xarray.DataArray -> 1D pandas.Series
  • 2D xarray.DataArray -> 2D pandas.DataFrame
  • 3D or higher xarray.DataArray -> error
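
As a rough illustration of the DataArray rules listed above, the following sketch calls `DataArray.to_pandas()` on arrays of increasing dimensionality and contrasts it with `to_series()`, which flattens instead. The array values, dimension names, and variable names are made up for the example; the exact type returned for the 0D case may vary between xarray versions, so the sketch only reads its value.

```python
import numpy as np
import pandas as pd
import xarray as xr

# 0D: a scalar-like value comes back (no pandas container is involved).
scalar = xr.DataArray(1.5).to_pandas()

# 1D: a pandas.Series indexed by the single dimension's coordinate.
series = xr.DataArray(
    [1, 2, 3], dims="x", coords={"x": [10, 20, 30]}
).to_pandas()

# 2D: a pandas.DataFrame with rows for "x" and columns for "y".
frame = xr.DataArray(np.zeros((2, 3)), dims=("x", "y")).to_pandas()

# 3D: to_pandas() refuses, because no pandas object has three dimensions.
cube = xr.DataArray(np.zeros((2, 2, 2)), dims=("x", "y", "z"), name="cube")
try:
    cube.to_pandas()
    raised = False
except ValueError:
    raised = True

# to_series() takes the other route: it flattens all dimensions
# into a MultiIndex instead of preserving dimensionality.
flattened = cube.to_series()
```

The last line shows the distinction drawn in the comment: `to_series()` accepts the 3D array that `to_pandas()` rejects, at the cost of collapsing its dimensions into one MultiIndex.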

If we treat a Dataset like a DataArray with one extra dimension (corresponding to variables), then it would make sense to have:

  • 0D xarray.Dataset -> 1D pandas.Series
  • 1D xarray.Dataset -> 2D pandas.DataFrame (same as to_dataframe())
  • 2D or higher xarray.Dataset -> error

I guess this is basically what you have here, except for raising an error in the final case (the error message should mention to_dataframe()).
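
The proposed Dataset rules could be sketched roughly as below. This is not the code merged in this PR: `dataset_to_pandas_sketch` is a hypothetical standalone helper, the error wording is invented, and the sketch only assumes the existing `Dataset.dims`, `Dataset.data_vars`, and `Dataset.to_dataframe()` APIs.

```python
import pandas as pd
import xarray as xr


def dataset_to_pandas_sketch(ds):
    """Hypothetical helper illustrating the proposed dimension rules."""
    ndim = len(ds.dims)
    if ndim == 0:
        # 0D Dataset: one scalar per variable, so the variables
        # themselves become the index of a 1D Series.
        return pd.Series(
            {name: var.item() for name, var in ds.data_vars.items()}
        )
    if ndim == 1:
        # 1D Dataset: same result as to_dataframe().
        return ds.to_dataframe()
    # 2D or higher: refuse, and point the user at to_dataframe().
    # (Assumed wording, not the message used in the PR.)
    raise ValueError(
        f"cannot convert a Dataset with {ndim} dimensions without "
        "changing the number of dimensions; use Dataset.to_dataframe() instead"
    )


ds0 = xr.Dataset({"a": 1, "b": 2.5})
ds1 = xr.Dataset(
    {"a": ("x", [1, 2]), "b": ("x", [3.0, 4.0])}, coords={"x": [10, 20]}
)
ds2 = xr.Dataset({"a": (("x", "y"), [[1, 2], [3, 4]])})

s0 = dataset_to_pandas_sketch(ds0)
df1 = dataset_to_pandas_sketch(ds1)
try:
    dataset_to_pandas_sketch(ds2)
    message = ""
except ValueError as err:
    message = str(err)
```

Counting dimensions on the Dataset as a whole, rather than per variable, matches the framing above of a Dataset as a DataArray with one extra dimension for its variables.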

@gcaria gcaria force-pushed the dataset_to_pandas branch from 88027cf to 6889854 Compare May 3, 2021 13:05
Member

@shoyer shoyer left a comment

Looks great, thank you!

@mathause mathause merged commit 4aef8f9 into pydata:master May 4, 2021
@mathause
Collaborator

mathause commented May 4, 2021

Thanks @gcaria!
