ENH: DataFrame.struct.explode(column, *, separator=".") method to pull struct subfields into the parent DataFrame
#59585
Labels
Arrow
pyarrow functionality
Enhancement
Needs Discussion
Requires discussion from core team before further action
Needs Info
Clarification about behavior needed to assess issue
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
Currently, I can use Series.struct.explode() to create a DataFrame out of the subfields of an ArrowDtype(pa.struct(...)) column. Joining these back to the original DataFrame is a little awkward. It'd be nice to have a top-level explode() similar to how DataFrame.explode() works on lists.
Feature Description
Add a new StructFrameAccessor to pandas/core/arrays/arrow/accessors.py. I think the implementation could be almost identical to what I did here: googleapis/python-bigquery-dataframes#916
Alternative Solutions
An alternative could be to modify DataFrame.explode to support exploding a struct into columns, potentially with an axis parameter to explode into columns instead of rows.
Additional Context
See also the Series.struct accessor added last year: #54977