QST: Subclassing and concat #35415

Open
@EmilianoJordan

I was looking into df.astype not respecting subclassed objects and noticed the issue had been resolved by modifying pd.concat to respect subclassed objects using the following code:

cons = self.objs[0]._constructor

Line 464 PR #29627

cons = self.objs[0]._constructor_expanddim

Line 477 PR #33884 (Set to be released on 1.1.0)

This assumes homogeneous object types, or at the very least that the first object is representative. My assumption had been that functions in the pandas namespace did not respect subclassing and that only methods on the objects did, leaving developers to implement their own special cases for subclassing. This seemed like a good delineation to me, both logically and programmatically. But now, diving deeper into functions I don't regularly use, I see I was mistaken: merge and merge_ordered both respect the left argument's subclass.
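To make the behaviour described above concrete, here is a minimal sketch. MyFrame is a hypothetical subclass defined only for illustration, and the comments assume a pandas version that includes the concat changes referenced above; results may differ on older releases.

```python
import pandas as pd


class MyFrame(pd.DataFrame):
    """Hypothetical minimal subclass, used only for illustration."""

    @property
    def _constructor(self):
        # pandas uses this property when it builds a new object of the same kind
        return MyFrame


sub = MyFrame({"key": [1, 2], "x": [10, 20]})
plain = pd.DataFrame({"key": [1, 2], "y": [30, 40]})

# concat takes its constructor from the first object in the list,
# so the order of the inputs decides the type of the result
concat_sub_first = pd.concat([sub, plain])
concat_plain_first = pd.concat([plain, sub])

# merge takes its constructor from the left argument
merged = pd.merge(sub, plain, on="key")

for label, obj in [
    ("concat([sub, plain])", concat_sub_first),
    ("concat([plain, sub])", concat_plain_first),
    ("merge(sub, plain)", merged),
]:
    print(label, "->", type(obj).__name__)
```

Swapping the order of the two inputs is the quickest way to see that the result type depends on position rather than on whether any input is a subclass.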

With this in mind, I have some questions.

  1. Should there be more documentation around subclassing? This would include how concat, merge, merge_ordered, etc. handle subclassing.
  2. If functions in the pandas namespace are going to implement varying degrees of support for subclassing, should a constructor kwarg be added to functions like read_sql and read_csv?
  3. Does to_hdf not working with subclassing need to be fixed?
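As a rough illustration of question 2, the sketch below wraps pd.read_csv in a helper that accepts a constructor. Both read_csv_as and its constructor parameter are hypothetical names invented for this example; pandas does not provide such a kwarg, and MyFrame is again an illustrative subclass.

```python
import io

import pandas as pd


class MyFrame(pd.DataFrame):
    """Hypothetical minimal subclass, used only for illustration."""

    @property
    def _constructor(self):
        return MyFrame


def read_csv_as(source, constructor=pd.DataFrame, **kwargs):
    """Hypothetical helper (not part of pandas).

    Reads with pd.read_csv, then wraps the result in `constructor`,
    which is roughly what a constructor kwarg might do.
    """
    return constructor(pd.read_csv(source, **kwargs))


csv_text = "a,b\n1,2\n3,4\n"
frame = read_csv_as(io.StringIO(csv_text), constructor=MyFrame)
print(type(frame).__name__)
```

Today this wrapping has to be done by the caller for each reader; a built-in kwarg would move the same step inside the reader functions.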

Labels: Enhancement, Reshaping, Subclassing