Description
I was looking into `df.astype` not respecting subclassed objects and noticed the issue had been resolved by modifying `pd.concat` to respect subclassed objects using the following code:
cons = self.objs[0]._constructor
cons = self.objs[0]._constructor_expanddim
(Line 477, PR #33884, set to be released in 1.1.0)
This assumes homogeneous object types, or at the very least that the first object is representative. My assumption had been that functions on the pandas namespace did not respect subclassing and only methods on the objects did, leaving developers to implement their own special cases for subclassing. This seemed like a good delimitation to me, both logically and programmatically. But now, diving deeper into functions I don't regularly use, I see I was mistaken: `merge` and `merge_ordered` both respect the `left` arg's subclassing.
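A minimal sketch of the behaviour described above, assuming pandas 1.1 or later. `SubDF` is a hypothetical subclass invented for illustration; the point is only that the result type appears to follow the first object passed to `concat` and the `left` argument of `merge`, so the order of the inputs changes what comes back.

```python
import pandas as pd


class SubDF(pd.DataFrame):
    """Hypothetical minimal DataFrame subclass, for illustration only."""

    @property
    def _constructor(self):
        # Ask pandas to build results as SubDF rather than plain DataFrame.
        return SubDF


sub = SubDF({"key": [1, 2], "a": [10, 20]})
plain = pd.DataFrame({"key": [1, 2], "b": [30, 40]})

# concat: the constructor is taken from the first object in the list.
concat_sub_first = pd.concat([sub, plain])
concat_plain_first = pd.concat([plain, sub])

# merge: the constructor is taken from the `left` argument.
merge_sub_left = pd.merge(sub, plain, on="key")
merge_plain_left = pd.merge(plain, sub, on="key")

for name, result in [
    ("concat, subclass first", concat_sub_first),
    ("concat, plain first", concat_plain_first),
    ("merge, subclass left", merge_sub_left),
    ("merge, plain left", merge_plain_left),
]:
    print(f"{name}: {type(result).__name__}")
```

If this holds on your version, swapping the argument order silently drops the subclass, which is the "first object is representative" assumption in action.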
With this in mind, I have some questions.
- Should there be more documentation around subclassing? This would include how `concat`, `merge`, `merge_ordered`, etc. handle subclassing.
- If functions on the pandas namespace are going to implement varying degrees of support for subclassing, should there be a `constructor` kwarg added to functions like `read_sql` and `read_csv`?
- Does `to_hdf` not working with subclassing need to be fixed?
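To make the second question more concrete, here is a rough sketch of what a `constructor` argument could look like from the user's side. Nothing here exists in pandas: `read_csv_as` is a hypothetical wrapper and `SubDF` is a hypothetical subclass, both invented for illustration. It simply reads with `pd.read_csv` and passes the result to the requested constructor.

```python
import io

import pandas as pd


class SubDF(pd.DataFrame):
    """Hypothetical minimal DataFrame subclass, for illustration only."""

    @property
    def _constructor(self):
        return SubDF


def read_csv_as(constructor, *args, **kwargs):
    """Hypothetical helper: read a CSV, then wrap it with `constructor`.

    This stands in for a possible `constructor=` kwarg; it is not a
    pandas API.
    """
    return constructor(pd.read_csv(*args, **kwargs))


# Usage: an in-memory CSV keeps the example self-contained.
csv_text = "a,b\n1,2\n3,4\n"
df = read_csv_as(SubDF, io.StringIO(csv_text))
print(type(df).__name__)
```

A wrapper like this is what developers end up writing today. A built-in kwarg would mainly save the boilerplate and make the supported behaviour explicit.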