Transfer _metadata from Subclassed DataFrame to Subclassed Series #19850
Comments
This is not implemented in any way. It IS possible, but slightly non-trivial to actually make this work properly. It is similar to #13208. We don't have the support for this in indexing. If you have interest in making this happen, then please submit a PR.
I see.

    @property
    def _constructor_sliced(self):
        def f(*args, **kwargs):
            # adapted from https://github.com/pandas-dev/pandas/issues/13208#issuecomment-326556232
            return DesignSeries(*args, **kwargs).__finalize__(self, method='inherit')
        return f

which right now won't work, allows to process the inheritance in
See #18258 where
Oh! I see!

    def __new__(cls, *args, **kwargs):
        # arr is mandatory, first argument or key `arr`.
        if isinstance(kwargs.get('arr', args[0]), ABCSparseArray):
            from pandas.core.sparse.series import SparseSeries
            cls = SparseSeries
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)
        return obj

This way the check is kept in the constructor itself.
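The __new__-based dispatch proposed above can be tried out in isolation. The following is a minimal, pandas-free sketch: Dense, Sparse, and SparseData are hypothetical stand-ins for Series, SparseSeries, and ABCSparseArray, not pandas classes. It differs from the snippet above in one respect: it does not call __init__ explicitly, because Python already runs __init__ on the object returned by __new__ when that object is an instance of the class being called, so an explicit call would initialize the object twice.

```python
class SparseData:
    """Hypothetical stand-in for a sparse array type."""

    def __init__(self, values):
        self.values = list(values)


class Dense:
    """Hypothetical stand-in for a Series-like container."""

    def __new__(cls, arr, *args, **kwargs):
        # Redirect to the sparse subclass when given sparse input.
        if cls is Dense and isinstance(arr, SparseData):
            cls = Sparse
        # No explicit __init__ call here: Python invokes it automatically
        # because the returned object is an instance of Dense.
        return object.__new__(cls)

    def __init__(self, arr):
        self.arr = arr
        # Count initializations to show that __init__ runs exactly once.
        self.init_calls = getattr(self, "init_calls", 0) + 1


class Sparse(Dense):
    """Hypothetical stand-in for a sparse Series subclass."""


dense = Dense([1, 2, 3])
sparse = Dense(SparseData([0, 0, 1]))
```

Here type(sparse) is Sparse even though Dense was called, and the check stays inside the constructor as suggested in the comment above.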
That's a possibility indeed, but a change in API that we need to discuss (I am not familiar enough with sparse to really understand the possible consequences).
but I am not fully sure what the difference / (dis)advantage of that is compared to SparseSeries. Probably best to open a separate issue about that first.
Hi, I ran into this problem, and I've been thinking about it for a bit. Since this issue was opened, quite a few things have changed in pandas, so maybe it's worth taking another look at this. Returning a function in
That's indeed possible, as far as I know. In the GeoPandas project we are actually using a function as the return value in
What do you mean exactly with this sentence?
I meant calling For example here: Lines 3425 to 3449 in 08104e8
What about adding
Ah, OK, I see. In principle, it should not be needed to call There will be places in pandas where we don't (yet) call
We have a general issue about improving the coverage in pandas where we call
So, the propagation of metadata from a DataFrame to a Series is generally the same problem as all the other missing calls of
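The snippet earlier in this thread relies on __finalize__ to carry metadata from one object to the next. A rough, pandas-free sketch of that general pattern is below; ToyBase, ToyFrame, ToySeries, finalize_from, and the units attribute are hypothetical names chosen for illustration and do not reproduce pandas internals. It only shows why metadata survives on code paths that run the finalize step and is lost on those that skip it.

```python
class ToyBase:
    """Hypothetical base class; _metadata lists attribute names to propagate."""

    _metadata = []

    def finalize_from(self, other):
        # Copy every declared metadata attribute from the source object.
        for name in self._metadata:
            setattr(self, name, getattr(other, name, None))
        return self


class ToySeries(ToyBase):
    _metadata = ["units"]

    def __init__(self, data):
        self.data = list(data)
        self.units = None


class ToyFrame(ToyBase):
    _metadata = ["units"]

    def __init__(self, columns, units=None):
        self.columns = dict(columns)
        self.units = units

    def get_column(self, name):
        # This code path finalizes its result, so metadata survives.
        return ToySeries(self.columns[name]).finalize_from(self)

    def get_column_unfinalized(self, name):
        # This code path skips the finalize step, so metadata is lost.
        return ToySeries(self.columns[name])


frame = ToyFrame({"a": [1, 2]}, units="m")
kept = frame.get_column("a")
lost = frame.get_column_unfinalized("a")
```

In this toy model kept.units is "m" while lost.units stays None, which is the same kind of gap the comments above describe for code paths that do not finalize their results.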
Code Sample, a copy-pastable example if possible
Let's assume the following subclassing case:
Problem description
This works fine, but does not allow transferring extended metadata from the ExtendedFrame to the ExtendedSeries.
As far as I understand, writing the _constructor_sliced as follows should work:
This would allow first setting the metadata and then returning the object to initialize its data, wouldn't it?
But defining it this way gives errors in core/frame:2166 and core/frame:2563. In both cases this is due to the call self._constructor_sliced._from_array().
Series.from_array has been labeled as deprecated, and Series._from_array calls the class' constructor. Couldn't the two instances of self._constructor_sliced._from_array() simply be changed to self._constructor_sliced()? If I'm seeing this correctly, wouldn't this change allow for this level of flexibility in subclassing without affecting the regular functionality?
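The failure described above comes down to attribute lookup on a plain function. A minimal, pandas-free sketch follows; ToySeries, ToyFrame, make_series, and the units attribute are hypothetical stand-ins, not pandas code. It shows why a function-valued _constructor_sliced breaks when the caller reaches for ._from_array(), while calling the constructor directly works and carries the metadata along.

```python
class ToySeries:
    """Hypothetical stand-in for a Series subclass with one metadata field."""

    def __init__(self, data, units=None):
        self.data = list(data)
        self.units = units

    @classmethod
    def _from_array(cls, arr):
        # Alternate constructor that simply calls the class.
        return cls(arr)


class ToyFrame:
    """Hypothetical stand-in for a DataFrame subclass."""

    def __init__(self, columns, units=None):
        self.columns = dict(columns)
        self.units = units

    @property
    def _constructor_sliced(self):
        # Return a function (not a class) so the frame's metadata can be
        # attached to every series it creates.
        def make_series(data):
            return ToySeries(data, units=self.units)

        return make_series

    def column_via_from_array(self, name):
        # Fails: a plain function has no _from_array attribute.
        return self._constructor_sliced._from_array(self.columns[name])

    def column_via_call(self, name):
        # Works whether _constructor_sliced is a class or a function.
        return self._constructor_sliced(self.columns[name])


frame = ToyFrame({"a": [1, 2, 3]}, units="m")

try:
    frame.column_via_from_array("a")
    from_array_error = None
except AttributeError as exc:
    from_array_error = exc

series = frame.column_via_call("a")
```

The direct call is the more permissive contract: it only requires the constructor to be callable, which is what the proposed change would rely on.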
Output of pd.show_versions()
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 17.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.21.1
pytest: None
pip: 9.0.1
setuptools: 38.4.0
Cython: None
numpy: 1.14.0
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: 5.4.1
sphinx: 1.6.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.1
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None