BUG: Assignment of pyarrow arrays yield unexpected dtypes #58601
…ssignment-unexpected-dtypes
…ssignment-unexpected-dtypes
pandas/core/frame.py
from pandas.compat._constants import REF_COUNT
from pandas.compat._optional import import_optional_dependency
from pandas.compat.numpy import function as nv

if not pa_version_under11p0:
What prevents this from working with pyarrow 10?
Nothing prevents it! :-)
It was a misunderstanding.
At the beginning I thought that version 11.0.0 was the minimum version (because the "Minimum Versions" tests in the CI were not working with pyarrow 10). But actually `value` in `_sanitize_column` is:
- a `pa.lib.ChunkedArray` in version 10.0.1
- a `pa.lib.Array` in versions above

So now I just test whether `value` is an instance of the parent class `pa.lib._PandasConvertible`.
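The check described in this reply can be sketched as follows. This is only an illustration, not the code from the PR: `is_pandas_convertible` is a hypothetical helper name, and `pa.lib._PandasConvertible` is a private pyarrow class, so relying on it is an assumption that may not hold across pyarrow releases. The import is guarded because pyarrow is an optional dependency.

```python
try:
    import pyarrow as pa
except ImportError:  # pyarrow is optional; without it the helper returns False
    pa = None


def is_pandas_convertible(value) -> bool:
    """Hypothetical helper: accept both pa.Array and pa.ChunkedArray."""
    if pa is None:
        return False
    # _PandasConvertible is a private base class shared by pyarrow's
    # array containers; this relies on an implementation detail.
    return isinstance(value, pa.lib._PandasConvertible)


if pa is not None:
    flat = pa.array([1, 2, 3])                 # a pa.Array subclass
    chunked = pa.chunked_array([[1, 2], [3]])  # a pa.ChunkedArray
    print(type(flat).__name__, is_pandas_convertible(flat))
    print(type(chunked).__name__, is_pandas_convertible(chunked))

# A plain Python list is never treated as a pyarrow container.
print(is_pandas_convertible([1, 2, 3]))
```

Testing against the shared parent class avoids branching on the pyarrow version, at the cost of depending on a private name.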
…ssignment-unexpected-dtypes
…s://github.com/droussea2001/pandas into BUG-56994/pyarrow-assignment-unexpected-dtypes
…ssignment-unexpected-dtypes
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
I'm still working on it.
@jorisvandenbossche: Hello Joris, hope you're doing well. Would you have any news about "BUG: Assignment of pyarrow arrays yields unexpected dtypes"? Previously I was saying that "if a Series or a DataFrame is created or assigned with data of dtype 'X', I would expect, if it is possible, that it keeps the same 'X' dtype", but that's only my humble opinion :-) and I would be equally happy to talk about it: maybe there is a discussion somewhere about how pyarrow types should be managed? (Thanks in advance!)
Friendly ping @jorisvandenbossche
doc/source/whatsnew/v3.0.0.rst file if fixing a bug or adding a new feature.
During a column assignment in a `DataFrame`:
- `value` is a pyarrow array (check tested with `lib.is_pyarrow_array`)
- `sanitize_array` sets `dtype` to `ArrowDtype(value.type)`
- `sanitize_array` processing is applied
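To make the intended outcome concrete, here is a minimal user-side sketch of a column that ends up with `ArrowDtype(value.type)`. It does not reproduce the PR's changes to `sanitize_array`; instead it wraps the pyarrow array explicitly with the public `pd.arrays.ArrowExtensionArray` before assigning. `as_arrow_backed` is a hypothetical helper name, and the imports are guarded because pyarrow is an optional dependency.

```python
try:
    import pandas as pd
    import pyarrow as pa
except ImportError:  # either library may be missing
    pd = pa = None


def as_arrow_backed(value):
    """Hypothetical helper: wrap a pyarrow array so pandas keeps its type."""
    return pd.arrays.ArrowExtensionArray(value)


if pd is not None:
    value = pa.array([1.5, None, 3.0], type=pa.float32())
    df = pd.DataFrame({"a": [1, 2, 3]})

    # Explicit wrapping: the column dtype follows the pyarrow type.
    df["b"] = as_arrow_backed(value)

    expected = pd.ArrowDtype(value.type)
    print(df["b"].dtype == expected)
```

The explicit wrapper is a workaround that does not depend on how a given pandas version treats a raw pyarrow array on assignment.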