ENH: enable setitem dim2 test to work for EA with complex128 dtype #54445
Comments
Additional code synchronizations (and the addition of a dtype-preserving map method). These changes were initially developed to support uncertainties, but the uncertainty changes have all been stripped out to simplify merging of the underlying code. Once these changes are fully synced with a release version of Pandas 2.1, we can look at adding back uncertainties. These changes also tolerate complex128 as a base type for magnitudes, with one exception (under discussion as pandas-dev/pandas#54445). Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>
Until Pandas resolves pandas-dev/pandas#54445 we cannot feed complex128 types to the test_setitem_2d_values test case. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>
When a loc indexer goes to create a fresh array to hold values from an Extension Array, it makes the array allocation based on the na_value of the EA, but that na_value may be smaller than the size of the things that the EA can hold (such as complex128). Note that np.nan is itself a legitimate complex128 value. If the allocated array cannot hold the values from the block manager, and if the EA is not immutable, it will try a second strategy of allocating an array based on the dtype of the values of the blocks. If the blocks hold complex numbers, the array will be properly sized. This should close pandas-dev#54445. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>
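The two-step allocation described in this commit message can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the code from the pandas patch: `allocate_result`, `block_dtypes`, and `n` are hypothetical names, and a NumPy array stands in for the Extension Array.

```python
import numpy as np

def allocate_result(na_value, block_dtypes, n):
    """Hypothetical helper: allocate from na_value first, widen if needed."""
    # First strategy: let the na_value decide the dtype (np.nan gives float64).
    result = np.full(n, na_value)
    # Common dtype of the values that will be written into the result.
    wanted = np.result_type(*block_dtypes)
    # Second strategy: if the values cannot be safely stored, reallocate
    # using the dtype of the block values instead.
    if not np.can_cast(wanted, result.dtype):
        result = np.full(n, na_value, dtype=wanted)
    return result

floats = allocate_result(np.nan, [np.dtype("float64")], 2)
complexes = allocate_result(np.nan, [np.dtype("complex128")], 2)
```

Because np.nan is a valid complex128 value, the widened array can still be filled with the same na_value.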
From the wording it sounds like FloatingArray is not what you would actually want here? What type/dtype would you want here?
We have complex128 magnitudes to store, which cannot be stored in a FloatingArray. An array sized for complex128 would be great.
Apologies for slow responses...storms knocked out power yesterday and still no estimate of restoration. I did some tests and noticed that Pandas 2.0.2 passes this test with complex128, but not 2.1.0rc0. Still digging in...
Tracked it down: The fast_xs method in 2.1.0rc0 makes PintArray with [<NA>, <NA>] whereas the 2.0.x version makes it with [np.nan, np.nan]. The former demands FloatingArray, whereas the latter creates a float64 array. 2.0.x silently converts (2+0j) to 2.0 before sticking it into the float64 array. 2.1.0rc0 properly complains that complex won't fit in float. When using a "true" complex number (such as (2+1j)) the 2.0.x version complains that float argument must be a string or a real number, not 'complex'. When I get internet back I'll fix the test case. But the problem remains. One hack would be to not create an empty array, but an array holding the first value, then discard that value with [:1], preserving the underlying array type.
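A rough NumPy sketch of the seed-value idea from the comment above: build an array from a real value so the dtype is inferred from the data, then slice the value away. The names `seed`, `empty`, and `result` are made up for illustration, and the sketch uses an empty slice `[:0]` to drop the seed value, which is my assumption about how to get an empty array; it does not use PintArray.

```python
import numpy as np

first_value = 2 + 1j

# Build a one-element array so NumPy infers the dtype from actual data.
seed = np.asarray([first_value])

# Slicing keeps the dtype; [:0] leaves no elements behind.
empty = seed[:0]

# A result array allocated with that dtype can hold complex values,
# and np.nan still works as the fill value.
result = np.full(2, np.nan, dtype=empty.dtype)
result[0] = first_value
```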
sorry if this is adding noise, but any chance #54508 fixes the |
As requested in review feedback, change tests to use request.node.add_marker in test_*.py testcases, rather than xfailing in base class tests. Also, prototyped an attempt to create properly-sized EAs for complex numbers so that the dim2 test case passes for complex (pandas-dev#54445). Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
I have read the threads related to DISCUSS/API: setitem-like operations should only update inplace (#39584) and friends (including #47577).
My problem arises from this test code from the Pint-Pandas test suite:
As a result of PR #54441 I'm able to test Pint-Pandas with complex128 datatypes. PintArray EAs work perfectly with float and integer magnitudes, but fail in just this one case, with complex128. The failure starts in core/internals/managers.py in the fast_xs method:
result becomes a PintArray backed by a FloatingArray:
The FloatingArray comes when the PintArray initializer finds nothing helpful in either dtype or values and falls back to creating a pd.array(values, ...):
The fast_xs fails when result[rl] is not ready to accept the complex128 data coming from blk.iget((i, loc)):
As I see it, the problem is that we commit too soon to building our backing array with too-limited information.
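To make the failure mode concrete, here is a minimal NumPy-only sketch (it does not use PintArray or fast_xs, and the variable names are made up for illustration). An array whose dtype was inferred from NaN placeholders is float64 and rejects a Python complex value, while the same placeholders allocated as complex128 accept it.

```python
import numpy as np

# The dtype is committed before any real data is seen: NaN implies float64.
result = np.array([np.nan, np.nan])

incoming = 2 + 1j  # stands in for the complex128 data arriving later
try:
    result[0] = incoming
    outcome = "stored"
except TypeError:
    # A Python complex cannot be converted to float.
    outcome = "rejected"

# Allocating with knowledge of the incoming dtype avoids the problem.
wide = np.array([np.nan, np.nan], dtype=np.complex128)
wide[0] = incoming
```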
@andrewgsavage
@topper-123
@jbrockmendel
@mroeschke
Feature Description
I will point out for the record:
So we have everything we need within the environment of fast_xs. Should we use this knowledge as power to create a result that can hold slices of data beyond float64? Here's code that tries to use the fast path, but if an exception is raised, it does the sure thing:
Alternative Solutions
I'm open to alternative solutions, but the above actually causes the test case to pass. Should I submit a PR?
Additional Context
No response