cumsum() changes dtype to object #42

Closed
SGStino opened this issue Aug 7, 2020 · 3 comments
@SGStino

SGStino commented Aug 7, 2020

i

2019-12-28 00:00:52.470569300+00:00     -661212.7629936477
2019-12-28 00:00:56.358419200+00:00    -1383400.0425049507
2019-12-28 00:00:58.889932700+00:00     -894082.3358455619
2019-12-28 00:01:01.422784900+00:00     -874452.8101116775
                                              ...         
2019-12-28 23:57:23.719467400+00:00                    0.0
2019-12-28 23:58:00.080144500+00:00      2151.174441538055
2019-12-28 23:58:02.064671800+00:00                    0.0
2019-12-28 23:59:39.443425300+00:00      5761.132695684438
2019-12-28 23:59:41.927944600+00:00                    0.0
Length: 12928, dtype: pint[joule]

i.cumsum()

2019-12-28 00:00:52.470569300+00:00     -661212.7629936477 joule
2019-12-28 00:00:56.358419200+00:00    -2044612.8054985984 joule
2019-12-28 00:00:58.889932700+00:00    -2938695.1413441603 joule
2019-12-28 00:01:01.422784900+00:00     -3813147.951455838 joule
                                                 ...            
2019-12-28 23:57:23.719467400+00:00    -4767888585.5850525 joule
2019-12-28 23:58:00.080144500+00:00     -4767886434.410611 joule
2019-12-28 23:58:02.064671800+00:00     -4767886434.410611 joule
2019-12-28 23:59:39.443425300+00:00     -4767880673.277915 joule
2019-12-28 23:59:41.927944600+00:00     -4767880673.277915 joule
Length: 12928, dtype: object

Is this caused by np.cumsum on a PintArray returning a NumPy array of Quantity objects, because it falls back to a loop of + operations?

My manual workaround is:
my_series.to_frame('power').pint.dequantify().cumsum().pint.quantify()
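
To make the suspected mechanism and the workaround pattern concrete, here is a minimal sketch that does not use pint at all. `ToyQuantity` is a hypothetical stand-in for a pint Quantity, and the loop only models the element-wise fallback guessed at above; it does not claim to be what NumPy or pint-pandas actually does internally.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class ToyQuantity:
    """Hypothetical stand-in for a pint Quantity: a magnitude plus a unit name."""
    magnitude: float
    unit: str

    def __add__(self, other):
        if self.unit != other.unit:
            raise ValueError("unit mismatch")
        return ToyQuantity(self.magnitude + other.magnitude, self.unit)


values = [
    ToyQuantity(1.5, "joule"),
    ToyQuantity(2.0, "joule"),
    ToyQuantity(-0.5, "joule"),
]

# Element-wise fallback: repeated + on unit-carrying objects.
# The result is a sequence of objects, not a typed numeric array.
looped = list(accumulate(values))

# Workaround pattern: strip the units, accumulate plain floats,
# then reattach the unit to every element.
unit = values[0].unit
magnitudes = list(accumulate(v.magnitude for v in values))
requantified = [ToyQuantity(m, unit) for m in magnitudes]
```

Both routes give the same numbers. The difference is that the second one does the arithmetic on plain floats and only reattaches the unit at the end, which is the same shape as the dequantify/cumsum/quantify chain above.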

@andrewgsavage
Collaborator

> this is likely caused by np.cumsum on a PintArray resulting in a NumPy array of Quantity objects because it falls back to a loop of + operations?

I think so. Pandas would receive a NumPy array with nothing to tell it to interpret it as a PintArray, so it uses object dtype.

I'm assuming this is the case for all NumPy functions. I haven't looked at how NumPy operations work with pandas; we may need to handle them within pint-pandas for them to work nicely.
This would use pint's NumPy functionality (hgrecco/pint#981), similar to what's done for addition/subtraction: https://github.com/hgrecco/pint-pandas/blob/master/pint_pandas/pint_array.py#L539
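
As a rough illustration of what "handling it within the array type" could look like, the sketch below gives a toy container its own `cumsum` that works on the bare magnitudes and rewraps the result. `ToyUnitArray` is a hypothetical class made up for this example; it is not PintArray and does not reflect how pint-pandas implements anything.

```python
from itertools import accumulate


class ToyUnitArray:
    """Hypothetical unit-aware container; not pint-pandas' PintArray."""

    def __init__(self, magnitudes, unit):
        self.magnitudes = [float(m) for m in magnitudes]
        self.unit = unit

    def cumsum(self):
        # Operate on the bare magnitudes, then rewrap with the same unit,
        # so the caller gets the unit-aware type back instead of loose objects.
        return ToyUnitArray(accumulate(self.magnitudes), self.unit)

    def __repr__(self):
        return f"ToyUnitArray({self.magnitudes}, unit={self.unit!r})"


energy = ToyUnitArray([1.5, 2.0, -0.5], "joule")
total = energy.cumsum()
```

The point of the design is that the reduction never sees unit-carrying objects, so there is nothing for a generic fallback to turn into an object-typed result.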

For your workaround, there is a private method for creating a PintArray from a numpy array. I'm not sure what units your workaround results in.
PintArray._from_sequence( i.cumsum().values )

@MichaelTiemannOSC
Collaborator

Just a note that I've run into this too. My workaround involves letting cumsum wreck the PintArray and then re-creating it with a call to astype(). But it would be so nice if cumsum would honor the Extension Array.

@andrewgsavage
Collaborator

fixed by #186
