-
-
Notifications
You must be signed in to change notification settings - Fork 324
refactor code to work with pandas 2.0 #660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor code to work with pandas 2.0 #660
Conversation
From pandas 2.0 any only accepts keyworkd arguments ref pandas-dev/pandas#44896
I have not fully understood why this solve the problem, but splitting the operation in 2 lines does not seem to work
Now pandas.to_datetime raises a warning when the column cannot be converted
Pandas dataframes created from python integers are created with int column types `int64` but the operation tested returns `int32` which caused issues
Merging dfs with different column lelvels has been disallowed ref pandas-dev/pandas#34862
I am not sure why this caused an issue, maybe due to type casting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ClaudioSalvatoreArcidiacono
Thanks for taking care of this fixes.
I've got to small questions. Could you please have a look?
Thank you!
@@ -216,7 +216,7 @@ def test_raises_error_when_nan_in_transform(): | |||
"unit, expected", | |||
[ | |||
("D", [31, 61, 183]), | |||
("M", [1.018501, 2.004148, 6.012444]), | |||
("M", [1.0, 1.967741935483871, 5.903225806451613]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are the values changing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need some time to deep dive into this, it is strange indeed. Might be that something has changed with subtraction between times or that something has changed with divisions of time deltas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most likely the latter as the test with days passes with no problem. Might as well be a bug with the new version of pandas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a bug with the latest version of pandas. Please see pandas-dev/pandas#52806
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want, we could also fix this by reverting the change in the tests and by changing line 321 of datetime /datetime_subtraction.py.py as follows
def _sub(self, dt_df: pd.DataFrame):
"""make datetime subtraction"""
new_df = pd.DataFrame()
for reference in self.reference_:
new_varnames = [f"{var}_sub_{reference}" for var in self.variables_]
new_df[new_varnames] = (
dt_df[self.variables_]
.sub(dt_df[reference], axis=0)
- .apply(lambda s: s / np.timedelta64(1, self.output_unit))
+ .div(np.timedelta64(1, self.output_unit).astype("timedelta64[ns]"))
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, .div
looks more pro than the lambda function.
Awesome work, thanks for digging this out!
It would be great if you can take care of this, then we are good to merge :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! And thanks for your review comments!
I have made the change from the previous comment.
Let me know if you have other comments.
Before merging, it would be great to run tests with pandas 1.5.x just to double check that changes are backward compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried downgrading to pandas 1.5.3 in my laptop and running the tests. Everything passed for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing! Thank you!
fixes #659