-
-
Notifications
You must be signed in to change notification settings - Fork 324
RelativeFeatures: add fill_value param to be used when dividing by zero #653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RelativeFeatures: add fill_value param to be used when dividing by zero #653
Conversation
…d to add logic to _div()
hi @solegalli, Are there any issues w/ the CircleCI tests? I know it sounds crazy. When I locally run pytest, I don't see the errors mentioned in CircleCI. For example, this error does not exist: This article provides a few suggestions as to why the code passes local but an error is returned on CircleCI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @Morgan-Sell
This is looking really good.
I have a few cosmetic requests, and then also to remove the option to replace with a string.
The tests should pass once we merge #660
raise ValueError( | ||
"Some of the reference variables contain 0 as values. Check and " | ||
"remove those before using this transformer." | ||
) | ||
varname = [f"{var}_div_{reference}" for var in self.variables] | ||
X[varname] = X[self.variables].div(X[reference], axis=0) | ||
X.replace([-np.inf, np.inf], self.fill_value, inplace=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we can use inplace with the latest pandas version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read that there are a few pandas methods that will continue using inplace
. It seems that replace() is one of them.
Most recent documentation for replace() - https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alrighty! thanks for the heads-up
@@ -403,3 +423,31 @@ def test_get_feature_names_out_raises_error_when_wrong_param( | |||
|
|||
with pytest.raises(ValueError): | |||
transformer.get_feature_names_out(input_features=_input_features) | |||
|
|||
|
|||
def test_transformer_fill_values_when_division_by_zero(df_vartypes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could use parametrize to test simultaneously various values of fill_value, including none, int and float
Could you expand please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expanded it to test both int and float. I don't see how I can test None
in this unit test because the transformer will throw an error.
fill_value = None
and division by zero is captured in test_error_when_division_by_zero_and_fill_value_is_none()
Hey @Morgan-Sell Please rebase main, we just merged #660 :) That solves some test failures. |
hi @solegalli, I rebased the branch, but the errors are still being produced. Any thoughts? |
Created
fill_value
param in init().Closes #577.