Skip to content

RelativeFeatures: add fill_value param to be used when dividing by zero #653

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Morgan-Sell
Copy link
Collaborator

Createdfill_value param in init().

Closes #577.

@Morgan-Sell
Copy link
Collaborator Author

Morgan-Sell commented Apr 11, 2023

hi @solegalli,

Are there any issues w/ the CircleCI tests? I know it sounds crazy. When I locally run pytest, I don't see the errors mentioned in CircleCI.

For example, this error does not exist: test_transformer_fill_values_when_division_by_zero - AssertionError: Attributes of DataFrame.iloc[:, 3] (column name="Marks") ar..

This article provides a few suggestions as to why the code passes local but an error is returned on CircleCI

Copy link
Collaborator

@solegalli solegalli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Morgan-Sell

This is looking really good.

I have a few cosmetic requests, and then also to remove the option to replace with a string.

The tests should pass once we merge #660

raise ValueError(
"Some of the reference variables contain 0 as values. Check and "
"remove those before using this transformer."
)
varname = [f"{var}_div_{reference}" for var in self.variables]
X[varname] = X[self.variables].div(X[reference], axis=0)
X.replace([-np.inf, np.inf], self.fill_value, inplace=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we can use inplace with the latest pandas version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read that there are a few pandas methods that will continue using inplace. It seems that replace() is one of them.

Most recent documentation for replace() - https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alrighty! thanks for the heads-up

@@ -403,3 +423,31 @@ def test_get_feature_names_out_raises_error_when_wrong_param(

with pytest.raises(ValueError):
transformer.get_feature_names_out(input_features=_input_features)


def test_transformer_fill_values_when_division_by_zero(df_vartypes):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could use parametrize to test simultaneously various values of fill_value, including none, int and float

Could you expand please?

Copy link
Collaborator Author

@Morgan-Sell Morgan-Sell Apr 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expanded it to test both int and float. I don't see how I can test None in this unit test because the transformer will throw an error.

fill_value = None and division by zero is captured in test_error_when_division_by_zero_and_fill_value_is_none()

@solegalli
Copy link
Collaborator

Hey @Morgan-Sell

Please rebase main, we just merged #660 :)

That solves some test failures.

@Morgan-Sell
Copy link
Collaborator Author

hi @solegalli,

I rebased the branch, but the errors are still being produced. Any thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RelativeFeatures: add option not to fail when dividing by zero and instead fill inf values with something
2 participants