All NaN column produces NaN values #121

Closed
csala opened this issue Sep 15, 2020 · 0 comments

  • Reversible Data Transforms version: 0.2.5

Description

When a column contains only NaN values, rdt fails to remove the NaN values from the output.

This happens because the NullTransformer replaces all null values with the average of the column, and that average is itself NaN when every value in the column is NaN.
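
The effect can be reproduced with pandas alone. This is a minimal sketch, assuming the fill value is computed with `Series.mean()` and applied with `Series.fillna()`; the issue does not show the NullTransformer's actual code.

```python
import numpy as np
import pandas as pd

# A column in which every value is missing.
column = pd.Series([np.nan, np.nan, np.nan])

# mean() skips NaN values by default, so with nothing left
# to average the result is NaN.
fill_value = column.mean()

# Filling nulls with NaN leaves every value null.
filled = column.fillna(fill_value)

print(fill_value)
print(filled.isna().all())
```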

A possible solution is to replace the average with a fixed value, such as 0, whenever the average turns out to be NaN.
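
The proposed fallback could look roughly like the sketch below. `fill_nulls` and its `default` parameter are hypothetical names used for illustration only; this is not rdt's NullTransformer nor the change that was eventually committed.

```python
import numpy as np
import pandas as pd


def fill_nulls(column, default=0):
    """Fill nulls with the column mean, or with `default` if the mean is NaN.

    Hypothetical helper for illustration; not part of rdt.
    """
    fill_value = column.mean()
    if pd.isna(fill_value):
        # Every value is missing, so there is no usable mean.
        fill_value = default
    return column.fillna(fill_value)


# All-NaN column: falls back to the fixed default.
all_nan = fill_nulls(pd.Series([np.nan, np.nan]))

# Partially null column: still filled with the mean of the known values.
partial = fill_nulls(pd.Series([1.0, np.nan, 3.0]))
```

With this fallback, an all-NaN column comes out as zeros, while columns that have at least one real value keep the mean-based fill.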

Example

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df = pd.DataFrame({
   ...:     'a': [np.nan],
   ...: })

In [4]: from rdt import HyperTransformer

In [5]: ht = HyperTransformer()

In [6]: ht.fit_transform(df)
Out[6]:
   a#0  a#1
0  NaN  1.0
@csala csala self-assigned this Sep 15, 2020
@csala csala added this to the 0.2.5 milestone Sep 15, 2020
@csala csala closed this as completed in daba6d4 Sep 18, 2020
@csala csala added bug Something isn't working and removed enhancement labels Sep 18, 2020