Float formatter learn_rounding_scheme doesn't work on all digits #556

Closed
npatki opened this issue Sep 28, 2022 · 0 comments · Fixed by #591
Labels
bug Something isn't working
npatki commented Sep 28, 2022

Environment Details

  • RDT version: 1.2.1

Error Description

When using a FloatFormatter with learn_rounding_scheme=True, we expect the transformer to learn the maximum number of decimal digits present in the data. In practice, we see the following:

  1. If the data has 0-14 decimal digits, the transformer learns the rounding scheme [Working as intended]
  2. If the data has 15+ decimal digits, the transformer learns 0 digits and produces whole numbers instead [Bug]

We expect case 2 to work. As a fallback, the transformer should at least stop enforcing rounding when the data already has a large number of digits.
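
To make the requested fallback concrete, here is a minimal sketch in plain Python. It is not RDT's implementation: the names learn_rounding_digits and apply_rounding and the cap of 14 decimals (picked to match the 0-14 range reported above) are assumptions for illustration only. The idea is to return None, meaning "do not round", when no digit count up to the cap reproduces the data, instead of falling back to 0.

```python
import math

# Hypothetical cap, chosen to match the 0-14 digit range reported in this issue.
MAX_DECIMALS = 14


def learn_rounding_digits(values, max_decimals=MAX_DECIMALS):
    """Return the smallest number of decimals that reproduces every value.

    Returns None when no count in 0..max_decimals reproduces the data,
    which callers should treat as "do not enforce any rounding".
    """
    # Ignore NaN and infinite values; they cannot be rounded meaningfully.
    finite = [v for v in values if math.isfinite(v)]
    if not finite:
        return None

    for decimals in range(max_decimals + 1):
        if all(round(v, decimals) == v for v in finite):
            return decimals

    # More digits than the cap can represent: skip rounding entirely.
    return None


def apply_rounding(values, digits):
    """Round values to the learned digits, or leave them untouched if None."""
    if digits is None:
        return list(values)
    return [round(v, digits) for v in values]
```

With this sketch, a column holding 1.1234567890123456 yields None from learn_rounding_digits, so apply_rounding returns the value unchanged rather than collapsing it to 1.0.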

Steps to reproduce

import pandas as pd
from rdt import HyperTransformer
from rdt.transformers.numerical import FloatFormatter

# create test data with 16 digits
test_data = pd.DataFrame(data={
    'column': [1.1234567890123456]
})

ht = HyperTransformer()
ht.set_config({
    'sdtypes': { 'column': 'numerical' },
    'transformers': { 'column': FloatFormatter(learn_rounding_scheme=True) }
})

t = ht.fit_transform(test_data)
ht.reverse_transform(t)

Output: 1.0 (no decimal digits learned, so the value is rounded to a whole number)
