Improve user warnings and logic for update_sdtype #684

Closed
amontanez24 opened this issue Aug 16, 2023 · 0 comments · Fixed by #710

amontanez24 commented Aug 16, 2023

Problem Description

As a user, I want to be able to update the sdtype of any column, even if it is involved in a multi-column transformer. If I do so, the config should still be valid.

Expected behavior

update_sdtypes

If you update the sdtype of a column that belongs to a multi-column transformer, and the new sdtype is no longer compatible with that transformer, then:

  1. Show a warning.
  2. Remove the column from the transformer.
  3. Choose a new, compatible transformer for the column instead.
# before
>>> ht.get_config()
{
  'sdtypes': {
    'A': 'city',
    'B': 'state',
    'C': 'country'
  },
  'transformers': {
    ('A', 'B', 'C'): address.RandomLocationGenerator()
  }
}

# updating to an incompatible sdtype for the RandomLocationGenerator
>>> ht.update_sdtypes(column_name_to_sdtype={
         'A': 'phone_number',
         'B': 'categorical'
})


Warning: sdtype 'phone_number' is incompatible with transformer 'RandomLocationGenerator'. Assigning a new transformer to it.
Warning: sdtype 'categorical' is incompatible with transformer 'RandomLocationGenerator'. Assigning a new transformer to it.
>>> ht.get_config()
{
  'sdtypes': {
    'A': 'phone_number',
    'B': 'categorical',
    'C': 'country'
  },
  'transformers': {
    'A': phone_number.AnonymizedGeoExtractor(),
    'B': UniformEncoder(),
    ('C',): address.RandomLocationGenerator()
  }
}
# edge case: if the transformer has only one column, replace the transformer entirely
>>> ht.get_config()
{
  'sdtypes': {
    'D': 'city',
  },
  'transformers': {
    ('D',): RandomLocationGenerator()
  }
}

# updating to an incompatible sdtype for the RandomLocationGenerator
>>> ht.update_sdtypes(column_name_to_sdtype={
         'D': 'phone_number'
})
Warning: sdtype 'phone_number' is incompatible with transformer 'RandomLocationGenerator'. Assigning a new transformer to it.


>>> ht.get_config()
{
  'sdtypes': {
    'D': 'phone_number'
  },
  'transformers': {
    'D': phone_number.AnonymizedGeoExtractor()
  }
}
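
The expected behavior above could be sketched roughly as follows. This is only an illustration of the warn, remove, reassign steps on a plain dict, not RDT's implementation: the lookup tables `SUPPORTED_SDTYPES` and `DEFAULT_TRANSFORMERS`, the use of strings in place of transformer instances, and the standalone `update_sdtypes` function are all assumptions made for the example.

```python
import warnings

# Hypothetical lookup tables for illustration only; not RDT's real registry.
# Transformers missing from SUPPORTED_SDTYPES are assumed to accept any sdtype.
SUPPORTED_SDTYPES = {'RandomLocationGenerator': {'city', 'state', 'country'}}
DEFAULT_TRANSFORMERS = {
    'phone_number': 'AnonymizedGeoExtractor',
    'categorical': 'UniformEncoder',
}


def update_sdtypes(config, column_name_to_sdtype):
    """Return a new config with updated sdtypes and repaired transformers."""
    sdtypes = {**config['sdtypes'], **column_name_to_sdtype}
    transformers = {}
    for columns, transformer in config['transformers'].items():
        is_group = isinstance(columns, tuple)
        group = columns if is_group else (columns,)
        supported = SUPPORTED_SDTYPES.get(transformer)
        kept = []
        for column in group:
            sdtype = sdtypes[column]
            updated = column in column_name_to_sdtype
            if updated and supported is not None and sdtype not in supported:
                # Step 1: warn. Steps 2 and 3: drop the column from the
                # group and give it a default transformer for its new sdtype.
                warnings.warn(
                    f"sdtype '{sdtype}' is incompatible with transformer "
                    f"'{transformer}'. Assigning a new transformer to it."
                )
                transformers[column] = DEFAULT_TRANSFORMERS[sdtype]
            else:
                kept.append(column)
        if kept:
            # Keep the original transformer for the remaining columns.
            key = tuple(kept) if is_group else kept[0]
            transformers[key] = transformer
        # If no column is left, the original transformer is dropped entirely.
    return {'sdtypes': sdtypes, 'transformers': transformers}


before = {
    'sdtypes': {'A': 'city', 'B': 'state', 'C': 'country'},
    'transformers': {('A', 'B', 'C'): 'RandomLocationGenerator'},
}
after = update_sdtypes(before, {'A': 'phone_number', 'B': 'categorical'})
```

In this sketch the single-column edge case needs no special handling: when every column in a group becomes incompatible, nothing is left in `kept`, so the original transformer simply disappears from the config.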
@amontanez24 amontanez24 added the feature request Request for a new feature label Aug 16, 2023
@amontanez24 amontanez24 changed the title Improve user warnings and logic for update methods Improve user warnings and logic for update_sdtype Aug 16, 2023
@amontanez24 amontanez24 modified the milestones: 1.8.0, 1.9.0 Oct 27, 2023