Environment Details
Please indicate the following details about the environment in which you found the bug:
RDT version: main branch
Python version: Any
Operating System: Any
Error Description
Both the RandomLocationGenerator and the RegionalAnonymizer crash when assigned to a single column, raising the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-42-a7c40dec790d> in <cell line: 29>()
27 })
28
---> 29 ht.fit(data)
4 frames
/usr/local/lib/python3.10/dist-packages/rdt/hyper_transformer.py in fit(self, data)
707 field = column
708
--> 709 data = self._fit_field_transformer(data, field, self.field_transformers[field])
710
711 self._validate_all_fields_fitted()
/usr/local/lib/python3.10/dist-packages/rdt/hyper_transformer.py in _fit_field_transformer(self, data, field, transformer)
635 transformer.fit(data, columns_to_sdtypes)
636 else:
--> 637 transformer.fit(data, field)
638
639 self._transformers_sequence.append(transformer)
/usr/local/lib/python3.10/dist-packages/rdt/transformers/base.py in wrapper(self, *args, **kwargs)
53 method_name = function.__name__
54 with set_random_states(self.random_states, method_name, self.set_random_state):
---> 55 return function(self, *args, **kwargs)
56
57 return wrapper
/usr/local/lib/python3.10/dist-packages/rdt/transformers/base.py in fit(self, data, columns_to_sdtypes)
567 Dictionary mapping each column to its sdtype.
568 """
--> 569 self._validate_columns_to_sdtypes(data, columns_to_sdtypes)
570 self.columns_to_sdtypes = columns_to_sdtypes
571 self._store_columns(list(self.columns_to_sdtypes.keys()), data)
/usr/local/lib/python3.10/dist-packages/rdt/transformers/base.py in _validate_columns_to_sdtypes(self, data, columns_to_sdtypes)
543 def _validate_columns_to_sdtypes(self, data, columns_to_sdtypes):
544 """Check that all the columns in ``columns_to_sdtypes`` are present in the data."""
--> 545 missing = set(columns_to_sdtypes.keys()) - set(data.columns)
546 if missing:
547 missing_to_print = ', '.join(missing)
AttributeError: 'str' object has no attribute 'keys'
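The failure path in the traceback can be mimicked without RDT. The sketch below is a simplified stand-in, not the library's code: `fit_field` and `validate_columns_to_sdtypes` are hypothetical names, and the `isinstance(field, tuple)` condition is an assumption, since the traceback only shows the two branches and not the test that selects between them.

```python
def validate_columns_to_sdtypes(data_columns, columns_to_sdtypes):
    """Stand-in for the check at base.py line 545 in the traceback."""
    missing = set(columns_to_sdtypes.keys()) - set(data_columns)
    if missing:
        raise ValueError('Missing columns: ' + ', '.join(sorted(missing)))


def fit_field(data_columns, field, sdtypes):
    """Stand-in for the dispatch around hyper_transformer.py lines 635-637."""
    # Assumption: the branch not shown in the traceback tests for a tuple.
    if isinstance(field, tuple):
        columns_to_sdtypes = {column: sdtypes[column] for column in field}
        validate_columns_to_sdtypes(data_columns, columns_to_sdtypes)
    else:
        # The bare column name is passed where a dict is expected.
        validate_columns_to_sdtypes(data_columns, field)


columns = ['country of departure', 'city of departure']
sdtypes = {'country of departure': 'country_code', 'city of departure': 'city'}

try:
    fit_field(columns, 'country of departure', sdtypes)
    outcome = 'no error'
except AttributeError as error:
    outcome = str(error)

print(outcome)
```

In this sketch the string field falls into the `else` branch and hits `.keys()`, which matches the final frame of the traceback.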
Steps to reproduce
To reproduce, assign one of these transformers to a single column of this dataset:
from rdt import HyperTransformer
from rdt.transformers import RandomLocationGenerator, RegionalAnonymizer

ht = HyperTransformer()
ht.detect_initial_config(data)
ht.update_sdtypes(column_name_to_sdtype={
'country of departure': 'country_code',
'region of departure': 'administrative_unit',
'region code of departure': 'state_abbr',
'city of departure': 'city',
'postal code of departure': 'postcode',
'street address of departure': 'street_address',
'secondary address of departure': 'secondary_address',
'country of arrival': 'country_code',
'region of arrival': 'administrative_unit',
'region code of arrival': 'state_abbr',
'city of arrival': 'city',
'postal code of arrival': 'postcode',
'street address of arrival': 'street_address',
'secondary address of arrival': 'secondary_address'
})
# Assign only a single column (country of departure) to the RandomLocationGenerator
ht.update_transformers(column_name_to_transformer={
('country of departure'): RandomLocationGenerator(locales=['en_US', 'es_ES', 'en_GB'], missing_value_generation='random')
})
ht.fit(data)
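One detail of the snippet above is worth checking in plain Python: parentheses around a single string do not create a tuple, so the key `('country of departure')` is an ordinary `str`. The variable names below are illustrative only, and whether a one-element tuple key avoids the crash in RDT has not been verified here.

```python
# Parentheses alone do not make a tuple; the trailing comma does.
parenthesized = ('country of departure')
one_element_tuple = ('country of departure',)

print(type(parenthesized).__name__)      # str
print(type(one_element_tuple).__name__)  # tuple
```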