Problem Description
If I use the auto_assign_transformers functionality with invalid data*, then I receive an error that doesn't really make sense.
*Invalid data is any data that does not match the metadata
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.metadata import SingleTableMetadata
import numpy as np
import pandas as pd

metadata = SingleTableMetadata.load_from_dict({
    'columns': {
        'a': { 'sdtype': 'categorical' },
    }
})

synthesizer = GaussianCopulaSynthesizer(metadata)

# input data that does not match the metadata
data = pd.DataFrame({'b': list(np.random.choice(['M', 'F'], size=10)) })
synthesizer.auto_assign_transformers(data)
Output:
AttributeError: 'NoneType' object has no attribute 'get'
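For anyone wondering where a message like this can come from, here is a minimal, SDV-independent sketch of one plausible cause. This is an assumption and has not been confirmed against the SDV source: a lookup keyed by the metadata column names returns None for the unknown column 'b', and the next `.get` call fails on that None. The names `columns` and `lookup_sdtype` are hypothetical.

```python
# Hypothetical lookup table keyed by the metadata column names (not SDV code).
columns = {'a': {'sdtype': 'categorical'}}


def lookup_sdtype(column_name):
    # dict.get returns None when the column is not in the metadata.
    column_metadata = columns.get(column_name)
    # Calling .get on None raises the AttributeError shown above.
    return column_metadata.get('sdtype')
```

Calling `lookup_sdtype('a')` returns the sdtype, while `lookup_sdtype('b')` raises the same kind of AttributeError, which says nothing about the actual data/metadata mismatch.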
Expected behavior
I expect an error that describes the actual problem. We should reuse the error message that fit raises on invalid data.
synthesizer.fit(data)
InvalidDataError: The provided data does not match the metadata:
The columns ['b'] are not present in the metadata.
The metadata columns ['a'] are not present in the data.
Additional context
It appears that fit (and fit_processed_data) are actually running a validation check between the data and metadata. It seems that the auto_assign_transformers method is NOT running the check.
Should we run the check in this method? If so, maybe the fit functions don't need it (since they internally call this method first).
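To make the proposal concrete, here is a rough sketch of running the check before transformers are assigned. It is not the SDV implementation: `InvalidDataError` here is a local stand-in class, and `validate_columns` and `auto_assign_transformers_sketch` are hypothetical names. Only the message wording is taken from the fit output above.

```python
class InvalidDataError(Exception):
    """Local stand-in for the error that fit raises on invalid data."""


def validate_columns(data_columns, metadata_columns):
    """Raise InvalidDataError if the data and metadata column names differ."""
    data_columns = set(data_columns)
    metadata_columns = set(metadata_columns)
    errors = []

    # Columns in the data that the metadata does not describe.
    extra = sorted(data_columns - metadata_columns)
    if extra:
        errors.append(f'The columns {extra} are not present in the metadata.')

    # Columns in the metadata that the data does not contain.
    missing = sorted(metadata_columns - data_columns)
    if missing:
        errors.append(f'The metadata columns {missing} are not present in the data.')

    if errors:
        raise InvalidDataError(
            'The provided data does not match the metadata:\n' + '\n'.join(errors)
        )


def auto_assign_transformers_sketch(data_columns, metadata_columns):
    # Hypothetical ordering: validate first, then assign transformers.
    validate_columns(data_columns, metadata_columns)
    # Placeholder for the real transformer assignment.
    return {column: 'transformer placeholder' for column in data_columns}
```

With the example from this issue, `auto_assign_transformers_sketch(list(data.columns), ['a'])` would raise the descriptive InvalidDataError instead of the AttributeError. If the fit functions call this method first, a single check here would cover them too.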