
Support for columns contains only numbers.#737

Closed
gszecsenyi wants to merge 1 commit intosdv-dev:mainfrom
gszecsenyi:main

@gszecsenyi gszecsenyi commented Nov 7, 2023

When column names contain only numbers, as typically happens after scaling the data, the code fails: the initial value of hash_value is then a number rather than a string, so a string cannot be appended to it later.

The executed commands are:

synthesizer = SingleTablePreset(
    metadata,
    name='FAST_ML'
)

synthesizer.fit(
    data=test_df
)

synthetic_data = synthesizer.sample(
    num_rows=500
)

synthetic_data.head()

The output:

File ~/GitHub/ml_network_analysis_experiments/.venv/lib/python3.9/site-packages/rdt/transformers/base.py:367, in BaseTransformer._set_seed(self, data)
    365 hash_value = self.columns[0]
    366 for value in data.head(5):
--> 367     hash_value += str(value)
    369 hash_value = int(hashlib.sha256(hash_value.encode('utf-8')).hexdigest(), 16)
    370 self.random_seed = hash_value % ((2 ** 32) - 1)  # maximum value for a seed
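The failure can be reproduced without RDT. The following is a minimal sketch that only mirrors the statements visible in the traceback; the function name seed_like_traceback and the plain list standing in for data.head(5) are assumptions made for illustration, not part of the library.

```python
import hashlib


def seed_like_traceback(columns, values):
    """Mimic the hashing steps shown in the traceback (not the RDT source itself)."""
    hash_value = columns[0]            # an int when the column name is numeric
    for value in values:
        hash_value += str(value)       # int += str raises TypeError
    digest = hashlib.sha256(hash_value.encode('utf-8')).hexdigest()
    return int(digest, 16) % ((2 ** 32) - 1)  # maximum value for a seed


# A numeric column name (here the integer 0) triggers the error.
try:
    seed_like_traceback([0], [1.5, 2.5])
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

With a string column name such as 'a', the same function returns an integer seed, which is why the problem only shows up for numeric column names.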

This is why the following modifications are needed:

self.column_prefix = '#'.join(map(str, self.columns))

and

hash_value = str(self.columns[0])
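A standalone sketch of what these two changes do, assuming the surrounding logic matches the traceback above. The helper names column_prefix and seed_from are hypothetical and exist only for this illustration; they are not RDT functions.

```python
import hashlib


def column_prefix(columns):
    """Join column names with '#', converting each to str first."""
    return '#'.join(map(str, columns))


def seed_from(columns, values):
    """Derive a seed from the first column name and some sample values."""
    hash_value = str(columns[0])       # always a string, even for numeric names
    for value in values:
        hash_value += str(value)
    digest = hashlib.sha256(hash_value.encode('utf-8')).hexdigest()
    return int(digest, 16) % ((2 ** 32) - 1)  # maximum value for a seed


prefix = column_prefix([0, 1])         # '0#1'
seed = seed_from([0], [1.5, 2.5])      # no TypeError for an integer column name
print(prefix, seed)
```

One consequence of converting with str is that an integer column 0 and a string column '0' produce the same prefix and the same seed.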

@gszecsenyi gszecsenyi requested a review from a team as a code owner November 7, 2023 21:54
@gszecsenyi gszecsenyi requested review from lajohn4747 and removed request for a team November 7, 2023 21:54
@gszecsenyi gszecsenyi changed the title Support for columns containing only numbers. Support for columns contains only numbers. Nov 7, 2023
@sdv-dev sdv-dev deleted a comment from CLAassistant Nov 8, 2023
@npatki
Contributor

npatki commented Nov 8, 2023

Hello! Thanks for your interest in contributing to the SDV software. Before we are able to review or approve your code changes, we require that you read and sign our new Contributor License Agreement (CLA).

To request a CLA, please fill out the required information in this form: https://bit.ly/sdv-cla-form

Once we receive your submission, we'll get back to you with more details. Thanks, and let us know if you have any questions.

@gszecsenyi
Author

Thank you, I submitted my contact data.

@gszecsenyi
Author

Is there any update? :)

@lajohn4747
Contributor

Hi @gszecsenyi, this was resolved in sdv-dev/SDV#1976. The change is included in SDV version 1.13.0.

@lajohn4747 lajohn4747 closed this Jul 8, 2024