Add a Bayesian Gaussian mixture model transformer #329

Merged: 33 commits from issue-183-BayesianGMM into master on Dec 17, 2021

Conversation

@fealho (Member) commented Dec 1, 2021

Resolves #183.
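
For context, the transformer being added encodes a numeric column with a Bayesian Gaussian mixture: each value becomes a normalized scalar plus a one-hot indicator of its most likely mode (see the snippet under review below). The following is a minimal sketch of that general idea using scikit-learn's BayesianGaussianMixture; the function name, hyperparameters, and clipping choices are illustrative assumptions, not the code added in this PR.

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

def fit_transform_column(column, max_modes=10):
    # Illustrative mode-specific normalization for a 1D numeric column.
    column = np.asarray(column, dtype=float).reshape(-1, 1)

    # Fit a Bayesian GMM; unused components are driven toward near-zero weights.
    bgm = BayesianGaussianMixture(
        n_components=max_modes,
        weight_concentration_prior=0.001,
        max_iter=100,
    )
    bgm.fit(column)

    means = bgm.means_.reshape(-1)
    stds = np.sqrt(bgm.covariances_).reshape(-1)

    # Assign each value to its most likely mode.
    modes = bgm.predict(column)

    # Normalize within the assigned mode and clip to a bounded range.
    normalized = (column.reshape(-1) - means[modes]) / (4 * stds[modes])
    normalized = np.clip(normalized, -0.99, 0.99)

    # One-hot encode the mode assignment: one row per data row.
    one_hot = np.zeros((column.shape[0], max_modes))
    one_hot[np.arange(column.shape[0]), modes] = 1.0

    return np.concatenate([normalized[:, None], one_hot], axis=1)

For a column of length n this produces an array of shape (n, 1 + max_modes): the normalized value followed by the one-hot mode columns, which is the layout discussed in the review threads below.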

@amontanez24 (Contributor) left a comment

Left a couple of comments but it's looking good so far!

Review threads on rdt/transformers/numerical.py (outdated, resolved)
Code under review in rdt/transformers/numerical.py:

one_hot[np.arange(data.shape[0]), discrete_column] = 1.0
data = np.concatenate([data[continuous_name][:, None], one_hot], axis=1)
print(data.shape)
one_hot = np.zeros(shape=(data.shape[1], self._number_of_modes))
Comment from a Contributor:

why are we changing the indexing?
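
For readers following the thread: the snippet above allocates the one-hot matrix from data.shape[1] while the assignment indexes rows with np.arange(data.shape[0]). Below is a small, purely illustrative numpy example of the usual row-indexed one-hot pattern and the shapes involved; the toy values are assumptions and this is not the thread's resolution.

import numpy as np

# Toy sizes chosen only for illustration: 5 rows, 3 mixture modes.
n_rows, n_modes = 5, 3
mode_index = np.array([0, 2, 1, 2, 0])  # mode assigned to each row

# One row per data row; np.arange(n_rows) pairs each row with its mode column.
one_hot = np.zeros((n_rows, n_modes))
one_hot[np.arange(n_rows), mode_index] = 1.0

print(one_hot.shape)  # (5, 3)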

Review threads on tests/unit/transformers/test_numerical.py, tests/unit/transformers/test_bayes_gmm.py, and tests/code_style.py (outdated, resolved)
@codecov-commenter commented Dec 13, 2021

Codecov Report

Merging #329 (755bd7e) into master (8ca1406) will not change coverage.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##            master      #329    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           12        12            
  Lines          829      1010   +181     
==========================================
+ Hits           829      1010   +181     
Impacted Files                  Coverage Δ
rdt/transformers/numerical.py   100.00% <100.00%> (ø)
rdt/hyper_transformer.py        100.00% <0.00%> (ø)
rdt/transformers/datetime.py    100.00% <0.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@amontanez24 (Contributor) left a comment

This is looking pretty good. I just had a few questions.

The main thing I am wondering is whether the NullTransformer should just be used directly. I believe you are using the NumericalTransformer super methods for that reason, and since the NullTransformer will eventually be pulled out, maybe it makes sense to use it directly and not use the NumericalTransformer at all.
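
A rough sketch of what that composition could look like. The SimpleNullHandler below is a hypothetical stand-in for RDT's NullTransformer; its methods and arguments are assumptions made for illustration, not the library's actual API.

import pandas as pd

class SimpleNullHandler:
    # Hypothetical stand-in for RDT's NullTransformer (interface assumed).
    def fit(self, data):
        # Remember a fill value to use for missing entries.
        self._fill_value = data.mean()

    def transform(self, data):
        return data.fillna(self._fill_value)

class BayesGMMTransformerSketch:
    # Delegates null handling instead of inheriting it from NumericalTransformer.
    def __init__(self, null_handler=None):
        self._null_handler = null_handler or SimpleNullHandler()

    def fit(self, data):
        self._null_handler.fit(data)
        clean = self._null_handler.transform(data)
        # ... fit the Bayesian GMM on `clean` here ...

    def transform(self, data):
        clean = self._null_handler.transform(data)
        # ... apply mode-specific normalization to `clean` here ...
        return clean

transformer = BayesGMMTransformerSketch()
transformer.fit(pd.Series([1.0, None, 3.0]))

The point is only the delegation pattern: the GMM-specific logic stays in the transformer while null handling is injected, so pulling the NullTransformer out later would not require changing an inheritance chain.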

Additional review threads on rdt/transformers/numerical.py (resolved)
@fealho fealho marked this pull request as ready for review December 14, 2021 16:47
@fealho fealho requested a review from a team as a code owner December 14, 2021 16:47
@fealho (Member, Author) commented Dec 14, 2021

@amontanez24 The CopulasTransformer doesn't accept the rounding, min_value, max_value parameters. Do you know why? If there is some reason, maybe I should remove them from the bgmTransformer as well.

@amontanez24 (Contributor) replied:

I don't think there is a reason. We just didn't add it when we created those parameters.
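
For anyone unfamiliar with those parameters: in the numerical transformers they constrain the reverse-transformed output, clipping values to bounds and rounding them to a number of decimal digits. A simplified, assumption-level sketch of that post-processing follows; the postprocess function is illustrative, not the actual NumericalTransformer code.

import numpy as np

def postprocess(values, min_value=None, max_value=None, rounding=None):
    # Illustrative post-processing of reverse-transformed numeric output.
    values = np.asarray(values, dtype=float)

    # Clip to the requested bounds, if any were given.
    if min_value is not None or max_value is not None:
        values = np.clip(values, min_value, max_value)

    # Round to a fixed number of decimal digits, if requested.
    if rounding is not None:
        values = np.round(values, rounding)

    return values

print(postprocess([1.237, -5.0, 9.9], min_value=0, max_value=9, rounding=1))
# -> [1.2 0.  9. ]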

@amontanez24 (Contributor) left a comment

I think this is ready!

@fealho fealho requested a review from csala December 14, 2021 18:04
Further review threads on rdt/transformers/numerical.py (resolved)
@csala (Contributor) left a comment

Thanks for addressing the feedback @fealho.
This looks good to go now!

@fealho fealho merged commit 38b0c60 into master Dec 17, 2021
@fealho fealho deleted the issue-183-BayesianGMM branch December 17, 2021 17:57
Linked issue: Add a BayesianGMM Transformer (#183)