Add Multinomial and Bernoulli Naive Bayes variants #4053
Conflicts: python/cuml/naive_bayes/naive_bayes.py
…parts that do work.
…when fitting all classes at once instead of one at a time.
… little differently than sklearn
Overall it looks very nice. Found only minor things as usual.
from cuml.common.kernel_utils import cuda_kernel_factory


def _binarize_kernel(x_dtype):
Not something that has to be fixed in this PR since it's being used everywhere already, but CuPy now supports writing kernels with template arguments, so we should be able to remove the use of cuda_kernel_factory everywhere in the codebase. It should also make our kernel invocations look much cleaner.
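A rough sketch of what that suggestion could look like, assuming CuPy's `RawModule` with its `name_expressions` argument. The kernel body, the `CPP_TYPES` mapping, and the helper names `name_expression` and `binarize_inplace` are illustrative assumptions and are not taken from the cuML code in this PR.

```python
# Templated CUDA source: a single definition covers every input dtype,
# so no per-dtype string substitution is needed.
# NOTE: this kernel body is a hypothetical stand-in, not cuML's kernel.
BINARIZE_SRC = r"""
template <typename T>
__global__ void binarize(T* x, const T threshold, const int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        x[i] = x[i] > threshold ? T(1) : T(0);
    }
}
"""

# Hypothetical mapping from NumPy dtype names to C++ type names.
CPP_TYPES = {"float32": "float", "float64": "double", "int32": "int"}


def name_expression(dtype_name):
    """Return the template instantiation to request, e.g. 'binarize<float>'."""
    return "binarize<{}>".format(CPP_TYPES[dtype_name])


def binarize_inplace(x, threshold=0.0):
    """Binarize a 1-D CuPy array in place (needs CuPy and a CUDA GPU)."""
    import cupy as cp  # imported lazily so this module loads without a GPU

    # Ask CuPy to instantiate the template for each supported type.
    exprs = [name_expression(name) for name in CPP_TYPES]
    module = cp.RawModule(
        code=BINARIZE_SRC,
        options=("-std=c++11",),
        name_expressions=exprs,
    )
    # Look up the instantiation matching the array's dtype.
    kernel = module.get_function(name_expression(x.dtype.name))

    threads = 128
    blocks = (x.size + threads - 1) // threads
    kernel(
        (blocks,),
        (threads,),
        (x, x.dtype.type(threshold), cp.int32(x.size)),
    )
    return x
```

With this shape, the dtype is selected through the template argument when the function is looked up, rather than by generating a separate source string per dtype.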
LGTM pending successful CI. It looks like there are a couple failures in the pickling tests. Thanks Mickael!
Codecov Report
@@ Coverage Diff @@
## branch-21.08 #4053 +/- ##
===============================================
Coverage ? 85.77%
===============================================
Files ? 231
Lines ? 18261
Branches ? 0
===============================================
Hits ? 15664
Misses ? 2597
Partials ? 0
@gpucibot merge
This is a continuation of PR #1763 and #4053, to add Gaussian Naive Bayes. This is supposed to be merged after #4053. Here is a comparison of cuML and scikit-learn performance on Gaussian NB. This is done using a synthetic dataset generated by make_regression. The GPU used is an RTX 8000, and the CPU is an i9-10920X @ 3.50GHz.  Linking issue #1666. Authors: - Micka (https://github.com/lowener) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: #4079
This is a continuation of PR #1763, #4053, and #4079, to add Categorical Naive Bayes. This is supposed to be merged after #4079. Linking issue #1666. Authors: - Micka (https://github.com/lowener) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: #4150
This is a continuation of PR rapidsai#1763, to add Multinomial and Bernoulli NB variants. The Gaussian and Categorical variants will be added in a following PR. Also linking issue rapidsai#1666 Authors: - Micka (https://github.com/lowener) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4053
This is a continuation of PR #1763, to add Multinomial and Bernoulli NB variants.
The Gaussian and Categorical variants will be added in a following PR.
Also linking issue #1666