
[REVIEW] RF: Variable binning and other minor refactoring #4479

Merged · 29 commits merged into rapidsai:branch-22.04 on Feb 3, 2022

Conversation

@venkywonka (Contributor) commented Jan 12, 2022

  • This PR enables variable bins capped to max_n_bins for the feature-quantiles. This makes Decision Trees more robust across a wider variety of datasets by avoiding redundant bins for columns that have fewer unique values (see the sketch at the end of this comment).
  • Added tests for the same
  • Some accompanying changes to the naming and format of the structures used for passing input and quantiles
  • Deleted the cpp/test/sg/decisiontree_batchedlevel_* files as they are not tested.
  • Changed the param n_bins to max_nbins in the C++ layer to differentiate its meaning from the actual n_bins used [here](https://github.com/rapidsai/cuml/blob/eb62fecce4211a1022cf19380be31981680fc5ab/cpp/src/decisiontree/batched-levelalgo/kernels/builder_kernels_impl.cuh#L266)
  • The Python layer still maintains n_bins, but the docstrings have been tweaked to convey that it denotes the "maximum bins used"
  • Other variable renamings in the core Decision Tree classes

This PR does not improve the perf of the GBM-bench datasets by much, as almost all of their features have more unique values than the n_bins used.

![comparison_gbm_main_vs_test](https://user-images.githubusercontent.com/23023424/149161138-9dc1cfea-9890-4f96-8eef-8b938e44d10c.png)
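
To illustrate the idea only (this is not the actual cuML implementation), here is a minimal host-side sketch of per-feature binning capped at max_n_bins, assuming equal-frequency cut points over a sorted column followed by deduplication:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sketch: pick up to max_n_bins equal-frequency cut points from a sorted
// column, then deduplicate them so a column with only k unique values ends up with
// k <= max_n_bins non-redundant bins.
std::vector<double> feature_quantiles(std::vector<double> column, int max_n_bins)
{
  if (column.empty()) return {};
  std::sort(column.begin(), column.end());

  std::vector<double> quantiles;
  quantiles.reserve(max_n_bins);
  const double bin_width = static_cast<double>(column.size()) / max_n_bins;
  for (int bin = 0; bin < max_n_bins; ++bin) {
    // index of the last sample that falls inside this equal-frequency bin
    int idx = std::min<int>(static_cast<int>((bin + 1) * bin_width) - 1,
                            static_cast<int>(column.size()) - 1);
    quantiles.push_back(column[std::max(idx, 0)]);
  }
  // drop duplicate cut points in place, mirroring the thrust::unique step discussed below
  quantiles.erase(std::unique(quantiles.begin(), quantiles.end()), quantiles.end());
  return quantiles;  // quantiles.size() is the per-feature bin count actually used
}
```

A column with, say, only 3 distinct values then yields 3 bins regardless of max_n_bins.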

@dantegd dantegd requested a review from RAMitchell January 12, 2022 16:08
@venkywonka venkywonka changed the title [ENH] RF: Variable binning and stream-parallel quantile computation [WIP] RF: Variable binning and stream-parallel quantile computation Jan 13, 2022
@github-actions github-actions bot removed the CMake label Jan 13, 2022
@RAMitchell (Contributor) left a comment:

Could you create a small struct like:

```cpp
// Non-owning data structure for accessing quantiles
// Can be used in kernels
template <typename DataT, typename IdxT>
class Quantiles {
 private:
  DataT* quantiles;  // CSR matrix where each row contains the quantiles for a feature
  int* row_offsets;
  IdxT num_features;
  int max_bins;

 public:
  IdxT FeatureValueToBinIdx(IdxT feature_idx, DataT value) {}
  DataT BinIdxToFeatureValue(IdxT feature_idx, IdxT bin_idx) {}
};
```

Maybe the above doesn't work because we need to cache in shared memory; perhaps you can think of something. It is useful to semantically group functionality into an object instead of having a bunch of variables or pointers hanging around 'loose' in kernels. We can make the pointers private and control the way they are used via the public interface. These are just thoughts, so it's okay if it doesn't work.
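
One possible (hypothetical) shape for those accessors, assuming row_offsets follows CSR conventions so that the quantiles for feature f live in [row_offsets[f], row_offsets[f + 1]); the struct name, fields, and linear scan below are illustrative, not the merged code:

```cpp
// Sketch only: the linear scan could be replaced by a binary search such as thrust::upper_bound.
template <typename DataT, typename IdxT>
struct QuantilesView {
  const DataT* quantiles;   // concatenated per-feature quantiles (CSR values)
  const IdxT* row_offsets;  // CSR row offsets, length num_features + 1
  IdxT num_features;

  __host__ __device__ IdxT FeatureValueToBinIdx(IdxT feature_idx, DataT value) const
  {
    const IdxT start = row_offsets[feature_idx];
    const IdxT end   = row_offsets[feature_idx + 1];
    for (IdxT b = start; b < end; ++b) {  // first bin whose upper cut point covers value
      if (value <= quantiles[b]) return b - start;
    }
    return end - start - 1;  // clamp to the last bin
  }

  __host__ __device__ DataT BinIdxToFeatureValue(IdxT feature_idx, IdxT bin_idx) const
  {
    return quantiles[row_offsets[feature_idx] + bin_idx];
  }
};
```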

```cpp
__syncthreads();

for (int bin = threadIdx.x; bin < n_bins; bin += blockDim.x) {
  if (bin >= unq_nbins) break;
```
@RAMitchell (Contributor) commented:

Shouldn't the above loop just be < unq_nbins?

Sidenote: the Google C++ Style Guide and the C++ Core Guidelines recommend not abbreviating function names. Long is okay (within reason); aim for clear and unambiguous. Variable names are critical to code readability and maintenance.

@venkywonka (Contributor, Author) replied:
oh yea 😅
thanks rory 👍🏻

@venkywonka venkywonka added the non-breaking Non-breaking change label Jan 17, 2022
@venkywonka (Contributor, Author) commented Jan 17, 2022

> Could you create a small struct like: [...] It is useful to semantically group functionality into an object instead of having a bunch of variables or pointers hanging around 'loose' in kernels.

I have separated the quantiles from the Input struct into its own struct that is used both on the host and the device. Hope that addresses the problem.

@venkywonka venkywonka changed the title [WIP] RF: Variable binning and stream-parallel quantile computation [WIP] RF: Variable binning Jan 18, 2022
@venkywonka venkywonka changed the base branch from branch-22.02 to branch-22.04 January 24, 2022 04:54
@venkywonka venkywonka marked this pull request as ready for review January 24, 2022 18:34
@venkywonka venkywonka requested a review from a team as a code owner January 24, 2022 18:34
@venkywonka venkywonka changed the title [WIP] RF: Variable binning [REVIEW] RF: Variable binning Jan 24, 2022
@venkywonka venkywonka added the improvement Improvement / enhancement to an existing function label Jan 24, 2022
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Jan 28, 2022
@venkywonka venkywonka requested a review from a team as a code owner January 28, 2022 15:24
@RAMitchell (Contributor) left a comment:

I think there is an accidentally duplicated quantile kernel here, but otherwise all looks good.

```cpp
    Input<DataT, LabelT, IdxT> input,
    NodeWorkItem* work_items,

__global__ void nodeSplitKernel(const IdxT max_depth,
                                const IdxT min_samples_leaf,
```
@RAMitchell (Contributor) commented:
Good job making these const. Less important for single integers, but very important for pointers (or structs containing pointers).

```cpp
for (IdxT b = threadIdx.x; b < nbins; b += blockDim.x)
  sbins[b] = input.quantiles[col * nbins + b];
for (IdxT b = threadIdx.x; b < max_nbins; b += blockDim.x) {
  if (b >= nbins) break;
```
@RAMitchell (Contributor) commented:
I think you can just change the loop condition.

@venkywonka (Contributor, Author) replied:
you're right, have changed it 👍🏾
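
For reference, the change presumably just folds the early-exit check into the loop condition; a sketch, where the buffer name and the max_nbins stride are assumptions based on the per-feature layout implied by q.quantiles_array:

```cpp
// Presumed shape of the fix: loop only over the nbins cut points that exist for this column.
for (IdxT b = threadIdx.x; b < nbins; b += blockDim.x) {
  sbins[b] = quantiles_array[col * max_nbins + b];  // hypothetical buffer name and stride
}
```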


```cpp
if (threadIdx.x == 0) {
  // make quantiles unique, in-place
  auto new_last = thrust::unique(thrust::device, quantiles, quantiles + max_nbins);
```
@venkywonka (Contributor, Author) replied:

Oh I see, thanks for the article Rory... I haven't encountered such an error here, but I have changed the policy to thrust::seq to be on the safer side 👍🏾
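
For reference, a sketch of the thrust::seq variant (variable names taken from the snippet above; n_unique is a hypothetical name for the deduplicated count):

```cpp
// requires <thrust/unique.h> and <thrust/execution_policy.h>
if (threadIdx.x == 0) {
  // make quantiles unique, in-place; thrust::seq runs the algorithm sequentially in this
  // single thread, avoiding any device-side launch machinery
  auto new_last = thrust::unique(thrust::seq, quantiles, quantiles + max_nbins);
  int n_unique  = static_cast<int>(new_last - quantiles);  // per-feature bin count actually used
}
```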

```cpp
namespace DT {

template <typename T>
__global__ void computeQuantilesKernel(
```
@RAMitchell (Contributor) commented:
Am I seeing things or is this kernel defined twice?

@venkywonka (Contributor, Author) replied:
oh that's on me 😅 thank you for the catch!

```cpp
{
  double bin_width = static_cast<double>(n_rows) / max_nbins;

  for (int bin = threadIdx.x; bin < max_nbins; bin += blockDim.x) {
```
@RAMitchell (Contributor) commented:
I see you changed this to use a single block, which makes sense.
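
Pieced together from the snippets in this thread (the kernel name, the signature in the header diff further down, and the bin_width loop above), the single-block kernel plausibly looks something like the following; treat it as a reconstruction, not the exact merged source:

```cpp
// Reconstruction: one block works on a pre-sorted column, each thread striding over the
// bins and picking an equal-frequency cut point.
template <typename T>
__global__ void computeQuantilesKernel(T* quantiles,
                                       const T* sorted_data,
                                       const int max_nbins,
                                       const int n_rows)
{
  double bin_width = static_cast<double>(n_rows) / max_nbins;
  for (int bin = threadIdx.x; bin < max_nbins; bin += blockDim.x) {
    // index of the last sample falling into this equal-frequency bin
    int idx        = min(static_cast<int>((bin + 1) * bin_width) - 1, n_rows - 1);
    quantiles[bin] = sorted_data[max(idx, 0)];
  }
  // (the in-place deduplication via thrust::unique, shown earlier, follows a __syncthreads())
}
```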

```diff
@@ -106,7 +106,7 @@ class RandomForestClassifier(BaseRandomForestModel, DelayedPredictionMixin,
         * If ``'log2'`` then ``max_features=log2(n_features)/n_features``.
         * If ``None``, then ``max_features = 1.0``.
     n_bins : int (default = 128)
-        Number of bins used by the split algorithm.
+        Maximum number of bins used by the split algorithm per feature.
```
@RAMitchell (Contributor) commented:
Good clarification!

@vinaydes (Contributor) left a comment:

Looks good to me apart from minor nitpicks mentioned in the comments.

```diff
@@ -84,13 +84,13 @@ struct DecisionTreeParams {
 * i.e., GINI for classification or MSE for regression
 * @param[in] cfg_max_batch_size: Maximum number of nodes that can be processed
              in a batch. This is used only for batched-level algo. Default
-             value 128.
+             value 4096.
```
@vinaydes (Contributor) commented:
Good catch 👍

```diff
- max_blocks = 1 + params.max_batch_size + input.nSampledRows / TPB_DEFAULT;
- ASSERT(quantiles != nullptr, "Currently quantiles need to be computed before this call!");
+ max_blocks = 1 + params.max_batch_size + dataset.nSampledRows / TPB_DEFAULT;
+ ASSERT(q.quantiles_array != nullptr,
```
@vinaydes (Contributor) commented:
q.n_uniquebins_array should also be checked?

@venkywonka (Contributor, Author) replied:
done vinay 👍🏾
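
Presumably the added check mirrors the existing one, along these lines (field names as they appear in the comment above; the message string is illustrative):

```cpp
// Presumed shape of the added check on the unique-bin counts array
ASSERT(q.quantiles_array != nullptr,
       "Currently quantiles need to be computed before this call!");
ASSERT(q.n_uniquebins_array != nullptr,
       "Currently quantiles need to be computed before this call!");
```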

```cpp
// populating shared memory with initial values
for (IdxT i = threadIdx.x; i < shared_histogram_len; i += blockDim.x)
  shared_histogram[i] = BinT();
for (IdxT b = threadIdx.x; b < nbins; b += blockDim.x)
  sbins[b] = input.quantiles[col * nbins + b];
for (IdxT b = threadIdx.x; b < nbins; b += blockDim.x) {
```
@vinaydes (Contributor) commented:
Are we going for {} for single-line loops or not? The loop above does not seem to have them.

```diff
   const double* sorted_data,
-  const int length);
+  const int max_nbins,
+  const int n_rows);
 
 }  // end namespace DT
 }  // end namespace ML
```
@vinaydes (Contributor) commented:
Let's add an EOF newline to keep git happy.

@venkywonka venkywonka changed the title [REVIEW] RF: Variable binning [REVIEW] RF: Variable binning and other minor refactoring Feb 2, 2022
@vinaydes (Contributor) commented Feb 3, 2022

Good to merge from my side. Any idea about the build failures? Perhaps we need to merge the latest 22.04?

@codecov-commenter commented
Codecov Report

❗ No coverage uploaded for the pull request base (branch-22.04@7a18ae3).
The diff coverage is n/a.

```
@@               Coverage Diff               @@
##             branch-22.04    #4479   +/-   ##
===============================================
  Coverage                ?   85.71%
===============================================
  Files                   ?      236
  Lines                   ?    19365
  Branches                ?        0
===============================================
  Hits                    ?    16599
  Misses                  ?     2766
  Partials                ?        0
```

| Flag | Coverage Δ |
|---|---|
| dask | 46.48% <0.00%> (?) |
| non-dask | 78.63% <0.00%> (?) |

Flags with carried forward coverage won't be shown.

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7a18ae3...b8d92c0.

@dantegd (Member) commented Feb 3, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 962d6f0 into rapidsai:branch-22.04 Feb 3, 2022
cjnolet added a commit to cjnolet/cuml that referenced this pull request Feb 4, 2022
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023

Authors:
  - Venkat (https://github.com/venkywonka)

Approvers:
  - Rory Mitchell (https://github.com/RAMitchell)
  - Vinay Deshpande (https://github.com/vinaydes)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#4479
Labels: CUDA/C++ · Cython / Python · improvement · non-breaking