Describe the bug
A RandomForestClassifier that is trained with float64 data segfaults when it's converted to a Treelite object (via convert_to_treelite_model()). The conversion is successful when the model is trained with float32 instead.
Steps/Code to reproduce bug
In the following code snippet, function1 segfaults, whereas function2 runs successfully.
import numpy as np
from cuml.ensemble import RandomForestClassifier as curfc
from sklearn.datasets import load_iris

def function1():
    # load_iris returns float64 features, so the forest is trained on float64
    X, y = load_iris(return_X_y=True)
    cuml_model = curfc(max_features=1.0, max_samples=0.1, n_bins=128,
                       min_samples_leaf=2, random_state=123,
                       n_streams=1, n_estimators=10, max_leaves=-1,
                       max_depth=16, accuracy_metric="mse")
    cuml_model.fit(X, y)
    tl_model = cuml_model.convert_to_treelite_model()  # segfaults here

def function2():
    X, y = load_iris(return_X_y=True)
    # Cast to float32 features and int32 labels before training
    X, y = X.astype(np.float32), y.astype(np.int32)
    cuml_model = curfc(max_features=1.0, max_samples=0.1, n_bins=128,
                       min_samples_leaf=2, random_state=123,
                       n_streams=1, n_estimators=10, max_leaves=-1,
                       max_depth=16, accuracy_metric="mse")
    cuml_model.fit(X, y)
    tl_model = cuml_model.convert_to_treelite_model()  # succeeds
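
Since a segfault takes down the whole Python process, it may be easier to confirm the crash by running each function in a child process and inspecting its exit status. The sketch below uses only the standard library. The helper names run_isolated and describe_exit are hypothetical and not part of cuML. It assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys

def run_isolated(code: str) -> int:
    """Run a Python snippet in a fresh interpreter and return its exit status."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode

def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child died from that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status " + str(returncode)
```

For example, assuming the snippet above is saved as a module named repro (a hypothetical name), describe_exit(run_isolated("import repro; repro.function1()")) would be expected to report SIGSEGV for function1 if the crash reproduces, while the parent interpreter keeps running.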
Expected behavior
A segmentation fault should not occur. At minimum, cuML should throw an error saying that float64 trees cannot be converted to a Treelite object.
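
Until such a check exists in cuML, a user-side guard along these lines could produce the requested error before the conversion is attempted. This is only a sketch, not cuML code: check_treelite_convertible and UnsupportedDtypeError are hypothetical names, and it assumes the caller passes the dtype of the training data (for example X.dtype, or a string such as "float64").

```python
class UnsupportedDtypeError(TypeError):
    """Raised when a model trained on a non-float32 dtype is about to be converted."""

def check_treelite_convertible(training_dtype) -> None:
    """Raise a clear error unless the training data was float32.

    Accepts a dtype-like object with a `name` attribute (e.g. a NumPy dtype)
    or a plain string such as "float64".
    """
    name = getattr(training_dtype, "name", str(training_dtype))
    if name != "float32":
        raise UnsupportedDtypeError(
            "Model trained on " + name + " data cannot be converted to a "
            "Treelite object; train on float32 data instead."
        )
```

In the reproducer, calling check_treelite_convertible(X.dtype) just before convert_to_treelite_model() would raise for function1 and pass silently for function2.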
Environment details:
- Environment location: AWS
- Linux Distro/Architecture: Ubuntu 18.04 amd64
- GPU Model/Driver: Tesla T4 and driver 470.82.00
- CUDA: 11.4
- Method of cuDF & cuML install: from source, commit hash 34f7929