
[GPU] lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0), lightgbm.basic.LightGBMError: Check failed: (best_split_info.right_count) > (0) #6469

@Oct4Pie

Description

When training LightGBM on the GPU, training fails with lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0) (or the equivalent right_count check). The error occurs when LightGBM attempts to split a node and the chosen split leaves one of the resulting child nodes with zero data points. The same training run completes without error on the CPU.

Reproducible example

import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def generate_synthetic_data(n_samples=10000, n_features=50):
    np.random.seed(42)
    X = np.random.rand(n_samples, n_features)
    y = np.sum(X, axis=1) + np.random.randn(n_samples) * 0.1
    return X, y


def check_data_variability(X_train, y_train):
    X_train_df = pd.DataFrame(X_train)
    y_train_series = pd.Series(y_train)

    print("X_train Feature Variability:")
    print(X_train_df.describe().transpose())
    print("\nNumber of unique values in each feature:")
    print(X_train_df.nunique())

    print("\ny_train Target Variability:")
    print(y_train_series.describe())
    print("Number of unique values in target:", y_train_series.nunique())


def initialize_gpu_model():
    params = {
        "boosting_type": "gbdt",
        "objective": "regression",
        "metric": "rmse",
        "learning_rate": 0.01,
        # "num_leaves": 15,
        # "max_depth": 5,
        # "min_child_samples": 1,
        # "min_child_weight": 1e-3,  # Align with min_child_samples
        # "min_split_gain": 0.1,
        "n_estimators": 10000,
        # "subsample": 0.1,
        # "subsample_freq": 1,
        # "colsample_bytree": 0.1,
        # "reg_alpha": 0.1,
        # "reg_lambda": 0.1,
        "verbose": 100,
        "device": "gpu",
    }
    model = lgb.LGBMRegressor(**params)
    print(model.get_params())
    return model


def main():
    X, y = generate_synthetic_data()
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    check_data_variability(X_train, y_train)

    model = initialize_gpu_model()

    model.fit(
        X_train,
        y_train,
        eval_set=[(X_test, y_test)],
        eval_metric="rmse",
    )
    y_pred = model.predict(X_test)

    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    print(f"RMSE: {rmse}")


if __name__ == "__main__":
    main()
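
To make the GPU/CPU difference easy to compare side by side, the helper below is a small sketch added for illustration (try_device is not part of the script above). It runs the same fit on either device and catches lightgbm.basic.LightGBMError, the exception raised in the GPU case:

import lightgbm as lgb


def try_device(device, X_train, y_train, X_test, y_test):
    # Same parameters as the reproducible example; only the device differs.
    params = {
        "boosting_type": "gbdt",
        "objective": "regression",
        "metric": "rmse",
        "learning_rate": 0.01,
        "n_estimators": 10000,
        "device": device,
    }
    model = lgb.LGBMRegressor(**params)
    try:
        model.fit(X_train, y_train, eval_set=[(X_test, y_test)], eval_metric="rmse")
        print(f"device={device}: training completed")
    except lgb.basic.LightGBMError as err:
        # On my setup only device="gpu" ends up here, with
        # "Check failed: (best_split_info.left_count) > (0)".
        print(f"device={device}: LightGBMError: {err}")


# Usage, after generating the train/test split as in main():
# try_device("cpu", X_train, y_train, X_test, y_test)  # completes
# try_device("gpu", X_train, y_train, X_test, y_test)  # raises the error above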

Environment info

lightgbm versions 4.2.0 and 4.3.0

Command(s) you used to install LightGBM

$ sh build-python.sh install --gpu

run from the release tag branches, and

$ cmake -DUSE_GPU=ON

to build lib_lightgbm.dylib

macOS 14.4.1 (23E224)
Apple Silicon M1
Tested with Python versions 3.10, 3.11, and 3.12, both with and without conda
cmake version 3.29.3
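
A quick way to confirm which LightGBM build Python is actually importing (useful given the conda and non-conda installs above); this is just a verification sketch, not part of the build steps:

import lightgbm as lgb

# Confirm the version (4.2.0 / 4.3.0 above) and the install location,
# e.g. to tell a conda environment apart from a non-conda one.
print(lgb.__version__)
print(lgb.__file__)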

Additional Comments

The error only occurs with GPU training (device: "gpu").
The same parameters work fine when device: "cpu" is used.
Adjusting parameters like num_leaves, min_child_samples, max_depth, etc., to more conservative values did not resolve the issue.
Also, generate_synthetic_data(n_samples, n_features) with n_samples below roughly 2000 does not trigger the error; it only appears once the input data becomes large. Subsampling therefore avoids the crash (see the sketch after these comments), but in my tests it significantly hurt performance and the quality of the boosting.
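
For completeness, this is the kind of subsampling configuration I mean. It is a sketch only; the 0.1 ratio is taken from the commented-out block in the reproducible example, not a recommendation. It sidesteps the crash on GPU but noticeably degrades the boosting:

import lightgbm as lgb

# Same setup as the reproducible example, with row subsampling enabled.
params = {
    "boosting_type": "gbdt",
    "objective": "regression",
    "metric": "rmse",
    "learning_rate": 0.01,
    "n_estimators": 10000,
    "device": "gpu",
    "subsample": 0.1,     # train each tree on 10% of the rows
    "subsample_freq": 1,  # resample at every boosting iteration
}
model = lgb.LGBMRegressor(**params)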
