Benchmark comparing the quality of GBDT packages on the Rossmann Store Sales dataset (Kaggle).
The number of hyperopt iterations was set to 50; the final model is then trained on the full training data with the best hyperparameters found.
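The tuning loop can be sketched as follows. This is a minimal stdlib stand-in: plain random search replaces hyperopt's TPE sampler, and the search space and objective are illustrative placeholders, not the benchmark's exact code.

```python
import math
import random

random.seed(0)

def sample_params():
    # Stand-in for the hyperopt search space; the ranges are illustrative only.
    return {
        "learning_rate": math.exp(random.uniform(math.log(0.01), math.log(0.3))),
        "max_depth": random.randint(1, 16),
    }

def cv_rmse(params):
    # Placeholder objective: in the benchmark this trains a GBDT with early
    # stopping and returns the validation RMSE for the sampled params.
    return (params["learning_rate"] - 0.1) ** 2 + 0.001 * params["max_depth"]

best_params, best_score = None, float("inf")
for _ in range(50):  # 50 tuning iterations, as in the benchmark
    params = sample_params()
    score = cv_rmse(params)
    if score < best_score:
        best_params, best_score = params, score

# The final model is then refit on all training data using best_params.
```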
Experiment | Best hyperparameters | RMSE on test |
--- | --- | --- |
catboost with specifying cat features | best_n_estimators = 1415, params = {'random_seed': 0, 'learning_rate': 0.10663314690544494, 'iterations': 1500, 'od_wait': 100, 'one_hot_max_size': 143.0, 'bagging_temperature': 0.39933964736871874, 'random_strength': 1, 'depth': 8.0, 'loss_function': 'RMSE', 'l2_leaf_reg': 5.529962582104021, 'border_count': 254, 'boosting_type': 'Plain', 'bootstrap_type': 'Bayesian'} | 489.75 |
lightgbm with specifying cat features | best_n_estimators = 3396, params = {'num_leaves': 63, 'max_cat_threshold': 2, 'cat_l2': 12.93150760783131, 'verbose': -1, 'bagging_seed': 3, 'max_cat_to_onehot': 2, 'learning_rate': 0.12103165638430856, 'max_delta_step': 0.0, 'data_random_seed': 1, 'cat_smooth': 4.287437698866151, 'min_data_in_leaf': 26, 'bagging_fraction': 0.6207358917316325, 'min_data_per_group': 261, 'min_sum_hessian_in_leaf': 7.515138790064522e-05, 'feature_fraction_seed': 2, 'min_gain_to_split': 0.0, 'lambda_l1': 0, 'bagging_freq': 1, 'lambda_l2': 0.1709660204090765, 'max_depth': -1, 'objective': 'mean_squared_error', 'drop_seed': 4, 'metric': 'l2', 'feature_fraction': 0.8168930995735235} | 504.76 |
xgboost | best_n_estimators = 4011, params = {'reg_alpha': 0.14747200224681817, 'tree_method': 'gpu_hist', 'colsample_bytree': 0.883176060062088, 'silent': 1, 'eval_metric': 'rmse', 'grow_policy': 'depthwise', 'learning_rate': 0.10032091014826115, 'subsample': 0.5740170782945163, 'reg_lambda': 0, 'max_bin': 1020, 'objective': 'reg:linear', 'min_split_loss': 0, 'max_depth': 7} | 490.83 |
The maximum iterations limit was set to 9999 and early_stopping_rounds to 100.
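The early-stopping rule above can be sketched in plain Python. This is a generic outline of the mechanism, not any library's internal implementation:

```python
def early_stopped_rounds(eval_scores, max_iterations=9999, patience=100):
    """Return the number of boosting rounds actually run, given per-round
    validation RMSE, an iteration cap, and an early-stopping patience
    (the benchmark's early_stopping_rounds)."""
    best_score = float("inf")
    best_round = 0
    for round_idx, score in enumerate(eval_scores[:max_iterations], start=1):
        if score < best_score:
            best_score, best_round = score, round_idx
        elif round_idx - best_round >= patience:
            return round_idx  # no improvement for `patience` rounds: stop early
    return min(len(eval_scores), max_iterations)  # hit the iteration cap
```

Runs that "reached max iterations limit" in the tables below correspond to the second return: the validation score kept improving through round 9999.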
Note that CatBoost results differ between the CPU and GPU implementations because the border_count
parameter defaults to 254 in CPU mode and to 128 in GPU mode.
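To make CPU and GPU runs comparable, border_count can be pinned explicitly. A minimal sketch; everything in this dict besides border_count is illustrative:

```python
# Pinning border_count removes the CPU (254) vs GPU (128) default mismatch
# noted above. The surrounding training call is assumed, not shown.
catboost_params = {
    "loss_function": "RMSE",
    "border_count": 254,  # same feature-quantization borders on CPU and GPU
}
```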
CPU - Intel Xeon E312xx (Sandy Bridge) VM, 16 cores.
Experiment | Early stopping time (sec) | RMSE on test | Comments |
--- | --- | --- | --- |
catboost w/o specifying cat features | 212.67 | 578.10 | reached max iterations limit |
catboost with specifying cat features | 894.51 | 520.07 | |
lightgbm w/o specifying cat features | 51.17 | 499.67 | |
lightgbm with specifying cat features | 9.90 | 490.57 | |
xgboost | 272.3 | 567.8 | reached max iterations limit |
GPU - 2x NVIDIA GeForce GTX 1080 Ti.
Experiment | Early stopping time (sec) | RMSE on test | Comments |
--- | --- | --- | --- |
catboost w/o specifying cat features | 39.5 | 575.75 | reached max iterations limit |
catboost with specifying cat features | 90.83 | 528.63 | |
lightgbm w/o specifying cat features | 97.93 | 501.22 | |
lightgbm with specifying cat features | n/a | n/a | Failed: [LightGBM] [Fatal] bin size 1093 cannot run on GPU, see microsoft/LightGBM#1116 |
xgboost in 'gpu_exact' mode | 125.48 | 566.55 | reached max iterations limit |
xgboost in 'gpu_hist' mode | 68.04 | 626.09 | reached max iterations limit |
Hyperparameter distributions:
'n_estimators' : LogUniform(100, 1000, True),
'max_depth' : scipy.stats.randint(low=1, high=16),
'learning_rate' : scipy.stats.uniform(0.01, 1.0)
(see experiments_lib.py file for LogUniform definition)
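The actual LogUniform lives in experiments_lib.py; a plausible stdlib stand-in with the same call shape could look like the following. The meaning of the third argument (rounding samples to integers, as n_estimators needs) is an assumption:

```python
import math
import random

class LogUniform:
    """Sample log-uniformly in [low, high]; a sketch of the LogUniform
    used in experiments_lib.py (the integer flag is a guess)."""
    def __init__(self, low, high, to_int=False):
        self.low, self.high, self.to_int = low, high, to_int

    def rvs(self, random_state=None):
        # Mimics the scipy.stats .rvs() interface so it can sit alongside
        # randint and uniform in the search dictionary above.
        rng = random.Random(random_state)
        value = math.exp(rng.uniform(math.log(self.low), math.log(self.high)))
        return int(round(value)) if self.to_int else value
```

Log-uniform sampling spends equal probability mass per order of magnitude, which suits span-several-decades parameters like n_estimators.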
CPU - Intel Xeon E312xx (Sandy Bridge) VM, 16 cores.
Experiment | Time (sec) | RMSE on test |
--- | --- | --- |
catboost w/o specifying cat features | 239.91 | 568.38 |
catboost with specifying cat features | 1145.98 | 534.13 |
lightgbm w/o specifying cat features | 105.02 | 523.62 |
lightgbm with specifying cat features | 97.94 | 510.53 |
xgboost | 437.8 | 512.74 |
OS - Linux (tested on Ubuntu 16.04 LTS).
Installed packages (via 'pip install'):
- kaggle
- hyperopt
- numpy
- pandas
- scipy
- scikit-learn
Tested on:
- catboost 0.11.0
- lightgbm 2.2.1
- xgboost 0.80
Benchmark steps:
- Download the dataset from Kaggle
- Preprocess it (extract features and save in CatBoost data format)
- Run the benchmarks
(see 'run_all.sh', which performs all these steps)
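The three steps could be driven from Python as well. A hedged sketch: the kaggle CLI invocation follows the standard `kaggle competitions download` form, and the two script names are hypothetical placeholders for the repo's own scripts, not real files:

```python
import subprocess

# Mirrors the run_all.sh steps; script names below are hypothetical.
STEPS = [
    ["kaggle", "competitions", "download", "-c", "rossmann-store-sales"],
    ["python", "preprocess.py"],      # hypothetical: extract features, save CatBoost format
    ["python", "run_benchmarks.py"],  # hypothetical: run the experiments above
]

def run_all(dry_run=True):
    """Print or execute each step in order; dry_run avoids side effects."""
    for cmd in STEPS:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            subprocess.check_call(cmd)
```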