[python-package] predict_proba not working properly when num_iteration is greater than the best iteration from early stopping #6687

Open

Description

When an LGBMClassifier is trained with early stopping and predict_proba is then called with a 'num_iteration' value greater than the model's 'best_iteration_' attribute, the predictions appear to be computed with 'best_iteration_' iterations rather than with the number passed in the parameter.

Reproducible example

from sklearn.datasets import load_breast_cancer
from sklearn.metrics import log_loss
from lightgbm import LGBMClassifier, early_stopping

X, y = load_breast_cancer(return_X_y=True)
lgb = LGBMClassifier(n_estimators=1000)
lgb.fit(X[:400], y[:400], eval_set=(X[400:], y[400:]), callbacks=[early_stopping(10)])

# Log loss recomputed from predict_proba for iterations best-5 .. best+4
for i in range(lgb.best_iteration_ - 5, lgb.best_iteration_ + 5):
    print(log_loss(y[400:], lgb.predict_proba(X[400:], num_iteration=i)))

# Log loss recorded during training for the same iterations (the list is 0-based, hence the shift by one)
print(lgb.evals_result_['valid_0']['binary_logloss'][lgb.best_iteration_ - 6:lgb.best_iteration_ + 4])

Environment info

LightGBM version 4.5
