Summary
Argument is_reshape should be removed from Booster.predict() in the Python package.
In all cases where LightGBM returns more than one predicted value per row in the input data, Booster.predict() should just return a 2D matrix with num_rows rows.
Motivation
Would simplify the prediction code, reducing the risk of mistakes introduced during refactoring.
Would simplify the public API of the Python package, making it easier for other code that wraps lightgbm to integrate with it.
The impending v4.0.0 release already contains many breaking changes, which makes it a good opportunity for this one: users upgrading to 4.0.0 will have to deal with other breaking changes anyway.
Description
For some tasks, LightGBM's predict() routine produces a single predicted value per row in the input data. In those situations, the Python package returns a 1D array.
In other situations, predict() produces more than one number per row in the input data. For example, in multiclass classification, predict() produces, for each row in the input data, one value per class in the target: either a class probability or a raw value of the objective function.
In such situations, it's often desirable to work with those predictions as a (num_rows, num_classes) matrix, instead of as a 1-dimensional row-major array.
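To make the two layouts concrete, here is a minimal NumPy-only sketch. It does not call lightgbm; the array flat_preds is a made-up stand-in for multiclass output, and the sizes are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Assumed sizes, for illustration only.
num_rows, num_classes = 3, 4

# Stand-in for multiclass predictions stored as a row-major 1D array:
# all classes for row 0, then all classes for row 1, and so on.
flat_preds = np.arange(num_rows * num_classes, dtype=np.float64)

# The same values viewed as a (num_rows, num_classes) matrix.
matrix_preds = flat_preds.reshape(num_rows, num_classes)

# Element (row, cls) of the matrix sits at offset row * num_classes + cls
# in the flat array.
row, cls = 1, 2
assert matrix_preds[row, cls] == flat_preds[row * num_classes + cls]
```

With the matrix form, per-row operations such as matrix_preds.argmax(axis=1) need no manual index arithmetic.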
Returning such a matrix is the Python package's default behavior, but it can be overridden by passing argument is_reshape to Booster.predict().
LightGBM/python-package/lightgbm/basic.py, line 3492 in 33eb037
When is_reshape is set to True (the default), lightgbm converts the predictions into such a matrix.
LightGBM/python-package/lightgbm/basic.py, lines 835 to 839 in 33eb037
If is_reshape is set to False, lightgbm returns a row-major 1D vector.
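The flag's effect can be approximated with a small sketch. The helper shape_predictions below is hypothetical and is not the code in basic.py; it only mimics the behavior described here using NumPy. It also shows that a caller who wants the flat vector can recover it from the matrix, so dropping the flag should not lose any capability.

```python
import numpy as np


def shape_predictions(flat_preds, num_rows, is_reshape=True):
    """Hypothetical helper mimicking the is_reshape flag; not LightGBM's code."""
    # Only reshape when there is more than one value per input row.
    if is_reshape and flat_preds.size != num_rows:
        if flat_preds.size % num_rows != 0:
            raise ValueError(
                f"Cannot split {flat_preds.size} predictions evenly "
                f"across {num_rows} rows"
            )
        return flat_preds.reshape(num_rows, -1)
    return flat_preds


# Made-up predictions: 2 rows, 3 values per row.
flat = np.linspace(0.0, 1.0, 6)

as_matrix = shape_predictions(flat, num_rows=2)                    # 2D matrix
as_vector = shape_predictions(flat, num_rows=2, is_reshape=False)  # 1D vector

# The row-major 1D vector is recoverable from the matrix.
recovered = as_matrix.ravel()
```

Under this sketch, code that currently passes is_reshape=False could call .ravel() on the returned matrix instead.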
References
Created based on similar changes in the R package, #4971.