[python-package] remove is_reshape in Booster.predict() #5115

Closed
@jameslamb

Description

Summary

Argument is_reshape should be removed from Booster.predict() in the Python package.

In all cases where LightGBM returns more than one predicted value per row in the input data, Booster.predict() should just return a 2D matrix with num_rows rows.

Motivation

Removing is_reshape would simplify the prediction code, reducing the risk of mistakes during refactoring.

It would also simplify the public API of the Python package, making it easier for other code that wraps lightgbm to integrate with it.

The impending v4.0.0 release, which already contains many breaking changes, is a good opportunity to make this change, since users upgrading to 4.0.0 will have to deal with other breaking changes anyway.

Description

For some tasks, LightGBM's predict() routine produces a single predicted value per row in the input data. In those situations, the Python package returns a 1D array.

In other situations, predict() produces more than one number per row in the input data. For example, in multiclass classification, for each row in the input data predict() produces one class probability or raw value of the objective function per class in the target.

In such situations, it's often desirable to work with those predictions as a (num_rows, num_classes) matrix, instead of as a 1-dimensional row-major array.
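To make the two layouts concrete, here is a minimal sketch using only NumPy. The array contents and the names flat_preds and pred_matrix are made up for illustration and do not come from lightgbm; the sketch assumes the flat array stores all values for row 0 first, then row 1, and so on.

```python
import numpy as np

num_rows, num_classes = 3, 4

# Hypothetical flat predictions in row-major order:
# [row0_class0, row0_class1, ..., row0_class3, row1_class0, ...]
flat_preds = np.arange(num_rows * num_classes, dtype=np.float64)

# View the same values as a (num_rows, num_classes) matrix.
pred_matrix = flat_preds.reshape(num_rows, num_classes)

# Row i of the matrix is the i-th block of num_classes values in the flat array.
second_row = pred_matrix[1]
same_values = flat_preds[num_classes:2 * num_classes]
```

With the matrix form, per-row operations such as picking the highest-scoring class become a single call like pred_matrix.argmax(axis=1), without manual index arithmetic.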

Returning such a matrix is the Python package's default behavior, but it can be overridden by passing is_reshape=False to Booster.predict().

data_has_header=False, is_reshape=True, **kwargs):

When that is set to True (the default), lightgbm converts the predictions into such a matrix.

    if is_reshape and not is_sparse and preds.size != nrow:
        if preds.size % nrow == 0:
            preds = preds.reshape(nrow, -1)
        else:
            raise ValueError(f'Length of predict result ({preds.size}) cannot be divide nrow ({nrow})')

If set to False, lightgbm will return a row-major 1D vector.
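The two behaviors can be sketched outside of lightgbm with plain NumPy. The helper reshape_predictions below is hypothetical: it only mirrors the reshape check quoted above (leaving out the sparse-output case) and is not part of the lightgbm API. It also suggests how a caller who still wants the flat vector could get it from the matrix if is_reshape were removed.

```python
import numpy as np


def reshape_predictions(preds, nrow, is_reshape=True):
    """Hypothetical helper mirroring the reshape check quoted above."""
    if is_reshape and preds.size != nrow:
        if preds.size % nrow != 0:
            raise ValueError(
                f"Length of predict result ({preds.size}) "
                f"is not divisible by nrow ({nrow})"
            )
        # -1 lets NumPy infer the number of values per row.
        return preds.reshape(nrow, -1)
    return preds


# Made-up predictions for 2 rows and 3 classes, stored row-major.
flat = np.array([0.1, 0.7, 0.2, 0.6, 0.3, 0.1])

as_matrix = reshape_predictions(flat, nrow=2)                    # 2D, one row per input row
as_vector = reshape_predictions(flat, nrow=2, is_reshape=False)  # left as a flat 1D array

# ravel() flattens in row-major (C) order by default,
# so it recovers the flat layout from the matrix.
round_trip = as_matrix.ravel()
```

Because the flat layout is recoverable from the matrix with one call, dropping the flag would not take any capability away from callers.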

References

Created based on similar changes in the R package, #4971.
