[BUG] Linear regression model predicts NaN values only #3210
Description
What is the bug?
I trained a linear regression model with 5000 features. When I call the _predict API, only NaN values are returned.
I cannot rule out that my parameters are not ideal and that this is what causes the NaN predictions. I tried smaller learning rates without success, but I did not experiment with all available parameters and parameter values.
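To illustrate how unsuitable parameters alone could produce NaN, here is a minimal sketch of plain gradient descent on a tiny made-up dataset. It is not the ML Commons implementation and not a confirmed diagnosis of this report. The function name `fit_gradient_descent`, the toy data, and the learning rate are all assumptions chosen for illustration.

```python
import math


def fit_gradient_descent(xs, ys, learning_rate, epochs=2000):
    """Fit y = w * x + b with batch gradient descent on mean squared error.

    Hypothetical helper for illustration only; unrelated to the actual
    optimizer used by OpenSearch ML Commons.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2.0 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up toy data: targets lie between 0 and 1, features are unscaled.
raw_xs = [100.0, 200.0, 300.0, 400.0]
ys = [0.1, 0.2, 0.3, 0.4]

# With unscaled features the updates overshoot, overflow to infinity,
# and then turn into NaN.
w_raw, b_raw = fit_gradient_descent(raw_xs, ys, learning_rate=0.1)

# Same learning rate, but features scaled into [0, 1] by the maximum value.
scale = max(raw_xs)
scaled_xs = [x / scale for x in raw_xs]
w_scaled, b_scaled = fit_gradient_descent(scaled_xs, ys, learning_rate=0.1)

print("unscaled:", w_raw, b_raw)
print("scaled:  ", w_scaled, b_scaled)
```

In this toy setup the weights diverge on the unscaled features and stay finite on the scaled ones. Whether feature scale or another parameter is the cause in my case is still open.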
How can one reproduce the bug?
Steps to reproduce the behavior:
- Get the features at https://gist.github.com/wrigleyDan/a83a5d8294aa0ed493e4feb8cc9d7433
- Get the notebook to see how I ingest the data, train a model, predict a value: https://gist.github.com/wrigleyDan/16deb9cd8201ec502acda036c0b150b5
- Run the notebook with the feature data
- See NaN as the predicted value
What is the expected behavior?
The model should return reasonable predictions rather than only NaN values. In the given example, the predictions should be between 0 and 1.
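The expectation above could be checked with a small helper like the following sketch. It assumes the predicted values have already been extracted from the response into a plain list of floats. The name `summarize_predictions` and the sample values are hypothetical and not taken from the notebook.

```python
import math


def summarize_predictions(predictions, low=0.0, high=1.0):
    """Count NaN, in-range, and out-of-range values in a list of floats.

    Hypothetical helper; the [low, high] default reflects the range
    expected for the example in this report.
    """
    nan_count = sum(1 for p in predictions if math.isnan(p))
    in_range = sum(
        1 for p in predictions if not math.isnan(p) and low <= p <= high
    )
    return {
        "total": len(predictions),
        "nan": nan_count,
        "in_range": in_range,
        "out_of_range": len(predictions) - nan_count - in_range,
    }


# Made-up sample values, only to show the shape of the result.
sample = [float("nan"), 0.5, 1.2]
print(summarize_predictions(sample))
```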
What is your host/environment?
- OpenSearch v2.16.0
Do you have any screenshots?
See the linked Gist with a notebook example and the data used as features.
Do you have any additional context?
Initially reported in the #ml OpenSearch Slack channel: https://opensearch.slack.com/archives/C05BGJ1N264/p1731077205560749
Status: Done