Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update run.py in machine_learning/forecasting #8957

Merged
merged 4 commits into from
Aug 14, 2023

Conversation

p1utoze
Copy link
Contributor

@p1utoze p1utoze commented Aug 14, 2023

…tion

Describe your change:

Fixes [#7305]

I have modified the run.py and it runs without any errors locally.

  • Removed headers argument from read_csv as it caused errors in Normalization.
  • Renamed the train and test column variables as it wasn't easily comprehensible
  • Added a Type check condition in case of passing wrong type as arguments in data_safety_checker function.
  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-NUMBER".

@tianyizheng02
Copy link
Contributor

Your PR doesn't actually fix #7305 because the Pytest warnings are still present in the build logs:

machine_learning/forecasting/run.py::machine_learning.forecasting.run.sarimax_predictor
  /opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for ARMA and trend. All parameters except for variances will be set to zeros.
    warn('Too few observations to estimate starting parameters%s.'

machine_learning/forecasting/run.py::machine_learning.forecasting.run.sarimax_predictor
  /opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/site-packages/statsmodels/tsa/statespace/sarimax.py:866: UserWarning: Too few observations to estimate starting parameters for seasonal ARMA. All parameters except for variances will be set to zeros.
    warn('Too few observations to estimate starting parameters%s.'

machine_learning/forecasting/run.py::machine_learning.forecasting.run.sarimax_predictor
  /opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/site-packages/numpy/core/fromnumeric.py:3747: RuntimeWarning: Degrees of freedom <= 0 for slice
    return _methods._var(a, axis=axis, dtype=dtype, out=out, ddof=ddof,

machine_learning/forecasting/run.py::machine_learning.forecasting.run.sarimax_predictor
  /opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/site-packages/numpy/core/_methods.py:226: RuntimeWarning: invalid value encountered in divide
    arrmean = um.true_divide(arrmean, div, out=arrmean,

machine_learning/forecasting/run.py::machine_learning.forecasting.run.sarimax_predictor
  /opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/site-packages/numpy/core/_methods.py:261: RuntimeWarning: invalid value encountered in scalar divide
    ret = ret.dtype.type(ret / rcount)

However, your changes are still good, so we can merge this PR anyway after some minor tweaks

Copy link
Contributor

@tianyizheng02 tianyizheng02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fixes, just need to clean up the old code and it should be good

Comment on lines 122 to 125
# data_input = [[18231, 0.0, 1], [22621, 1.0, 2], [15675, 0.0, 3], [23583, 1.0, 4]]
# data_input_df = pd.DataFrame(
# data_input, columns=["total_user", "total_even", "days"]
# )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# data_input = [[18231, 0.0, 1], [22621, 1.0, 2], [15675, 0.0, 3], [23583, 1.0, 4]]
# data_input_df = pd.DataFrame(
# data_input, columns=["total_user", "total_even", "days"]
# )

Let's just delete the old code rather than commenting it out

@algorithms-keeper algorithms-keeper bot added the awaiting reviews This PR is ready to be reviewed label Aug 14, 2023
@p1utoze
Copy link
Contributor Author

p1utoze commented Aug 14, 2023

Sure, I'll update the code formatting.
Regarding the Pytest Warnings, I looked into the source of the error and it might be due to ARIMA deprecation.
Two approaches from mine would be to either transform the data passed to SARIMAX model for estimation or supress the warnings should be well enough.

Copy link
Contributor

@tianyizheng02 tianyizheng02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tianyizheng02 tianyizheng02 merged commit c290dd6 into TheAlgorithms:master Aug 14, 2023
@algorithms-keeper algorithms-keeper bot removed the awaiting reviews This PR is ready to be reviewed label Aug 14, 2023
@tianyizheng02
Copy link
Contributor

tianyizheng02 commented Aug 14, 2023

Sure, I'll update the code formatting. Regarding the Pytest Warnings, I looked into the source of the error and it might be due to ARIMA deprecation. Two approaches from mine would be to either transform the data passed to SARIMAX model for estimation or supress the warnings should be well enough.

@p1utoze At first glance, it looks to me like some of the errors are caused by the small number of observations (which makes sense, the ARIMA models probably need way more observations to work well). Maybe reworking this file to use a different CSV file (one with much more data) might fix some if not all of the warnings.

That's all work for a future PR, though.

@p1utoze
Copy link
Contributor Author

p1utoze commented Aug 14, 2023

Hey! I finally fixed the warnings. The numpy Runtime warning was due to the 0 in seasonal_order tuple and regarding the ARIMA models. Yes you are right, it needs more observations to work well and the User Warning are raised because of the sample test >>> sarimax_predictor([4,2,6,8], [3,1,2,4], [2]) used by doctests. Supressing it should be fine enough. Shall I create another PR then?

@tianyizheng02
Copy link
Contributor

Yeah, that'd be great, thanks!

@p1utoze p1utoze deleted the branch branch August 14, 2023 07:36
sedatguzelsemme pushed a commit to sedatguzelsemme/Python that referenced this pull request Sep 15, 2024
* Fixed reading CSV file, added type check for data_safety_checker function

* Formatted run.py

* updating DIRECTORY.md

---------

Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com>
@isidroas isidroas mentioned this pull request Jan 25, 2025
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants