Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] wrapping cost function to return best value explored #33

Merged

Conversation

VolodyaCO
Copy link
Contributor

@VolodyaCO VolodyaCO commented Jan 2, 2023

Description

Closes #29

Please verify that you have completed the following steps

  • I have self-reviewed my code.
  • I have included test cases validating introduced feature/fix.
  • I have updated documentation.

@VolodyaCO VolodyaCO added bug Something isn't working enhancement New feature or request labels Jan 2, 2023
@VolodyaCO VolodyaCO self-assigned this Jan 2, 2023
@VolodyaCO VolodyaCO marked this pull request as ready for review January 3, 2023 12:46
@VolodyaCO
Copy link
Contributor Author

It looks like mypy is failing due to this: python/mypy#3060. Not sure what to do 🤷‍♂️

@codecov-commenter
Copy link

codecov-commenter commented Jan 3, 2023

Codecov Report

Patch coverage: 90.16% and project coverage change: -0.78 ⚠️

Comparison is base (76e96c9) 94.95% compared to head (6e6a611) 94.17%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #33      +/-   ##
==========================================
- Coverage   94.95%   94.17%   -0.78%     
==========================================
  Files          28       29       +1     
  Lines        1011     1185     +174     
==========================================
+ Hits          960     1116     +156     
- Misses         51       69      +18     
Impacted Files Coverage Δ
...rquestra/opt/optimizers/simple_gradient_descent.py 78.43% <62.06%> (-21.57%) ⬇️
src/orquestra/opt/optimizers/scipy_optimizer.py 93.33% <90.56%> (-6.67%) ⬇️
...orquestra/opt/optimizers/tensor_train_optimizer.py 97.97% <97.97%> (ø)
...tra/opt/optimizers/pso/continuous_pso_optimizer.py 83.87% <100.00%> (-0.34%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@VolodyaCO VolodyaCO requested review from AthenaCaesura and removed request for mstechly January 13, 2023 14:18
@AthenaCaesura
Copy link
Contributor

AthenaCaesura commented Jan 13, 2023

Can we add an option for the user to go back to the old way of using the scipy optimizer? People might get confused if we don't 😅

src/orquestra/opt/optimizers/scipy_optimizer.py Outdated Show resolved Hide resolved
Comment on lines 20 to 29
class _CostFunctionWithBestValueType(type):
def __instancecheck__(cls, obj: object) -> bool:
if (
isinstance(obj, Callable)
and hasattr(obj, "best_value")
and hasattr(obj, "best_params")
):
return True
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice little piece of code! But I'm afraid it might be needlessly opaque since isinstance is only used once here and just for an assert statement. metaclasses are considered "heavy machinery" so it's best not to use them when they could be replaced with a simple if statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove this to check the new instance check, I get some linting problems 😢

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but since you only use this isinstance once, it's not really worth changing the instancecheck method right?

Copy link
Contributor Author

@VolodyaCO VolodyaCO May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to do this trick for the tensor train optimiser PR too #31. Do you want me to instead have a custom isinstanceCostFunctionWithBestValue function, or something like that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's probably fine. Not worth holding up the PR for this.

src/orquestra/opt/optimizers/scipy_optimizer.py Outdated Show resolved Hide resolved
src/orquestra/opt/optimizers/scipy_optimizer.py Outdated Show resolved Hide resolved
@VolodyaCO
Copy link
Contributor Author

I'm using this in a GEO workflow but it seems as though it isn't working as expected. Here are the logs from the workflow:

{"timestamp": "2023-01-25 08:32:15,166", "level": "INFO", "filename": "history_selector.py:102", "message": Number of pre-selected candidates 398. The radius is 0.0390625}
{"timestamp": "2023-01-25 08:32:15,298", "level": "INFO", "filename": "base.py:769", "message": Probs sum is: 1.0000000000000022}
{"timestamp": "2023-01-25 08:32:15,298", "level": "INFO", "filename": "base.py:770", "message": xtrain size is: 7924}
{"timestamp": "2023-01-25 08:32:15,298", "level": "INFO", "filename": "base.py:771", "message": Number of observations are: 7924}
{"timestamp": "2023-01-25 08:32:15,298", "level": "INFO", "filename": "base.py:772", "message": Minimum cost is: -8.103484550314727}
Composition labels in HDBSCAN clustering:
Counter({-1: 4496, 4: 1412, 8: 838, 7: 283, 1: 197, 5: 194, 6: 161, 0: 121, 3: 113, 2: 109})
Cleared cache for LangerInCorners(m=4) in 2 dimensions and seed 0
{"timestamp": "2023-01-25 08:32:16,405", "level": "INFO", "filename": "geo_workflows.py:315", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Finished in 0.024 seconds"}}
{"timestamp": "2023-01-25 08:32:16,405", "level": "INFO", "filename": "geo_workflows.py:316", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Optimum value: -7.866315123621101"}}
{"timestamp": "2023-01-25 08:32:16,433", "level": "INFO", "filename": "local_workflows.py:62", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :::: Cleaned history from length 30 to length 1. dl = 0.1810866855724538"}}
Cleared cache for LangerInCorners(m=4) in 2 dimensions and seed 0
{"timestamp": "2023-01-25 08:32:17,202", "level": "INFO", "filename": "geo_workflows.py:315", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Finished in 0.025 seconds"}}
{"timestamp": "2023-01-25 08:32:17,203", "level": "INFO", "filename": "geo_workflows.py:316", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Optimum value: -7.8663151681677315"}}
{"timestamp": "2023-01-25 08:32:17,227", "level": "INFO", "filename": "local_workflows.py:62", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :::: Cleaned history from length 67 to length 2. dl = 0.17779190436107137"}}
Cleared cache for LangerInCorners(m=4) in 2 dimensions and seed 0
{"timestamp": "2023-01-25 08:32:17,966", "level": "INFO", "filename": "geo_workflows.py:315", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Finished in 0.024 seconds"}}
{"timestamp": "2023-01-25 08:32:17,966", "level": "INFO", "filename": "geo_workflows.py:316", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Optimum value: -7.866315173534457"}}
{"timestamp": "2023-01-25 08:32:17,992", "level": "INFO", "filename": "local_workflows.py:62", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :::: Cleaned history from length 67 to length 1. dl = 0.17560887003556275"}}
Cleared cache for LangerInCorners(m=4) in 2 dimensions and seed 0
{"timestamp": "2023-01-25 08:32:18,650", "level": "INFO", "filename": "geo_workflows.py:315", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Finished in 0.024 seconds"}}
{"timestamp": "2023-01-25 08:32:18,651", "level": "INFO", "filename": "geo_workflows.py:316", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Optimum value: -7.856305814156559"}}
{"timestamp": "2023-01-25 08:32:18,680", "level": "INFO", "filename": "local_workflows.py:62", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :::: Cleaned history from length 62 to length 2. dl = 0.17537807662467564"}}
Cleared cache for LangerInCorners(m=4) in 2 dimensions and seed 0
{"timestamp": "2023-01-25 08:32:19,406", "level": "INFO", "filename": "geo_workflows.py:315", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Finished in 0.024 seconds"}}
{"timestamp": "2023-01-25 08:32:19,407", "level": "INFO", "filename": "geo_workflows.py:316", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :: Optimum value: -7.856305810418254"}}
{"timestamp": "2023-01-25 08:32:19,431", "level": "INFO", "filename": "local_workflows.py:62", "message": {"run_id": "wf.local_run.0000000@tsk.task_local-0.00000", "logs": "     :::: Cleaned history from length 60 to length 2. dl = 0.17537775444008236"}}

From GEO optimisation we get a minimum of -8.something. The point corresponding to this value in the cost function is set as the initial parameter of one of the local optimisation runs, which return as best value -7.8something. This is not the expected behaviour.

@AthenaCaesura
Copy link
Contributor

AthenaCaesura commented Mar 15, 2023

Let me know if you need any help with debugging and GEO to get the behavior you're expecting! Although I'm not sure how much help I'll be!

@VolodyaCO VolodyaCO force-pushed the fix/#29/wrap-cost-function-to-return-best-val-scipy-opt branch from 3c84347 to 9818a7a Compare May 2, 2023 12:51
@VolodyaCO VolodyaCO merged commit 46cd653 into main Jun 2, 2023
@VolodyaCO VolodyaCO deleted the fix/#29/wrap-cost-function-to-return-best-val-scipy-opt branch June 2, 2023 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ScipyOptimizer must keep the lowest value of the function due to scipy misbehaviour
3 participants