[Autotuner] Feature: add --cpu_budget and --timeout_per_trial
#2395
base: master
--cpu_budget and --timeout_per_trial
The various ORFS stages have vastly different memory and CPU needs. How does the user characterize and balance this?
These knobs are intended to limit experiment runtime (and consequently the $$ budget). They do not limit how many resources a given ORFS run has access to. The CPU budget is intended to stop the experiment once the budget is spent.
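As a rough illustration of the semantics described above (stop the experiment once the budget is spent, without throttling individual runs), here is a minimal sketch. The names `CpuBudget` and `run_experiment` are hypothetical and not taken from the AutoTuner code, and charging CPU time as runtime multiplied by reserved CPUs, in seconds, is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CpuBudget:
    """Hypothetical tracker: accumulates CPU time charged by finished trials."""

    budget_seconds: float
    spent_seconds: float = 0.0

    def charge(self, trial_runtime_seconds: float, cpus_per_trial: int = 1) -> None:
        # Assumption: CPU time is approximated as runtime times CPUs reserved.
        self.spent_seconds += trial_runtime_seconds * cpus_per_trial

    def exhausted(self) -> bool:
        return self.spent_seconds >= self.budget_seconds


def run_experiment(trial_runtimes, budget: CpuBudget, cpus_per_trial: int = 1) -> int:
    """Charge trials in order until the budget is spent; return how many ran."""
    completed = 0
    for runtime in trial_runtimes:
        if budget.exhausted():
            # Stop scheduling new trials. Trials that already ran are not
            # throttled, so the total spent can overshoot the budget.
            break
        budget.charge(runtime, cpus_per_trial)
        completed += 1
    return completed
```

In this sketch the budget only decides whether another trial starts, so the last trial may push the total past the budget. That matches the stated intent that the knob bounds the experiment rather than the resources of a single ORFS run.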
vvbandeira
left a comment
Just some early feedback.
@vvbandeira this PR is almost ready; it just requires the fix from #2394 to be applied.
Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: luarss <jluar@precisioninno.com>
… cpubudget prompt from hrs->seconds
Signed-off-by: Jack Luar <jluar@precisioninno.com>
- subprocess bug: check must be False to capture a nonzero retcode.
- METRIC key is not inside best_result if the timeout is triggered before a trial completes (true for the smoke test).
Signed-off-by: Jack Luar <jluar@precisioninno.com>
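The two fixes in this commit message can be sketched as follows. This is an illustration under assumptions, not the code from the PR: `run_trial` and `best_metric` are hypothetical helpers, and `best_result` is assumed to behave like a plain dict. The only library behavior relied on is that `subprocess.run` with `check=True` raises `CalledProcessError` on a nonzero exit status, while `check=False` returns normally and exposes the status as `returncode`.

```python
import subprocess
import sys


def run_trial(cmd):
    """Run a command and return its exit code without raising on failure."""
    # check=False: a nonzero exit status is reported through returncode
    # instead of raising subprocess.CalledProcessError.
    proc = subprocess.run(cmd, check=False, capture_output=True, text=True)
    return proc.returncode


def best_metric(best_result, key="METRIC"):
    """Return the metric if present, else None (e.g. no trial finished)."""
    # Assumption: best_result is dict-like. Using .get avoids a KeyError
    # when a timeout fires before any trial has reported the metric.
    return best_result.get(key)


if __name__ == "__main__":
    # A child process that exits with status 3 is captured, not raised.
    print(run_trial([sys.executable, "-c", "import sys; sys.exit(3)"]))
    print(best_metric({}))
```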
- use a more reasonable number of samples to avoid overloading the Jenkins workers.
- fix expected_timeout
Signed-off-by: Jack Luar <jluar@precisioninno.com>
Rationale
timeout_per_trial is different from the overall timeout.
TODO
--cpu_budget -> verify the timeout is hit.
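To make the distinction in the rationale concrete, here is a minimal sketch of a per-trial limit next to an overall limit. It is an assumption-laden illustration and not the AutoTuner implementation: `run_trials` is a hypothetical helper, both limits are taken to be in seconds, and trials run sequentially as plain subprocesses.

```python
import subprocess
import sys
import time


def run_trials(commands, timeout_per_trial, overall_timeout):
    """Return one status per command: 'ok', 'trial_timeout', or 'skipped'."""
    # The overall limit is a single deadline shared by every trial.
    deadline = time.monotonic() + overall_timeout
    statuses = []
    for cmd in commands:
        if time.monotonic() >= deadline:
            statuses.append("skipped")  # overall limit reached
            continue
        try:
            # The per-trial limit applies to this one command only.
            subprocess.run(cmd, check=False, timeout=timeout_per_trial)
            statuses.append("ok")
        except subprocess.TimeoutExpired:
            statuses.append("trial_timeout")
    return statuses


if __name__ == "__main__":
    fast = [sys.executable, "-c", "pass"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_trials([fast, slow, fast], timeout_per_trial=2, overall_timeout=60))
```

A trial that exceeds its own limit is cut off while later trials still run. Once the overall limit is reached, no further trials start.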