Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tune] Fix performance issue and fix reuse tests #4379

Merged
merged 8 commits into from
Mar 16, 2019

Conversation

richardliaw
Copy link
Contributor

@richardliaw richardliaw commented Mar 15, 2019

Pytest I think uses some multi-threaded importing, so
classes often fail to import, hence the hanging on Jenkins. This gets around that
but using a constructor.

Performance issue reported earlier is also mitigated by only updating resources
once per TrialRunner step (Fixes #4366).

This also avoids a recurring warning.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/92/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12889/
Test FAILed.

@richardliaw richardliaw requested a review from ericl March 15, 2019 23:44
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/99/
Test PASSed.

self.config = config
self.num_resets = 0
self.iter = 0
def create_resettable_class():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed because pytest does not import classes correctly

self._resources_initialized = True

def has_resources(self, resources):
"""Returns whether this runner has at least the specified resources."""
self._update_avail_resources()
if time.time() - self._last_resource_refresh > self._refresh_period:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that if my trials run really fast (say 10ms) I won't be able to schedule more than 2 a second?

Why not refresh after each event processed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, IIUC this is just for the "cluster resources", which are not expected to change dynamically often. If that's right then it seems ok, but I would document this.

return ("Unknown memory usage. Please run `pip install psutil` "
global _MEMORY_USAGE_WARNED
if not _MEMORY_USAGE_WARNED:
_MEMORY_USAGE_WARNED = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider reverting this change. The user can easily miss this warning otherwise since it's relatively unobtrusive.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12909/
Test PASSed.

@richardliaw richardliaw requested a review from ericl March 16, 2019 05:55
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/105/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12916/
Test FAILed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[tune] Performance collapses with thousands of queued jobs
3 participants