Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tune] HyperOpt Support (v2) #1763

Merged
merged 41 commits into from
Apr 4, 2018
Merged

Conversation

richardliaw
Copy link
Contributor

@richardliaw richardliaw commented Mar 21, 2018

  • Tests:
    • Fix jenkins by creating local docker image of jenkins env
  • Clarify comment for verbose
  • Docs
  • Document necessity for this pip install --upgrade git+git://github.com/hyperopt/hyperopt.git

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4459/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4469/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4564/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4567/
Test PASSed.

@richardliaw richardliaw changed the title [tune][wip] HyperOpt Support (v2) [tune] HyperOpt Support (v2) Mar 31, 2018
@richardliaw
Copy link
Contributor Author

screenshot 2018-03-30 22 21 36

@richardliaw richardliaw requested a review from ericl March 31, 2018 06:05
@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4570/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4571/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4572/
Test PASSed.

Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, a bunch of clarification comments.

self.spec = spec


class JSONExperiment(Experiment):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be a function that returns an Experiment object instead, e.g.,

Experiment.from_json(name, spec)

try:
import hyperopt as hpo
except Exception as e:
print("HyperOpt not found; perhaps not installed?")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to make this a soft dep, set hpo = None, and check this in the HyperOptScheduler constructor

from ray.tune.variant_generator import to_argv


class HyperOptScheduler(FIFOScheduler):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class doc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

class HyperOptScheduler(FIFOScheduler):

def __init__(self, max_concurrent=10, loss_attr="mean_loss"):
self._max_concurrent = max_concurrent # NOTE: this is modified later
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we always launch as many trials at once as resources permits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily; the more concurrent trials the less effective your search. For example, SigOpt only allows for 10 concurrent trials.

Now, for this particular case it could make sense to default to as many as resource permits.

self._loss_attr = loss_attr
self._experiment = None

def track_experiment(self, experiment, trial_runner):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_experiment()

try:
trial_runner.add_trial(next(generator))
except StopIteration:
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this into TrialScheduler?

Copy link
Contributor Author

@richardliaw richardliaw Mar 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no strong opinion about this; is there a benefit to keeping TrialScheduler as an interface as opposed to turning it into an abstract class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem particularly specific to fifo scheduler. I imagine most schedulers will want to extend this instead of fifoscheduler.

Speaking of which should HyperOpt extend the scheduler interface instead of fifoscheduler?

Copy link
Contributor Author

@richardliaw richardliaw Apr 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this method.

I think wrt extending HyperOpt it's fine to extend fifo, as it does call FIFOScheduler methods for on_trial_add and also for choose_trial_to_run.

That being said, the choose_trial_to_run invocation is done as a weird class method invocation rather than a super invocation and could work either way, so I'm not married to anything...

@@ -47,11 +48,18 @@ def on_trial_remove(self, trial_runner, trial):

raise NotImplementedError

def choose_trial_to_run(self, trial_runner, trials):
def track_experiment(self, experiment, trial_runner):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_experiment

@@ -47,11 +48,18 @@ def on_trial_remove(self, trial_runner, trial):

raise NotImplementedError

def choose_trial_to_run(self, trial_runner, trials):
def track_experiment(self, experiment, trial_runner):
"""Tracks experiment from which trials and generated and queued."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"""Adds an experiment to the scheduler. The scheduler is responsible for adding the trials of the experiment to the runner, which can be done immediately (if there are a finite set of trials), or over time (if there is an infinite stream of trials)."""

"""Called to choose a new trial to run.

This should return one of the trials in trial_runner that is in
the PENDING or PAUSED state."""
the PENDING or PAUSED state. This is assumed to be idempotent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function must be idempotent.

try:
_SCHEDULERS[key] = get_scheduler(key)
except ImportError as e:
print("Could not import scheduler: {}".format(key))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just for HyperOpt, let's just have that fail in the constructor instead of on import.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4573/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4581/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4583/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4619/
Test FAILed.

@richardliaw
Copy link
Contributor Author

retest this please

1 similar comment
@richardliaw
Copy link
Contributor Author

retest this please

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4625/
Test FAILed.

@richardliaw
Copy link
Contributor Author

retest this please

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4626/
Test FAILed.

@richardliaw
Copy link
Contributor Author

retest this please

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4634/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4639/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4643/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4649/
Test PASSed.

@richardliaw
Copy link
Contributor Author

richardliaw commented Apr 4, 2018

travis completely (!) passes on private branch.

This also subsumes #1823, so waiting on approval there.

@robertnishihara
Copy link
Collaborator

What! This is unheard of :)

@richardliaw richardliaw merged commit 888e70f into ray-project:master Apr 4, 2018
@richardliaw richardliaw deleted the hyperopt2 branch April 4, 2018 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants