[rllib] Envs for vectorized execution, async execution, and policy serving #2170

Merged: ericl merged 85 commits into ray-project:master from v2-vectorization on Jun 18, 2018

ericl (Contributor) commented May 31, 2018

What do these changes do?

Vectorized envs: Users can either implement VectorEnv directly or set num_envs=N to auto-vectorize gym envs. Auto-vectorization batches only the action computation across the N envs; env stepping itself is not vectorized.

# CartPole-v0 on single core with 64x64 MLP:

# vector_width=1:
Actions per second 2720.1284458322966

# vector_width=8:
Actions per second 13773.035334888269

# vector_width=64:
Actions per second 37903.20472563333
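
For a quick read of the numbers above, the relative speedups can be worked out directly from the reported throughput. This is only arithmetic on the three figures quoted in the benchmark; the variable names are illustrative.

```python
# Throughput figures copied from the benchmark above (actions per second).
actions_per_second = {
    1: 2720.1284458322966,
    8: 13773.035334888269,
    64: 37903.20472563333,
}

# Speedup of each vector width relative to vector_width=1.
baseline = actions_per_second[1]
speedups = {width: rate / baseline for width, rate in actions_per_second.items()}

for width, speedup in speedups.items():
    print(f"vector_width={width}: {speedup:.1f}x")
# vector_width=1: 1.0x
# vector_width=8: 5.1x
# vector_width=64: 13.9x
```

The gain is sublinear in the vector width: roughly 5.1x at width 8 and 13.9x at width 64.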

Async envs: AsyncVectorEnv is the more general form of VectorEnv; it allows agents to execute out of lockstep. We use it as the adapter that supports ServingEnv. Because any other form of env can be converted to AsyncVectorEnv, utils.sampler has been rewritten to run against this interface.
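
To illustrate what "out of lockstep" means for a sampler, here is a minimal, self-contained sketch. It is not the code in this PR: ToyAsyncVectorEnv, the poll()/send_actions() method names, and the sample() helper are all assumptions made for illustration. Sub-envs finish their steps at random times, and the sampler only computes actions for whichever envs reported an observation in the latest poll.

```python
import random


class ToyAsyncVectorEnv:
    """Hypothetical async vector env: sub-envs become ready at their own pace."""

    def __init__(self, num_envs, episode_len=3, seed=0):
        self.rng = random.Random(seed)
        self.episode_len = episode_len
        self.steps = {i: 0 for i in range(num_envs)}
        self.ready = set(self.steps)  # envs with an unpolled observation
        self.pending = set()          # envs still working on their last action

    def poll(self):
        """Return {env_id: (obs, reward, done)} for the envs that are ready now."""
        # Each pending env finishes its step with probability 0.5 per poll.
        finished = {i for i in self.pending if self.rng.random() < 0.5}
        self.pending -= finished
        self.ready |= finished
        results = {}
        for i in sorted(self.ready):
            obs = self.steps[i]
            reward = 1.0 if obs > 0 else 0.0
            results[i] = (obs, reward, obs >= self.episode_len)
        self.ready.clear()
        return results

    def send_actions(self, actions):
        """Apply actions for the subset of envs that was just polled."""
        for env_id in actions:
            self.steps[env_id] += 1
            self.pending.add(env_id)


def sample(env, policy, num_envs):
    """Sampler loop written against poll()/send_actions() only."""
    steps_taken, batches, finished = {}, [], set()
    while len(finished) < num_envs:
        results = env.poll()
        batches.append(sorted(results))  # which envs reported this round
        actions = {}
        for env_id, (obs, _reward, done) in results.items():
            if done:
                finished.add(env_id)
            else:
                actions[env_id] = policy(obs)
                steps_taken[env_id] = steps_taken.get(env_id, 0) + 1
        env.send_actions(actions)
    return steps_taken, batches


steps, batches = sample(ToyAsyncVectorEnv(num_envs=4), lambda obs: 0, num_envs=4)
print(steps)    # every env takes episode_len actions
print(batches)  # batch membership varies from poll to poll
```

Because the sampler never assumes a fixed batch, a lockstep vector env is just the special case where every env is ready on every poll.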

Policy serving: ServingEnv provides an env that is not stepped. Instead, the env executes in its own thread, querying the policy for actions via self.get_action(obs) and reporting results via self.log_returns(rewards). Off-policy actions can be logged via self.log_action(obs, action). This is a more convenient API for some use cases, and it also provides parallelizable support for policy serving (for example, if you start an HTTP server in the env) and for ingesting offline logs (if the env reads from serving logs).
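
The inverted control flow can be sketched with a thread and two queues. This is a toy stand-in, not the ServingEnv class from this PR: only the method names get_action, log_action, and log_returns come from the description above, while ToyServingEnv, serve(), the queue plumbing, and the sentinel are assumptions made for illustration.

```python
import queue
import threading

_EPISODE_DONE = object()  # sentinel telling the policy side to stop


class ToyServingEnv:
    """Hypothetical serving-style env: it drives its own loop and asks for actions."""

    def __init__(self, num_steps=3):
        self.num_steps = num_steps
        self.obs_queue = queue.Queue()     # env thread -> policy side
        self.action_queue = queue.Queue()  # policy side -> env thread
        self.logged = []                   # (kind, payload) records

    def get_action(self, obs):
        """Ask the policy for an action and block until it answers."""
        self.obs_queue.put(obs)
        return self.action_queue.get()

    def log_action(self, obs, action):
        """Record an action that was chosen without consulting the policy."""
        self.logged.append(("off_policy_action", (obs, action)))

    def log_returns(self, reward):
        self.logged.append(("reward", reward))

    def run(self):
        """Body of the env thread; nobody calls step() on this env."""
        for obs in range(self.num_steps):
            if obs == 0:
                action = 1  # e.g. an action replayed from a log
                self.log_action(obs, action)
            else:
                action = self.get_action(obs)
            self.log_returns(float(action))
        self.obs_queue.put(_EPISODE_DONE)


def serve(env, policy):
    """Policy side: answer action queries until the env thread finishes."""
    thread = threading.Thread(target=env.run, daemon=True)
    thread.start()
    while True:
        obs = env.obs_queue.get()
        if obs is _EPISODE_DONE:
            break
        env.action_queue.put(policy(obs))
    thread.join()
    return env.logged


print(serve(ToyServingEnv(num_steps=3), lambda obs: obs * 2))
```

In a real deployment the body of run() would be whatever produces observations, such as an HTTP request handler or a reader over serving logs.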

Any of these types of envs can be passed to RLlib agents. RLlib handles conversions internally in CommonPolicyEvaluator, for example:

       gym.Env => rllib.VectorEnv => rllib.AsyncVectorEnv
       rllib.ServingEnv => rllib.AsyncVectorEnv
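
The first conversion chain can be pictured as two thin adapters. The sketch below is an assumption-laden illustration, not the CommonPolicyEvaluator code: ToyEnv stands in for a gym.Env, and ToyVectorEnv, VectorToAsyncAdapter, batched_policy, and all method names other than reset/step are hypothetical. It also shows where auto-vectorization helps: the policy is called once per batch, while each env is still stepped individually.

```python
class ToyEnv:
    """Gym-style env with reset()/step(); a stand-in, not a real gym.Env."""

    def __init__(self, horizon=2):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, float(action), self.t >= self.horizon, {}


class ToyVectorEnv:
    """Wraps N gym-style envs and steps them in lockstep."""

    def __init__(self, envs):
        self.envs = envs

    def vector_reset(self):
        return [env.reset() for env in self.envs]

    def vector_step(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obs, rewards, dones, infos = (list(col) for col in zip(*results))
        return obs, rewards, dones, infos


class VectorToAsyncAdapter:
    """Exposes a lockstep vector env through poll()/send_actions()."""

    def __init__(self, vector_env):
        self.vector_env = vector_env
        self.num_envs = len(vector_env.envs)
        self.obs = vector_env.vector_reset()
        self.rewards = [0.0] * self.num_envs
        self.dones = [False] * self.num_envs

    def poll(self):
        # In lockstep mode every env is ready on every poll.
        ids = range(self.num_envs)
        return (
            {i: self.obs[i] for i in ids},
            {i: self.rewards[i] for i in ids},
            {i: self.dones[i] for i in ids},
        )

    def send_actions(self, action_dict):
        actions = [action_dict[i] for i in range(self.num_envs)]
        self.obs, self.rewards, self.dones, _ = self.vector_env.vector_step(actions)


def batched_policy(obs_batch):
    """One call computes actions for all envs (the vectorized part)."""
    return [obs % 2 for obs in obs_batch]


# gym-style env => vector env => async vector env
async_env = VectorToAsyncAdapter(ToyVectorEnv([ToyEnv() for _ in range(3)]))
obs, rewards, dones = async_env.poll()
ids = sorted(obs)
async_env.send_actions(dict(zip(ids, batched_policy([obs[i] for i in ids]))))
print(async_env.poll())
```

A serving-style env would plug into the same poll()/send_actions() surface through its own adapter, which is not shown here.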

TODO:

Related issue number

#2053

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6060/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6067/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6068/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6076/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6077/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6079/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6080/

ericl (Contributor, Author) commented Jun 16, 2018

jenkins retest this please

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6082/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6085/

ericl (Contributor, Author) commented Jun 17, 2018

jenkins retest this please

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6088/

ericl force-pushed the v2-vectorization branch from 06b568a to e191e82 on June 17, 2018 23:54

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6089/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6092/

ericl merged commit 7dee2c6 into ray-project:master on Jun 18, 2018
royf added a commit to royf/ray that referenced this pull request Jun 22, 2018
* 'master' of https://github.com/ray-project/ray: (157 commits)
  Fix build failure while using make -j1. Issue 2257 (ray-project#2279)
  Cast locator with index type (ray-project#2274)
  fixing zero length partitions (ray-project#2237)
  Make actor handles work in Python mode. (ray-project#2283)
  [xray] Add error table and push error messages to driver through node manager. (ray-project#2256)
  addressing comments (ray-project#2210)
  Re-enable some actor tests. (ray-project#2276)
  Experimental: enable automatic GCS flushing with configurable policy. (ray-project#2266)
  [xray] Sets good object manager defaults. (ray-project#2255)
  [tune] Update Trainable doc to expose interface (ray-project#2272)
  [rllib] Add a simple REST policy server and client example (ray-project#2232)
  [asv] Pushing to s3 (ray-project#2246)
  [rllib] Remove need to pass around registry (ray-project#2250)
  Support multiple availability zones in AWS (fix ray-project#2177) (ray-project#2254)
  [rllib] Add squash_to_range model option (ray-project#2239)
  Mitigate randomly building failure: adding gen_local_scheduler_fbs to raylet lib. (ray-project#2271)
  [rllib] Refactor Multi-GPU for PPO (ray-project#1646)
  [rllib] Envs for vectorized execution, async execution, and policy serving (ray-project#2170)
  [Dataframe] Change pandas and ray.dataframe imports (ray-project#1942)
  [Java] Replace binary rewrite with Remote Lambda Cache (SerdeLambda) (ray-project#2245)
  ...
ericl mentioned this pull request on Jul 7, 2018
robertnishihara deleted the v2-vectorization branch on July 7, 2018 07:16
3 participants