[RLlib] Checkpoint and restore connectors. #26253

gjoliver · 2022-07-01T07:08:08Z

Also includes a couple of examples showing how to use connector-enabled policies.

Why are these changes needed?

Allow checkpoint and restore of connector pipelines.
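
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of what checkpointing and restoring a connector pipeline can look like. It is not the RLlib implementation: `Scale`, `ClipReward`, `Pipeline`, the `_REGISTRY` dict, and the `to_config` / `from_config` names are all hypothetical and chosen only for illustration. The assumption is that each connector can describe itself as a `(name, params)` pair and that a name-to-class registry is enough to rebuild it.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical registry mapping a connector name to its class.
_REGISTRY: Dict[str, type] = {}


def register(cls):
    """Class decorator that records a connector class under its own name."""
    _REGISTRY[cls.__name__] = cls
    return cls


@register
class Scale:
    """Toy connector: multiplies its input by a constant factor."""

    def __init__(self, factor: float = 1.0):
        self.factor = factor

    def __call__(self, x: float) -> float:
        return x * self.factor

    def to_config(self) -> Tuple[str, Dict[str, Any]]:
        return type(self).__name__, {"factor": self.factor}


@register
class ClipReward:
    """Toy connector: clips its input to [-limit, limit]."""

    def __init__(self, limit: float = 1.0):
        self.limit = limit

    def __call__(self, x: float) -> float:
        return max(-self.limit, min(self.limit, x))

    def to_config(self) -> Tuple[str, Dict[str, Any]]:
        return type(self).__name__, {"limit": self.limit}


class Pipeline:
    """Applies a list of connectors in order."""

    def __init__(self, connectors):
        self.connectors = list(connectors)

    def __call__(self, x: float) -> float:
        for connector in self.connectors:
            x = connector(x)
        return x

    def to_config(self) -> List[Tuple[str, Dict[str, Any]]]:
        # Checkpoint: one (name, params) entry per connector, in order.
        return [c.to_config() for c in self.connectors]

    @staticmethod
    def from_config(config) -> "Pipeline":
        # Restore: look each name up in the registry and rebuild it.
        return Pipeline([_REGISTRY[name](**params) for name, params in config])


pipeline = Pipeline([Scale(0.5), ClipReward(1.0)])
# Round-trip through JSON to show the config is plain, serializable data.
blob = json.dumps(pipeline.to_config())
restored = Pipeline.from_config(json.loads(blob))
```

In this sketch the checkpoint holds only names and constructor parameters, so the restored pipeline behaves the same as the original as long as the same classes are registered on the restoring side.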

Related issue number

Checks

I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- [*] Unit tests
- Release tests
- This PR is not tested :(

sven1977 · 2022-07-01T09:22:03Z

rllib/connectors/connector.py

@@ -341,6 +343,11 @@ def insert_before(self, name: str, connector: Connector):
            raise ValueError(f"Can not find connector {name}")
        self.connectors.insert(idx, connector)

+        print(


sven1977 · 2022-07-01T09:22:07Z

rllib/connectors/connector.py

@@ -357,6 +364,11 @@ def insert_after(self, name: str, connector: Connector):
            raise ValueError(f"Can not find connector {name}")
        self.connectors.insert(idx + 1, connector)

+        print(


sven1977 · 2022-07-01T09:22:09Z

rllib/connectors/connector.py

@@ -365,6 +377,11 @@ def prepend(self, connector: Connector):
        """
        self.connectors.insert(0, connector)

+        print(


sven1977 · 2022-07-01T09:22:16Z

rllib/connectors/connector.py

@@ -373,6 +390,11 @@ def append(self, connector: Connector):
        """
        self.connectors.append(connector)

+        print(


sven1977 · 2022-07-01T09:24:07Z

rllib/examples/connectors/adapt_connector_policy.py

@@ -0,0 +1,118 @@
+"""This example script shows how to load a connector enabled policy,
+and adapt/use it with a different version of environment.


nit: "the environment"

sven1977 · 2022-07-01T09:24:44Z

rllib/examples/connectors/run_connector_policy.py

@@ -0,0 +1,54 @@
+"""This example script shows how to load a connector enabled policy,
+and adapt and use it with a different version of environment.


nit: "the environment" :)

updated the comment.
thanks.

sven1977 · 2022-07-01T09:25:37Z

rllib/policy/policy.py

@@ -740,8 +740,44 @@ def get_state(self) -> PolicyState:
            # The current global timestep.
            "global_timestep": self.global_timestep,
        }
+        if self.config.get("enable_connectors", False):


Can you add a one-line comment here, noting that we are adding the connector state?
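
As a rough illustration of the pattern under discussion, the sketch below shows a toy policy whose `get_state()` adds connector configs only when `enable_connectors` is set, and whose `set_state()` restores them if present. `ToyPolicy`, the `"connector_configs"` state key, and the `(name, params)` connector entries are assumptions made for this example; only the `enable_connectors` flag and the `"agent"` / `"action"` keys appear in the diff.

```python
from typing import Any, Dict, List, Tuple

ConnectorConfig = Tuple[str, Dict[str, Any]]


class ToyPolicy:
    """Hypothetical stand-in for a policy; not the RLlib Policy class."""

    def __init__(self, config: Dict[str, Any]):
        self.config = config
        self.global_timestep = 0
        # Connectors are represented here as plain (name, params) pairs.
        self.agent_connectors: List[ConnectorConfig] = [("ObsPreprocessor", {})]
        self.action_connectors: List[ConnectorConfig] = [
            ("ClipActions", {"low": -1.0, "high": 1.0})
        ]

    def get_state(self) -> Dict[str, Any]:
        state: Dict[str, Any] = {
            # The current global timestep.
            "global_timestep": self.global_timestep,
        }
        # Also checkpoint the connector pipelines when connectors are enabled.
        if self.config.get("enable_connectors", False):
            state["connector_configs"] = {
                "agent": list(self.agent_connectors),
                "action": list(self.action_connectors),
            }
        return state

    def set_state(self, state: Dict[str, Any]) -> None:
        self.global_timestep = state["global_timestep"]
        # Older checkpoints may not contain connector state at all.
        connector_configs = state.get("connector_configs")
        if connector_configs is not None:
            self.agent_connectors = list(connector_configs["agent"])
            self.action_connectors = list(connector_configs["action"])


source = ToyPolicy({"enable_connectors": True})
source.global_timestep = 7
target = ToyPolicy({"enable_connectors": True})
target.agent_connectors = []
target.set_state(source.get_state())
```

Guarding both directions keeps checkpoints from policies without connectors loadable, which is one reason to gate the extra state behind the flag.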

sven1977 · 2022-07-01T09:26:00Z

rllib/policy/policy.py

        return state

+    @ExperimentalAPI


Should we use PublicApi(alpha) here as well?

sven1977 · 2022-07-01T09:26:37Z

rllib/policy/policy.py

+            state: The new state to set this policy to. Can be
+                obtained by calling `self.get_state()`.
+        """
+        # To avoid a circular dependency problem caused by SampleBatch.


Note to our future selves: We should move SampleBatch out of the policy folder. It doesn't belong there.

yeah, totally, SampleBatch should be a pretty low-level util that depends on nothing.
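
For context, a common way to sidestep a circular import is to defer the import until the function that needs it actually runs. The sketch below shows only that general pattern and assumes nothing about the real module layout: `restore_from_json` is a hypothetical function, and the standard-library `json` module stands in for whichever module would otherwise create the cycle.

```python
from typing import Any


def restore_from_json(blob: str) -> Any:
    # Imported inside the function instead of at module top level, so the
    # import only happens on first call, after all modules finished loading.
    # `json` is just a placeholder for the module that would cause the cycle.
    import json

    return json.loads(blob)


result = restore_from_json('{"global_timestep": 7}')
```

The trade-off is that the dependency is hidden from the top of the file, which is part of why moving the shared class to a lower-level module is the cleaner long-term fix.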

sven1977 · 2022-07-01T09:27:13Z

rllib/policy/policy.py

+            self.agent_connectors = restore_connectors_for_policy(
+                self, connector_configs["agent"]
+            )
+            print("restoring agent connectors:")


logger.info

sven1977 · 2022-07-01T09:27:18Z

rllib/policy/policy.py

+                self, connector_configs["action"]
+            )
+            print("restoring action connectors:")
+            print(self.action_connectors.__str__(indentation=4))


logger.info
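
A minimal sketch of the suggested change, using only the standard `logging` module. The logger name, the `describe` helper, and the `log_restored` function are hypothetical; attaching a `StreamHandler` on stdout is one possible way to keep the messages visible on the console when log files are hard to reach, not something this PR is known to do.

```python
import logging
import sys
from typing import List

# Hypothetical logger name, used only for this example.
logger = logging.getLogger("connector_restore_demo")
logger.setLevel(logging.INFO)

# Send records to stdout so they show up alongside regular console output.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)


def describe(connectors: List[str], indentation: int = 0) -> str:
    """Render one connector name per line, indented by `indentation` spaces."""
    pad = " " * indentation
    return "\n".join(f"{pad}{name}" for name in connectors)


def log_restored(kind: str, connectors: List[str]) -> None:
    # Lazy %-style formatting: the message is only built if INFO is enabled.
    logger.info(
        "restoring %s connectors:\n%s", kind, describe(connectors, indentation=4)
    )


log_restored("action", ["ConvertToNumpy", "ClipActions"])
```

Unlike `print`, the message can be silenced or redirected by changing the logger level or handlers without touching the call site.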

sven1977 · 2022-07-01T09:27:51Z

rllib/policy/policy_map.py

-        else:
-            class_ = policy_cls
-            self[policy_id] = class_(observation_space, action_space, merged_config)
+        _class = get_tf_eager_cls_if_necessary(policy_cls, merged_config)


nice! thanks for cleaning this up and creating the utility function

sven1977

Awesome PR @gjoliver, thanks for the examples on connectors.
Just a few nits, then we can merge.

Oh, one last thing, we should add the new examples to BUILD!

gjoliver · 2022-07-07T08:27:35Z

ok, addressed all the comments. let's see if CI is happy.
I completely understand your comments about logger.info(). I was a bit hesitant because you often can't see log files on disk on Anyscale :).

Also added all the examples as unit tests.
I also added a multi-agent example, where we restore a TF PPO policy to train a new Torch SAC policy.


gjoliver requested review from sven1977, avnishn, ArturNiederfahrenhorst, smorad, maxpumperla, kouroshHakha and krfricke as code owners July 1, 2022 07:08

sven1977 reviewed Jul 1, 2022


sven1977 approved these changes Jul 1, 2022


gjoliver force-pushed the connector-3 branch from 9137df7 to c1ef350 Compare July 8, 2022 08:49

Jun Gong added 5 commits July 8, 2022 13:57

[RLlib] Checkpoint and restore connectors.

4288bc6

Plus a couple of examples showing the usage of connector enabled policies.

Bug fixes and self-play connector example.

23b5bc5

address comments

644d6d4

lint

f37816b

fix ci.

054ae87

gjoliver force-pushed the connector-3 branch from c1ef350 to 054ae87 Compare July 8, 2022 20:57

richardliaw merged commit 0c469e4 into ray-project:master Jul 9, 2022

Stefan-1313 pushed a commit to Stefan-1313/ray_mod that referenced this pull request Aug 18, 2022

[RLlib] Checkpoint and restore connectors. (ray-project#26253)

38b9b18

Signed-off-by: Stefan van der Kleij <s.vanderkleij@viroteq.com>
