
[xray] Adds a driver table. #2289


Merged: 30 commits into ray-project:master on Aug 9, 2018

Conversation

elibol
Contributor

@elibol elibol commented Jun 21, 2018

This PR adds a driver table for the new GCS, enabling the cleanup functionality that is triggered when the monitor detects driver death.

Some tests in monitor_test.py are restored, but Redis sharding for xray is needed to enable the remaining tests.

@@ -16,6 +16,8 @@
from ray.core.generated.DriverTableMessage import DriverTableMessage
from ray.core.generated.GcsTableEntry import GcsTableEntry
from ray.core.generated.HeartbeatTableData import HeartbeatTableData
from ray.core.generated.DriverTableData import DriverTableData
from ray.core.generated.ray.protocol.Task import Task
Collaborator

this should be in ray.gcs_utils

# common/redis_module/ray_redis_module.cc
OBJECT_INFO_PREFIX = b"OI:"
OBJECT_LOCATION_PREFIX = b"OL:"
TASK_TABLE_PREFIX = b"TT:"
DB_CLIENT_PREFIX = b"CL:"
DB_CLIENT_TABLE_NAME = b"db_clients"
XRAY_TASK_TABLE_PREFIX = b"TASK:"
XRAY_OBJECT_TABLE_PREFIX = b"OBJECT:"
Collaborator

these things are already in ray.gcs_utils


task_table_infos = {} # task id -> TaskInfo
for key in redis.scan_iter(match=XRAY_TASK_TABLE_PREFIX + b"*"):
entry = redis.get(key)
Collaborator

@concretevitamin, this will interact with automatic background flushing, right?

We should include a TODO here to address that (e.g., what if key gets flushed in the background while this is happening).

Contributor

Right. I think this just needs a simple check for whether redis.get(key) returns None.
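A minimal sketch of that guard (the names and the in-memory fake client are illustrative, standing in for a real Redis connection):

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client, for illustration."""

    def __init__(self, data):
        self.data = dict(data)

    def scan_iter(self, match):
        prefix = match.rstrip(b"*")
        return (k for k in self.data if k.startswith(prefix))

    def get(self, key):
        return self.data.get(key)


def scan_task_entries(client, prefix=b"TASK:"):
    """Collect raw task-table entries, skipping keys flushed mid-scan."""
    entries = {}
    for key in client.scan_iter(match=prefix + b"*"):
        entry = client.get(key)
        if entry is None:
            # The key was flushed between scan_iter and get; skip it.
            continue
        entries[key] = entry
    return entries
```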

task_table_infos = {} # task id -> TaskInfo
for key in redis.scan_iter(match=XRAY_TASK_TABLE_PREFIX + b"*"):
entry = redis.get(key)
task = Task.GetRootAsTask(entry, 0)
Collaborator

Can you actually parse it this way? Don't you need to parse it as a GcsTableEntry first?

task_info.Returns(i) for i in range(task_info.ReturnsLength())
])

# TODO: Remove only objects in GCS that originated from this driver.
Collaborator

Are you saying we don't have enough information in the GCS to flush objects created by ray.put? From the object ID, we can determine the task ID, and from that we can determine the Driver ID, right? So I think we should be able to flush them already.

Collaborator

Also, let's make sure that we're testing cleanup for ray.put objects.

Contributor Author
@elibol elibol Jun 21, 2018

The Redis entries for objects do not contain the task ID.

I think testing cleanup for ray.put is a substantial amount of work and may make sense to do as a separate PR.

Collaborator

What I mean is that the object ID itself tells you the task ID, right?

ray/src/ray/id.cc

Lines 166 to 174 in aa42331

const TaskID ComputeTaskId(const ObjectID &object_id) {
TaskID task_id = object_id;
int64_t *first_bytes = reinterpret_cast<int64_t *>(&task_id);
// Zero out the lowest kObjectIdIndexSize bits of the first byte of the
// object ID.
uint64_t bitmask = static_cast<uint64_t>(-1) << kObjectIdIndexSize;
*first_bytes = *first_bytes & (bitmask);
return task_id;
}
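In Python terms, the masking above does roughly the following (a hypothetical pure-Python sketch; the actual binding used later in this PR is ray.local_scheduler.compute_task_id. The C++ version casts the first int64 in native byte order, assumed little-endian here, and kObjectIdIndexSize's value isn't shown, so the bit width is a parameter):

```python
def compute_task_id(object_id_bytes, index_bits):
    """Zero the lowest `index_bits` bits of the first 8 bytes of the ID,
    mirroring ComputeTaskId above (little-endian assumed)."""
    first = int.from_bytes(object_id_bytes[:8], "little")
    # Equivalent to bitmask = uint64(-1) << index_bits in the C++ code.
    first &= ~((1 << index_bits) - 1)
    return first.to_bytes(8, "little") + object_id_bytes[8:]
```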

Contributor Author

Excellent point. I was not aware of this method.

driver_id: string;
// Whether it's dead.
is_dead: bool;
}
Collaborator

Missing newline here.

};
virtual ~DriverTable() {}

Status AppendDriverData(UniqueID driver_id, bool is_dead) {
Collaborator

Should be JobID instead of UniqueID

HandleDriverTableUpdate(client_id, driver_data);
};
RAY_RETURN_NOT_OK(gcs_client_->driver_table().Subscribe(
UniqueID::nil(), UniqueID::nil(), driver_table_handler, nullptr));
Collaborator

The first one should be JobID::nil()

@@ -44,6 +44,8 @@ def Driver(success):
# Two new objects.
ray.get(ray.put(1111))
ray.get(ray.put(1111))
# Test is flaky without calls to sleep.
Collaborator

What are the sleeps needed for? Can we get rid of them?

Contributor Author

The test sometimes fails without the wait in place.

Collaborator

Why does it fail? Does this mean the cleanup implementation is buggy?

Contributor Author

ray.get completes before information about the object's location is propagated to the GCS, so the count returned by StateSummary() immediately after a ray.get is incorrect.

For comparison: if the object were remote, ray.get would complete only after the data propagates to the GCS, since the associated pull depends on information in the GCS.

Collaborator

Why weren't the sleeps necessary before?

Anyway, if the tests are flaky without the sleeps, then they will probably continue to be flaky even with the sleeps, so we should use a more robust approach (e.g., loop until the information is present (with a timeout that raises an exception)).
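One way to sketch that more robust approach (the helper name wait_for_condition is illustrative, not from this PR):

```python
import time


def wait_for_condition(condition, timeout_s=10.0, retry_interval_s=0.1):
    """Poll `condition` until it returns True, raising if `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Condition not met within %s seconds." % timeout_s)
        time.sleep(retry_interval_s)
```

In the test this would replace the sleeps with something like wait_for_condition(lambda: StateSummary()[:2] == (0, 1)), so the test waits exactly as long as propagation takes and still fails loudly on a real bug.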

def testTaskCleanupOnDriverExitSingleRedisShard(self):
self._testCleanupOnDriverExit(num_redis_shards=1, test_object_cleanup=False)

@unittest.skipIf(
Collaborator

I think we should be able to re-enable this.

Contributor Author

I believe re-enabling this entails a substantial amount of work. It might make sense to re-enable as a separate PR.

@@ -79,23 +83,43 @@ def f():
ray.init(redis_address=redis_address)
# The assertion below can fail if the monitor is too slow to clean up
# the global state.
self.assertEqual((0, 1), StateSummary()[:2])

if test_object_cleanup:
Collaborator

Can you explain this if statement?

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6176/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6177/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6178/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6179/

@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6183/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6180/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6182/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6186/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6187/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6263/

@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6265/

@elibol elibol force-pushed the add_xray_driver_table branch from f0bc4f4 to 406a9ad Compare July 2, 2018 22:07
@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6428/

@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6879/

@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6882/

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/6881/

Collaborator

@robertnishihara robertnishihara left a comment

Thanks @elibol, this looks pretty good. I left some questions/comments. @eric-jj want to take a look also?

@@ -168,7 +168,7 @@ def _keys(self, pattern):
"""
result = []
for client in self.redis_clients:
result.extend(client.keys(pattern))
result.extend([i for i in client.scan_iter(match=pattern)])
Collaborator

Is the "i for i in" necessary? If so, then maybe just use result.extend(list(client.scan_iter(match=pattern)))
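For what it's worth, list.extend accepts any iterable directly, so neither the comprehension nor list() is strictly needed; a quick sketch:

```python
result = []
result.extend(x for x in range(3))  # a generator works directly
result.extend(iter([3, 4]))         # so does any other iterable
# result is now [0, 1, 2, 3, 4]
```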

for task_id_hex in task_table_objects:
if len(task_table_objects[task_id_hex]) == 0:
continue
task_table_object = task_table_objects[task_id_hex][0]['TaskSpec']
Collaborator

Minor, but we've been pretty consistent about using " everywhere instead of ' for strings, so let's stick with that (this applies to a number of places in this PR).

};
virtual ~DriverTable() {}

Status AppendDriverData(JobID driver_id, bool is_dead) {
Collaborator

JobID -> const JobID &

};
virtual ~DriverTable() {}

Status AppendDriverData(JobID driver_id, bool is_dead) {
Collaborator

Add doxygen documentation

RAY_LOG(DEBUG) << "Driver entry added callback";
});
}
};
Collaborator

Maybe put the implementation in the .cc file?

# Get the list of objects returned by driver tasks.
driver_object_ids = set()
for task_info in task_table_infos.values():
driver_object_ids |= {
Collaborator

What is |=?

Contributor

| is the set union operator; |= performs an in-place union, equivalent to set.update.
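A quick sketch of the operator in action (the IDs are made up):

```python
driver_object_ids = set()
driver_object_ids |= {b"obj1", b"obj2"}  # in-place union, same as .update()
driver_object_ids |= {b"obj2", b"obj3"}  # duplicates are merged away
# driver_object_ids is now {b"obj1", b"obj2", b"obj3"}
```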

# Also record all the ray.put()'d objects.
all_put_objects = []
object_table_objects = self.state.object_table()
for object_id in object_table_objects:
Collaborator

Loop over keys and values together (e.g., with .items()) to avoid unnecessary indexing.

object_table_objects = self.state.object_table()
for object_id in object_table_objects:
if len(object_table_objects[object_id]) == 0:
continue
Collaborator

We can probably assert that the length is > 0 instead.

continue
object_id_bin = object_id.id()
task_id_bin = ray.local_scheduler.compute_task_id(object_id).id()
all_put_objects.append((object_id_bin, task_id_bin))
Collaborator

this seems to be all objects, not just all put objects, right?


# Keep objects from relevant tasks.
driver_task_ids_set = set(driver_task_id_bins)
for object_id, task_id in all_put_objects:
Collaborator

If this actually covers all objects and not just put objects, then maybe we don't need to loop through the task return values above, is that right?

@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7107/

@elibol
Contributor Author

elibol commented Aug 8, 2018

jenkins, retest this please

@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7339/

@elibol
Contributor Author

elibol commented Aug 8, 2018

jenkins, retest this please

@@ -100,7 +100,7 @@ def __init__(self, redis_address, redis_port, autoscaling_config):
ignore_subscribe_messages=True)
self.shard_subscribe_clients.append(subscribe_client)
else:
# We don't need to subscribe to the shards in legacy Ray.
# We don`t need to subscribe to the shards in legacy Ray.
Collaborator

` -> '

assert len(object_table_object) > 0
object_id_bin = object_id.id()
task_id_bin = ray.local_scheduler.compute_task_id(object_id).id()
all_objects.append((object_id_bin, task_id_bin))
Collaborator

Instead of constructing this all_objects list, we can probably just do

if task_id_bin in driver_task_id_bins:
    driver_object_id_bins.add(object_id)

here, right?
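Sketched with hypothetical data, the suggested restructuring would look like:

```python
driver_task_id_bins = {b"task-A", b"task-B"}  # hypothetical driver task IDs
object_to_task = {                            # object ID -> owning task ID
    b"obj-1": b"task-A",
    b"obj-2": b"task-C",                      # owned by some other driver
    b"obj-3": b"task-B",
}

# Filter directly while iterating instead of building an all_objects list.
driver_object_id_bins = set()
for object_id_bin, task_id_bin in object_to_task.items():
    if task_id_bin in driver_task_id_bins:
        driver_object_id_bins.add(object_id_bin)
# driver_object_id_bins is now {b"obj-1", b"obj-3"}
```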

///
/// \param driver_id The driver id.
/// \param is_dead Whether the driver is dead.
/// \return The return status of Log.Append.
Collaborator

Minor, but I'd change this to just say "The return status." The interface documentation shouldn't refer to implementation details.

@@ -267,6 +267,10 @@ class ObjectManager : public ObjectManagerInterface {
/// Register object remove with directory.
void NotifyDirectoryObjectDeleted(const ObjectID &object_id);

/// Handle any push requests that were made before an object was available.
/// This is invoked when an "object added" notification is received from the store.
void HandleUnfulfilledPushRequests(const ObjectInfoT &object_info);
Collaborator

@elibol can you answer the above question?

@AmplabJenkins
Test FAILed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7340/

@robertnishihara
Collaborator

Looks good to me, I'll merge this once the tests pass.

@AmplabJenkins
Test PASSed. Build results (CI server access needed): https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7348/

@robertnishihara robertnishihara merged commit 8ae8218 into ray-project:master Aug 9, 2018
@robertnishihara
Collaborator

Nice work @elibol!

@zhijunfu
Contributor

zhijunfu commented Aug 30, 2018

@elibol This is nice. I'm working on job management in Ray, and @robertnishihara pointed out that I could leverage this for job cancellation.

I saw that the code for actually performing cleanup at driver death is currently marked as a TODO.

Wondering if you have plans to finish this part sometime soon? Or if you're busy with other stuff, maybe I could take some time to work on this?

@elibol
Contributor Author

elibol commented Aug 30, 2018

@zhijunfu that sounds great. Please do have a look into it.

@robertnishihara
Collaborator

Thanks @zhijunfu, could you share an outline of what you plan on implementing?

@zhijunfu
Contributor

@elibol Yes, I will look into it. Thanks.

@zhijunfu
Contributor

@robertnishihara Sure. I just raised an issue here with a proposal doc.

@robertnishihara
Collaborator

@zhijunfu the driver cleanup should be implemented separately from (and before) the job submission work, right?

@zhijunfu
Contributor

zhijunfu commented Sep 3, 2018

@robertnishihara Yes, the cleanup code is mostly done. I will do some more testing, submit it for internal review, and send a PR after that. :)
