Adding basic support for a user-interpretable resource label #761

atumanov · 2017-07-20T22:12:42Z

This PR lets each local scheduler be configured with a quantity of an arbitrary, user-interpretable resource, and lets tasks request units of it. A local scheduler can also be given infinite capacity of the resource.

This is an experimental first pass at addressing #695. There will be API changes down the road.
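
As a rough illustration of the accounting described above, here is a minimal sketch in plain Python. It is not the code in this PR: the class name `CustomResourcePool` and its methods are hypothetical, and using `math.inf` for infinite capacity is an assumption made for the example.

```python
import math


class CustomResourcePool:
    """Hypothetical per-node accounting for one custom resource."""

    def __init__(self, capacity):
        # math.inf models the "infinite capacity" case (an assumption here).
        self.capacity = capacity
        self.in_use = 0

    def try_acquire(self, amount):
        """Reserve `amount` units; return False if not enough are free."""
        if self.in_use + amount > self.capacity:
            return False
        self.in_use += amount
        return True

    def release(self, amount):
        """Give back `amount` units held by a finished task."""
        self.in_use = max(0, self.in_use - amount)


pool = CustomResourcePool(capacity=1)
print(pool.try_acquire(1))  # the first task takes the only unit
print(pool.try_acquire(1))  # a second task cannot start yet
pool.release(1)
print(pool.try_acquire(1))  # the unit is free again
```

With `capacity=math.inf` the comparison in `try_acquire` never fails, so every request is granted.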

AmplabJenkins · 2017-07-26T06:55:13Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-07-26T06:55:13Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1422/
Test PASSed.

AmplabJenkins · 2017-07-26T10:37:42Z

Merged build finished. Test FAILed.

AmplabJenkins · 2017-07-26T10:37:43Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1423/
Test FAILed.

AmplabJenkins · 2017-07-26T11:05:33Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-07-26T11:05:34Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1424/
Test PASSed.

robertnishihara · 2017-07-26T19:55:58Z

Want to add some tests showing how to use this?

atumanov · 2017-07-26T20:18:30Z

yeah, I'll add some unit tests to exercise this. I've been testing it like this:

#!/usr/bin/env python
import ray
import time

@ray.remote(num_uirs=1)
def f():
  time.sleep(10)
  return 1

@ray.remote(num_uirs=1)
def g():
  return 2

ray.init(num_uirs=1)

# Both f and g need the single unit configured in ray.init, so they
# cannot run concurrently.
oid = f.remote()
oids = [g.remote() for _ in range(10)]
t1 = time.time()
results = ray.get(oids)
t2 = time.time()
print(t2 - t1, results)
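
To see why the timing in the script above exercises the resource constraint without needing a cluster, here is a thread-based analogy. It is only a sketch: a `threading.Semaphore` stands in for the single unit of the resource, the `run` helper and its parameters are hypothetical, and nothing here calls Ray.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run(hold_seconds=0.2, num_g=10):
    """Time num_g short tasks that share one resource unit with a slow task."""
    resource = threading.Semaphore(1)  # stands in for one unit of the resource
    f_holds_resource = threading.Event()

    def f():
        with resource:
            f_holds_resource.set()
            time.sleep(hold_seconds)  # hold the unit, like f's sleep
        return 1

    def g():
        with resource:  # blocks until the unit is free
            return 2

    with ThreadPoolExecutor(max_workers=num_g + 1) as pool:
        t1 = time.monotonic()
        f_future = pool.submit(f)
        f_holds_resource.wait()  # make sure f took the unit first
        g_futures = [pool.submit(g) for _ in range(num_g)]
        results = [fut.result() for fut in g_futures]
        t2 = time.monotonic()
        f_future.result()
    return t2 - t1, results


elapsed, results = run()
print(elapsed, results)
```

Because every `g` needs the unit that `f` is holding, the elapsed time is at least roughly `hold_seconds`, even though each `g` returns immediately once it runs.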

AmplabJenkins · 2017-08-08T04:02:15Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-08-08T04:02:16Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1568/
Test PASSed.

AmplabJenkins · 2017-08-08T05:32:12Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-08-08T05:32:12Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1571/
Test PASSed.

AmplabJenkins · 2017-08-08T05:57:40Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-08-08T05:57:41Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1572/
Test PASSed.

AmplabJenkins · 2017-08-08T06:12:27Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-08-08T06:12:28Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1573/
Test PASSed.

AmplabJenkins · 2017-08-08T06:37:30Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-08-08T06:37:31Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1574/
Test PASSed.

AmplabJenkins · 2017-08-08T07:06:08Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-08-08T07:06:08Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1575/
Test PASSed.

pcmoritz · 2017-08-08T08:34:51Z

python/ray/worker.py

@@ -1296,6 +1305,9 @@ def init(redis_address=None, node_ip_address=None, object_id_seed=None,
            be configured with.
        num_gpus (int): Number of gpus the user wishes all local schedulers to
            be configured with.
+        num_custom_resource (int): The quantity of a user-defined custom
+            resource that the local scheduler should be configured with. This
+            flag is highly unstable and should not be used.


Instead of "unstable", let's say "support for this will be removed" or "experimental"; "unstable" has more of a connotation of not working reliably.

"Will be removed" sounds like it has been deprecated. I hope it will not just be removed, but maybe matured into something else.

Maybe just say "experimental feature subject to changes in the future". Kubernetes has this for GPU support: https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/

yeah, you are right, sounds good!

mitar · 2017-08-08T08:46:17Z

BTW, your GitHub tagline is "An experimental distributed execution engine". So this is then an experimental experimental feature? ;-)

Maybe you should find a better tagline. Like "an awesome distributed execution engine". :-)

AmplabJenkins · 2017-08-08T09:02:03Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-08-08T09:02:04Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1577/
Test PASSed.

mitar · 2017-08-08T10:10:04Z

Thanks!

robertnishihara · 2017-08-08T17:16:59Z

For example, this can be used as follows.

Start three machines like this:

ray start --head --redis-port=6379

ray start --redis-address 172.31.10.143:6379 --num-custom-resource=10

ray start --redis-address 172.31.10.143:6379

Define a remote function that uses some of the "custom resource":

import ray

ray.init(redis_address="172.31.10.143:6379")

@ray.remote(num_custom_resource=1)
def f():
    import time
    time.sleep(0.01)
    return ray.services.get_node_ip_address()

print(set(ray.get([f.remote() for _ in range(1000)])))

The print statement should show a single IP address, that of the second machine, because it is the only machine started with any "custom resource". To start a machine with an infinite quantity of the "custom resource", pass --num-custom-resource=-1.
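
The placement outcome described above can be sketched as a simple feasibility check. This is an illustration only, not the scheduler's actual logic: `parse_capacity` and `feasible_nodes` are hypothetical helpers, the node names are made up, and it assumes that a machine started without the flag has zero units.

```python
import math


def parse_capacity(flag_value):
    """Map a --num-custom-resource value to a capacity; -1 means infinite."""
    return math.inf if flag_value == -1 else flag_value


def feasible_nodes(capacities, demand):
    """Return the nodes whose capacity covers a task's demand."""
    return [node for node, cap in capacities.items() if cap >= demand]


# Hypothetical node names; "node-2" mirrors --num-custom-resource=10.
# Assumption: machines started without the flag have zero units.
cluster = {"head": 0, "node-2": parse_capacity(10), "node-3": 0}
print(feasible_nodes(cluster, demand=1))
```

Only `node-2` qualifies for a task that requests one unit, which matches the single IP address expected from the example. A node configured with `-1` would qualify for any demand.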

atumanov force-pushed the uirlabel branch from 0b3f367 to 282486f Compare July 26, 2017 06:43

atumanov changed the title ~~[WIP] Adding basic support for a user-interpretable resource label~~ Adding basic support for a user-interpretable resource label Jul 26, 2017

robertnishihara force-pushed the uirlabel branch from ae9a294 to 17a805e Compare August 8, 2017 03:46

robertnishihara mentioned this pull request Aug 8, 2017

Scheduling based on labels #695

Closed

atumanov and others added 8 commits August 7, 2017 21:39

adding support for the user-interpretable label(UIR)

b57178f

more plumbing for num_uirs further upstream; set to infty when specif…

df863c4

…ied on cmd line

pass default num_uirs for actors; update GlobalStateAPI

1929304

support num_uirs in ray.init()

e7bed16

local scheduler resource accounting: support num_uirs; prep for vecto…

4824b3b

…rized resource accounting

global scheduler test updated

feb40f9

Fix bug introduced by rebase.

fff863e

Rename UIR -> CustomResource and add test.

cc96ecc

robertnishihara force-pushed the uirlabel branch from 17a805e to cc96ecc Compare August 8, 2017 05:14

robertnishihara added 2 commits August 7, 2017 22:40

Small changes and use constexpr instead of macros.

73acb67

Linting and some renaming.

b792ee7

robertnishihara added 2 commits August 7, 2017 23:14

Reorder some code.

64c54f6

Remove cpus_in_use and fix bug.

6573da2

Add another test and make a small change.

bfed77e

pcmoritz reviewed Aug 8, 2017

Rephrase documentation about feature stability.

50abe2e

pcmoritz merged commit fc885bd into ray-project:master Aug 8, 2017

pcmoritz deleted the uirlabel branch August 8, 2017 09:54

richardliaw mentioned this pull request Sep 10, 2017

Specifying Actor placement? #959

Closed
