`ray.experimental.serve` Module #4095

simon-mo · 2019-02-19T08:54:43Z

This PR proposes a new module named ray.serve.

ray.serve is a module for publishing your actors so they can interact with the outside world. It uses Ray to scale your actors horizontally.

Architecture

In the following illustration, the call chain goes from top to bottom.
Each box is one or more replicated Ray actors.

            +-------------------+     +-----------------+   +------------+
Frontend     |   HTTP Frontend   |     |    Arrow RPC    |   |    ...     |
 Tier       |                   |     |                 |   |            |
            +-------------------+     +-----------------+   +------------+
             +------------------------------------------------------------+
                  +--------------------+        +-------------------+
Router           |   Default Router   |        |   Deadline Aware  |
 Tier            |                    |        |      Router       |
                 +--------------------+        +-------------------+
             +------------------------------------------------------------+
                 +----------------+   +--------------+    +-------------+
Managed         |  Managed Actor |   |     ...      |    |     ...     |
Actor           |    Replica     |   |              |    |             |
Tier            +----------------+   +--------------+    +-------------+

Frontend Tier

The frontend tier is responsible for interfacing with the outside world. Currently ray.serve will provide
implementations of an HTTP frontend and a high-performance ZeroMQ frontend.

Router Tier

The router tier receives calls from the frontend and routes them to the managed actors. Routers both route and queue incoming queries. ray.serve has native support for (micro-)batching queries.

In addition, we implemented a deadline-aware router that puts high-priority queries at the front
of the queue so they are delivered first.
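
To make the queueing behaviour concrete, here is a minimal sketch of a deadline-ordered queue with micro-batching, written in plain Python without Ray. The names `Query`, `DeadlineQueue`, `push`, and `pop_batch` are hypothetical and chosen for illustration; the router in this PR may be structured differently.

```python
import heapq
import itertools
from functools import total_ordering


@total_ordering
class Query:
    """A queued call, ordered by deadline (earlier deadline = higher priority)."""

    _counter = itertools.count()

    def __init__(self, payload, deadline):
        self.payload = payload
        self.deadline = deadline
        # Tie-breaker so queries with equal deadlines keep their arrival order.
        self.seq = next(Query._counter)

    def _key(self):
        return (self.deadline, self.seq)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()


class DeadlineQueue:
    """Hypothetical queue: a binary heap keyed on each query's deadline."""

    def __init__(self):
        self._heap = []

    def push(self, payload, deadline):
        heapq.heappush(self._heap, Query(payload, deadline))

    def pop_batch(self, max_batch_size):
        """Pop up to max_batch_size payloads, earliest deadline first."""
        batch = []
        while self._heap and len(batch) < max_batch_size:
            batch.append(heapq.heappop(self._heap).payload)
        return batch


queue = DeadlineQueue()
queue.push("far", deadline=10.0)
queue.push("soon", deadline=1.0)
queue.push("also soon", deadline=1.0)
print(queue.pop_batch(2))  # ['soon', 'also soon']
```

The query with the later deadline stays queued until the next batch, even though it arrived first.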

Managed Actor Tier

Managed actors are managed by routers. These actors can contain arbitrary methods. Methods in the actor class are assumed to accept a batch of inputs at a time. If this cannot be assumed, you can use the @single_input decorator; ray.serve will then run your method in a for loop over the micro-batch.
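
As a rough illustration of the batch-versus-single-input distinction, the sketch below shows one way a decorator could adapt a one-input method to a batch. This is an assumption for illustration only: the actual `@single_input` in this PR may simply tag the function and leave the looping to ray.serve. The class `Doubler` and its method names are hypothetical.

```python
from functools import wraps


def single_input(func):
    """Wrap a method that handles one input so it accepts a batch (a list)."""

    @wraps(func)
    def batched(self, batch):
        # Apply the single-input method to each item of the micro-batch.
        return [func(self, item) for item in batch]

    return batched


class Doubler:
    """Hypothetical actor class with one batched and one single-input method."""

    def predict_batch(self, batch):
        # Batch-aware method: receives the whole micro-batch at once.
        return [2 * x for x in batch]

    @single_input
    def predict_one(self, x):
        # Written for one input; the decorator supplies the loop.
        return 2 * x


model = Doubler()
print(model.predict_batch([1, 2, 3]))  # [2, 4, 6]
print(model.predict_one([1, 2, 3]))    # [2, 4, 6]
```

Either way, the caller always passes a batch, so the router does not need to know which style a method uses.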

cc @atumanov @robertnishihara

P.S.
This is a re-implementation of an earlier prototype

AmplabJenkins · 2019-02-19T17:43:06Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12107/
Test FAILed.

robertnishihara

Awesome stuff! Left some comments/questions.

robertnishihara · 2019-02-20T00:35:14Z

python/ray/serve/README.md

@@ -0,0 +1,63 @@
+# Ray Serve Module


Use rst instead of markdown

robertnishihara · 2019-02-20T00:35:45Z

python/ray/serve/Makefile

@@ -0,0 +1,6 @@
+format:


this file should be removed

robertnishihara · 2019-02-20T00:36:09Z

python/ray/serve/.gitignore

@@ -0,0 +1,3 @@
+.vscode
+__pycache__
+.benchmarks


Put these in the top level gitignore

robertnishihara · 2019-02-20T00:39:07Z

python/ray/serve/__init__.py

@@ -0,0 +1,12 @@
+"""


from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

at the top of all Python files

robertnishihara · 2019-02-20T00:41:57Z

python/ray/serve/tests/router/test_deadline_router.py

+
+@pytest.fixture(scope="module")
+def router():
+    ray.init()


we need ray.shutdown() to run in the teardown
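
A minimal sketch of the teardown pattern being asked for, assuming a pytest yield-style fixture. The helper name `runtime_session` is hypothetical, and it takes the runtime as an argument so the pattern can be shown without importing Ray; in the test module the argument would be the `ray` module itself.

```python
def runtime_session(runtime):
    """Start a runtime, hand it to the caller, and always shut it down.

    `runtime` is any object exposing init() and shutdown(); passing the
    `ray` module is the intended use. The helper name is hypothetical.
    """
    runtime.init()
    try:
        yield runtime
    finally:
        # Runs at fixture teardown, even if a test in the module failed.
        runtime.shutdown()


# Assumed wiring inside the test module (needs pytest and ray installed):
#
#     @pytest.fixture(scope="module")
#     def ray_start():
#         yield from runtime_session(ray)
```

Because the shutdown call sits after the `yield`, pytest runs it once all tests that use the fixture have finished.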

robertnishihara · 2019-02-20T04:43:17Z

python/ray/serve/router/routers.py

+
+@ray.remote
+class DeadlineAwareRouter:
+    def __init__(self, router_name):


class doc string and method doc strings also

robertnishihara · 2019-02-20T04:43:29Z

python/ray/serve/mixin.py

+    return func
+
+
+class RayServeMixin:


class doc string

also, can you explain what the name means?

robertnishihara · 2019-02-20T04:43:36Z

python/ray/serve/frontend/HTTPFrontend.py

+
+
+@ray.remote
+class HTTPFrontendActor:


class doc string

robertnishihara · 2019-02-20T04:44:07Z

python/ray/serve/router/__init__.py

+import ray
+
+
+def start_router(router_class, router_name):


robertnishihara · 2019-02-20T04:44:13Z

python/ray/serve/router/routers.py

+
+
+@total_ordering
+class SingleQuery:


class doc string

jovany-wang · 2019-02-20T15:03:17Z

My concern is that the Python HTTP server implemented with Starlette may not meet the performance requirements for such a real-time scenario.

simon-mo · 2019-02-20T18:11:43Z

Starlette combined with uvicorn does have better performance than Tornado. That's why we are considering it. https://www.techempower.com/benchmarks/#section=data-r17&hw=ph&test=fortune&l=zijzen-1

AmplabJenkins · 2019-02-20T20:29:38Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12160/
Test FAILed.

jovany-wang · 2019-02-25T10:36:37Z

@simon-mo Thanks for your reply.
I'd like to test the performance of uvicorn and some C++ HTTP frameworks.

AmplabJenkins · 2019-03-06T21:02:35Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12623/
Test FAILed.

robertnishihara · 2019-03-06T21:08:43Z

python/ray/experimental/serve/README.md

@@ -0,0 +1,63 @@
+# Ray Serve Module


This file is redundant.

AmplabJenkins · 2019-03-06T23:55:28Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12622/
Test FAILed.

python/ray/experimental/serve/README.rst

AmplabJenkins · 2019-03-07T02:17:13Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12628/
Test FAILed.

AmplabJenkins · 2019-03-07T03:19:11Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12629/
Test FAILed.

AmplabJenkins · 2019-03-07T03:32:39Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12630/
Test FAILed.

robertnishihara · 2019-03-08T07:56:53Z

When I run python -m pytest -v -s python/ray/experimental/serve/tests/, the test frontend/test_default_app.py::test_http_basic seems to hang. It succeeds when I run the test individually. Any idea about this?

robertnishihara · 2019-03-08T07:57:57Z

I also see this failure

python/ray/experimental/serve/tests/router/test_deadline_router.py::test_deadline_priority FAILED

================================================== FAILURES ==================================================
___________________________________________ test_deadline_priority ___________________________________________

router = Actor(DeadlineAwareRouter, cc9b8191335004e9e1ef56e110e4b5fbaf9f3dc9), now = 405663.805075376

    def test_deadline_priority(router: DeadlineAwareRouter, now: float):
        # first sleep 2 seconds
        first = unwrap(router.call.remote("SleepCounter", 2, now + 1))
    
        # then send a request to with deadline farther away
        second = unwrap(router.call.remote("SleepCounter", 0, now + 10))
    
        # and a request with sooner deadline
        third = unwrap(router.call.remote("SleepCounter", 0, now + 1))
    
        id_1, id_2, id_3 = ray.get([first, second, third])
    
>       assert id_1 < id_3 < id_2
E       assert 3 < 2

python/ray/experimental/serve/tests/router/test_deadline_router.py:88: AssertionError
===================================== 1 failed, 4 passed in 7.15 seconds =====================================

simon-mo · 2019-03-08T08:53:54Z

@robertnishihara can we just call the tests separately? I have been debugging this for some time now and still can't fix it. The only clue I have is this warning:

2019-03-08 00:49:58,104	WARNING actor.py:652 -- Actor is garbage collected in the wrong driver. Actor id = ActorID(2cd834f87ff272c3a9a34ff74fe83d33ef001877), class name = ScalerAdder.

The ordering issue was due to oversubscription; it should be fixed now.

AmplabJenkins · 2019-03-08T10:52:32Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12695/
Test FAILed.

AmplabJenkins · 2019-03-08T11:56:53Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12698/
Test FAILed.

robertnishihara

Thanks @simon-mo, this looks great! I'll merge this once the CI passes.

As discussed offline, we should treat uvicorn and starlette as placeholders for the time being and not make any architectural decisions that tie us to them.

robertnishihara · 2019-03-08T20:27:09Z

We should probably also run these tests in our CI, but that will require upgrading to Python 3.6, since starlette requires 3.6. That can be done in a follow-up PR.

AmplabJenkins · 2019-03-08T23:23:02Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12708/
Test FAILed.

pcmoritz · 2019-03-09T00:40:16Z

Can we rename ray.serve to ray.serving in a future PR? That seems more natural. What do you think?

simon-mo added 2 commits February 18, 2019 01:58

Start porting

8e21ea6

Start re-implement corvette

1fdcc4b

robertnishihara reviewed Feb 20, 2019

simon-mo added 2 commits February 20, 2019 10:49

Add more models to testing

c477686

Add more tests

a9e2cb5

Move to ray.experimental

7937e03

simon-mo changed the title from "[WIP] ray.serve Module" to "ray.experimental.serve Module" on Mar 6, 2019

simon-mo marked this pull request as ready for review March 6, 2019 20:54

simon-mo added 4 commits March 6, 2019 12:55

Move gitignore to ray root

8615980

Remove makefile

ffc78fd

init_ray -> ray_start

75f3865

Address some comments

5166cfa

robertnishihara reviewed Mar 6, 2019

robertnishihara self-assigned this Mar 6, 2019

simon-mo and others added 10 commits March 6, 2019 14:54

Start porting

21c1c9e

Start re-implement corvette

d139eeb

Add more models to testing

0d9c654

Add more tests

92d27c5

Move to ray.experimental

668b812

Move gitignore to ray root

92adeb8

Remove makefile

32952af

init_ray -> ray_start

46986c4

Address some comments

479ba4e

Linting.

28a8fc3

robertnishihara force-pushed the ray-serve branch from 5166cfa to 28a8fc3 Compare March 6, 2019 23:30

simon-mo added 3 commits March 6, 2019 15:58

Add http_frontend

9282706

Revert runtest

5ea89e8

Format code, remove README.md

5f03e75

robertnishihara reviewed Mar 7, 2019

python/ray/experimental/serve/README.rst (outdated, resolved)

simon-mo added 4 commits March 6, 2019 16:11

Remove trailing whitespace

d08a75e

Merge branch 'ray-serve' of github.com:simon-mo/ray into ray-serve

6898748

Add imports back

ca25589

Fix readme

410d529

Linting

ea62de8

fix test

ba73b76

Linting.

c08429e

robertnishihara approved these changes Mar 8, 2019

robertnishihara merged commit 3064fad into ray-project:master Mar 9, 2019
