ray.experimental.serve Module #4095

Merged: 29 commits, Mar 9, 2019

simon-mo
Contributor

This PR proposes a new module named ray.serve.

ray.serve is a module for publishing your actors so they can interact with the outside world. It uses Ray to scale your actors horizontally.

Architecture

In the following illustration, the call chain goes from top to bottom.
Each box represents one or more replicated Ray actors.

             +-------------------+     +-----------------+   +------------+
Frontend     |   HTTP Frontend   |     |    Arrow RPC    |   |    ...     |
 Tier        |                   |     |                 |   |            |
             +-------------------+     +-----------------+   +------------+
             +------------------------------------------------------------+
                 +--------------------+        +-------------------+
Router           |   Default Router   |        |   Deadline Aware  |
 Tier            |                    |        |      Router       |
                 +--------------------+        +-------------------+
             +------------------------------------------------------------+
                +----------------+   +--------------+    +-------------+
Managed         |  Managed Actor |   |     ...      |    |     ...     |
Actor           |    Replica     |   |              |    |             |
Tier            +----------------+   +--------------+    +-------------+

Frontend Tier

The frontend tier is responsible for interfacing with the outside world. Currently, ray.serve will provide
implementations for an HTTP frontend and a high-performance ZeroMQ frontend.

Router Tier

The router tier receives calls from the frontend and routes them to the managed actors. Routers both route and queue incoming queries. ray.serve has native support for (micro-)batching queries.
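
As a rough illustration of the queue-then-batch behavior described above, here is a minimal sketch in plain Python. It is not the ray.serve implementation: `MicroBatchQueue`, `enqueue`, `next_batch`, and `max_batch_size` are hypothetical names chosen for this example.

```python
from collections import deque


class MicroBatchQueue:
    """Hypothetical FIFO queue that hands out queries in small batches."""

    def __init__(self, max_batch_size=4):
        self.max_batch_size = max_batch_size
        self._pending = deque()

    def enqueue(self, query):
        """Buffer one incoming query."""
        self._pending.append(query)

    def next_batch(self):
        """Pop up to max_batch_size queries, oldest first."""
        size = min(self.max_batch_size, len(self._pending))
        return [self._pending.popleft() for _ in range(size)]


queue = MicroBatchQueue(max_batch_size=2)
for q in ["a", "b", "c"]:
    queue.enqueue(q)

first_batch = queue.next_batch()   # the two oldest queries
second_batch = queue.next_batch()  # whatever is left
```

A real router would presumably also decide when to flush a partial batch; that policy is left out here.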

In addition, we implemented a deadline-aware router that puts high-priority queries at the front
of the queue so they are delivered first.
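
One way to picture deadline-aware queueing is a priority queue keyed on the deadline, sketched below with the standard-library `heapq`. This is an assumption about the general technique, not the code in this PR; `DeadlineQueue`, `push`, and `pop` are hypothetical names.

```python
import heapq
import itertools


class DeadlineQueue:
    """Hypothetical priority queue: the earliest deadline is served first."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so queries with equal deadlines keep arrival order.
        self._arrival = itertools.count()

    def push(self, deadline, query):
        """Queue a query together with its deadline."""
        heapq.heappush(self._heap, (deadline, next(self._arrival), query))

    def pop(self):
        """Return the queued query with the soonest deadline."""
        _deadline, _order, query = heapq.heappop(self._heap)
        return query


dq = DeadlineQueue()
dq.push(10.0, "relaxed")
dq.push(1.0, "urgent")
dq.push(1.0, "urgent-later")

served = [dq.pop() for _ in range(3)]
```

The arrival counter keeps the heap from ever comparing the query payloads themselves, which matters when payloads are not orderable.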

Managed Actor Tier

Managed actors are managed by routers. These actors can contain arbitrary methods. Methods in the actor class are assumed to accept a batch of inputs at a time. If a method cannot handle a batch, you can use the @single_input decorator, and ray.serve will run your method in a for loop over the micro-batch.
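
To make the batch-versus-single-input distinction concrete, here is a minimal sketch. The `single_input` defined below is a hypothetical stand-in written for this example and may differ from the decorator in this PR; `Doubler` and its methods are likewise illustrative.

```python
import functools


def single_input(method):
    """Hypothetical stand-in: adapt a one-input method to accept a batch."""

    @functools.wraps(method)
    def batched(self, batch):
        # Call the wrapped method once per element of the micro-batch.
        return [method(self, item) for item in batch]

    return batched


class Doubler:
    def double_batch(self, batch):
        # Batch-aware method: receives the whole micro-batch at once.
        return [2 * x for x in batch]

    @single_input
    def double_one(self, x):
        # Written for a single input; the decorator supplies the loop.
        return 2 * x


actor = Doubler()
batch_result = actor.double_batch([1, 2, 3])
single_result = actor.double_one([1, 2, 3])
```

Both calls take the same micro-batch, so a caller does not need to know which style a method was written in.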

cc @atumanov @robertnishihara

P.S.
This is a re-implementation of an earlier prototype

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12107/

@robertnishihara left a comment
Collaborator

Awesome stuff! Left some comments/questions.

@@ -0,0 +1,63 @@
# Ray Serve Module

Use rst instead of markdown

@@ -0,0 +1,6 @@
format:

this file should be removed

@@ -0,0 +1,3 @@
.vscode
__pycache__
.benchmarks

Put these in the top level gitignore

@@ -0,0 +1,12 @@
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

at the top of all Python files


@pytest.fixture(scope="module")
def router():
    ray.init()

we need ray.shutdown() to run in the teardown


@ray.remote
class DeadlineAwareRouter:
    def __init__(self, router_name):

class doc string and method doc strings also

return func


class RayServeMixin:

class doc string

also, can you explain what the name means?



@ray.remote
class HTTPFrontendActor:

class doc string

import ray


def start_router(router_class, router_name):

doc string



@total_ordering
class SingleQuery:

class doc string

@jovany-wang
Contributor

jovany-wang commented Feb 20, 2019

My concern is that the Python HTTP server implemented with Starlette may not meet the performance requirements of such a real-time scenario.

@simon-mo
Contributor Author

simon-mo commented Feb 20, 2019 via email

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12160/

@jovany-wang
Contributor

@simon-mo Thanks for your reply.
I'd like to test the performance of uvicorn and of some C++ HTTP frameworks.

@simon-mo simon-mo changed the title from "[WIP] ray.serve Module" to "ray.experimental.serve Module" Mar 6, 2019
@simon-mo simon-mo marked this pull request as ready for review March 6, 2019 20:54
@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12623/

@@ -0,0 +1,63 @@
# Ray Serve Module

This file is redundant.

@robertnishihara robertnishihara self-assigned this Mar 6, 2019
@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12622/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12628/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12629/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12630/

@robertnishihara
Collaborator

When I run python -m pytest -v -s python/ray/experimental/serve/tests/, the test frontend/test_default_app.py::test_http_basic seems to hang. It succeeds when I run the test individually. Any idea about this?

@robertnishihara
Collaborator

I also see this failure

python/ray/experimental/serve/tests/router/test_deadline_router.py::test_deadline_priority FAILED

================================================== FAILURES ==================================================
___________________________________________ test_deadline_priority ___________________________________________

router = Actor(DeadlineAwareRouter, cc9b8191335004e9e1ef56e110e4b5fbaf9f3dc9), now = 405663.805075376

    def test_deadline_priority(router: DeadlineAwareRouter, now: float):
        # first sleep 2 seconds
        first = unwrap(router.call.remote("SleepCounter", 2, now + 1))
    
        # then send a request to with deadline farther away
        second = unwrap(router.call.remote("SleepCounter", 0, now + 10))
    
        # and a request with sooner deadline
        third = unwrap(router.call.remote("SleepCounter", 0, now + 1))
    
        id_1, id_2, id_3 = ray.get([first, second, third])
    
>       assert id_1 < id_3 < id_2
E       assert 3 < 2

python/ray/experimental/serve/tests/router/test_deadline_router.py:88: AssertionError
===================================== 1 failed, 4 passed in 7.15 seconds =====================================

@simon-mo
Contributor Author

simon-mo commented Mar 8, 2019

@robertnishihara can we just call the tests separately? I have been debugging this for some time now and still can't fix it. The only clue I have is this warning:

2019-03-08 00:49:58,104	WARNING actor.py:652 -- Actor is garbage collected in the wrong driver. Actor id = ActorID(2cd834f87ff272c3a9a34ff74fe83d33ef001877), class name = ScalerAdder.

The ordering issue was due to oversubscription and should be fixed now.

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12695/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12698/

@robertnishihara left a comment
Collaborator

Thanks @simon-mo, this looks great! I'll merge this once the CI passes.

As discussed offline, we should treat uvicorn and starlette as placeholders for the time being and not make any architectural decisions that tie us to them.

@robertnishihara
Collaborator

We should probably also run these tests in our CI, but that will require upgrading to Python 3.6, since starlette requires 3.6. That can be done in a follow-up PR.

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/12708/

@robertnishihara robertnishihara merged commit 3064fad into ray-project:master Mar 9, 2019
@pcmoritz
Contributor

pcmoritz commented Mar 9, 2019

Can we rename ray.serve to ray.serving in a future PR? That seems more natural, what do you think?
