
Add gRPC support to RealtimeAPI kind #1056

Closed
@RobertLucian

Description

https://grpc.io/docs/guides/concepts/
https://blog.feathersjs.com/http-vs-websockets-a-performance-comparison-da2533f13a77

  1. For best performance, the protocol would be customizable in the API spec (REST vs gRPC). A field called protocol would be added to the predictor section.
  2. The user is responsible for providing a protobuf file in the API spec as well. Cortex would automatically generate the server files from the protobuf.
  3. REST and gRPC cannot both be served for the same API.
  4. Istio doesn't have to be changed - it has native support for HTTP/2. We probably need to change the service mode of APIs to headless mode.
  5. The protobuf would only support a single method/service - predict. Its input/output is defined by the user. We don't provide a default protobuf.
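A minimal sketch of such a user-provided protobuf (the service name, message names, and streaming signature here are hypothetical, modeled on the predictor example below):

```proto
syntax = "proto3";

// Single service exposing the single predict method; the request/response
// messages are entirely user-defined.
service User {
  rpc predict (stream Sample) returns (stream Sample);
}

message Sample {
  string location = 1;
  string message = 2;
}
```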

Example of a Python predictor for gRPC:

# PythonPredictor provides an implementation of the methods of the User service.
class PythonPredictor(user_pb2_grpc.UserServicer):
  def start(self, config):
    self.config = config
  def predict(self, request_iterator, context):
    prev_notes = []
    for new_note in request_iterator:
      for prev_note in prev_notes:
        if prev_note.location == new_note.location:
          yield prev_note
      prev_notes.append(new_note)

We don't necessarily need to ask the user to subclass the generated server files (nor is it always possible) - we can wrap this class instead. That way, the class could look like this:

class PythonPredictor:
  def __init__(self, config):
    self.config = config
  def predict(self, request_iterator, context):
    prev_notes = []
    for new_note in request_iterator:
      for prev_note in prev_notes:
        if prev_note.location == new_note.location:
          yield prev_note
      prev_notes.append(new_note)
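The wrapping could be sketched like this; `make_servicer` and its arguments are hypothetical names, and in practice `servicer_base` would be the generated `user_pb2_grpc.UserServicer` class:

```python
# Hypothetical sketch: build a servicer subclass that delegates predict()
# to the user's plain PythonPredictor, so the user never has to touch the
# generated server files.
def make_servicer(servicer_base, predictor):
  class WrappedServicer(servicer_base):
    def predict(self, request_iterator, context):
      # Delegate the streaming call straight to the user's predictor.
      return predictor.predict(request_iterator, context)

  return WrappedServicer()
```

Cortex would instantiate the user's class itself and register the wrapped servicer with the gRPC server, keeping the generated code an implementation detail.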

Creating a gRPC server is straightforward. We'll need multiple servers, as dictated by the processes_per_replica field.

from concurrent import futures

import grpc

def serve():
  server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
  user_pb2_grpc.add_UserServicer_to_server(PythonPredictor(), server)
  server.add_insecure_port('[::]:50051')
  server.start()
  server.wait_for_termination()
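The fan-out to processes_per_replica workers could be sketched as follows; `serve_all` is a hypothetical helper, and the target would be the serve function above (gRPC enables SO_REUSEPORT by default on Linux, which would let all the processes bind the same port):

```python
import multiprocessing

# Hypothetical sketch: spawn one gRPC server process per worker, with the
# worker count taken from the processes_per_replica field. Each process
# would run the serve() function above.
def serve_all(target, processes_per_replica):
  workers = [
    multiprocessing.Process(target=target, daemon=True)
    for _ in range(processes_per_replica)
  ]
  for worker in workers:
    worker.start()
  return workers
```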

Example API spec:

- name: iris-classifier
  kind: RealtimeAPI
  predictor:
    type: python
    path: predictor.py
    protocol: grpc # or rest
    protobuf: predictor.proto
  compute:
    cpu: 0.2
    mem: 200M

To parse proto files in Go: https://github.com/tallstoat/pbparser

Motivation

gRPC has lower latency and higher throughput than REST, and it's already well known within the community.

Additional context

Maybe use something like Linkerd for load-balancing gRPC requests (though this may not be needed).

Labels

enhancement (New feature or request)
