Add GitLab CI to stress test upper bound of the endpoint measurement #138

@nvzhihanj

Description

Develop a proper end-to-end (e2e) CI test to verify that:

  1. the inference endpoint measurement matches the server capability (the server is not bottlenecked by the client)
  2. the inference endpoint client is not bottlenecked by any of its own components
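
One way the first check could be expressed in a CI assertion is sketched below. This is only an illustration, not the project's actual test: the names `StressResult` and `client_is_bottleneck`, the 5% tolerance, and the idea of comparing against a server with a known capacity (for example a mock server with a fixed service rate) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class StressResult:
    """Hypothetical container for one stress-test run."""
    server_capacity_qps: float  # known upper bound of the (assumed) mock server
    measured_qps: float         # throughput reported by the endpoint client


def client_is_bottleneck(result: StressResult, tolerance: float = 0.05) -> bool:
    """Return True if the measured throughput falls short of the known
    server capacity by more than `tolerance` (as a fraction of capacity)."""
    if result.server_capacity_qps <= 0:
        raise ValueError("server_capacity_qps must be positive")
    shortfall = 1.0 - result.measured_qps / result.server_capacity_qps
    return shortfall > tolerance


if __name__ == "__main__":
    # Illustrative numbers only, not real measurements.
    run = StressResult(server_capacity_qps=1000.0, measured_qps=900.0)
    if client_is_bottleneck(run):
        raise SystemExit("client cannot saturate the server; failing CI job")
```

A CI job could run the stress test against a server whose capacity is known in advance and fail when this check returns True. The second check (no single client component is the bottleneck) would need per-component timing, which this sketch does not cover.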

Metadata

Labels

area: core-engine (Load generator, scheduler, async utils)
priority: P0 (Critical - blocks release or users)
type: chore (Maintenance, deps, CI, tooling)