Develop a proper end-to-end (e2e) CI test to make sure:
- the inference endpoint measurement matches the server capability (the server is not bottlenecked by the client)
- the inference endpoint client is not bottlenecked by any of its components
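
One way to frame the first check, as a minimal sketch rather than the intended implementation: run the client against a stand-in server whose capacity is known by construction, then assert that the measured throughput lands within a tolerance of that capacity. Everything named here (`FakeEndpoint`, `measure_throughput`, `run_check`, the 30% tolerance, the concurrency and latency values) is a hypothetical assumption for illustration; a real CI test would target the actual endpoint and client.

```python
import asyncio
import time


class FakeEndpoint:
    """Hypothetical stand-in for an inference server with a known capacity."""

    def __init__(self, max_concurrency: int, service_time_s: float):
        self.max_concurrency = max_concurrency
        self.service_time_s = service_time_s
        # At most `max_concurrency` requests are served at the same time.
        self._slots = asyncio.Semaphore(max_concurrency)

    @property
    def capacity_rps(self) -> float:
        # Known-by-construction capacity in requests per second.
        return self.max_concurrency / self.service_time_s

    async def infer(self, prompt: str) -> str:
        async with self._slots:
            await asyncio.sleep(self.service_time_s)  # simulated inference
            return "ok"


async def measure_throughput(endpoint, total_requests, client_concurrency):
    """Send requests with bounded client concurrency; return measured req/s."""
    gate = asyncio.Semaphore(client_concurrency)

    async def one(i: int) -> None:
        async with gate:
            await endpoint.infer(f"prompt-{i}")

    start = time.perf_counter()
    await asyncio.gather(*(one(i) for i in range(total_requests)))
    return total_requests / (time.perf_counter() - start)


async def run_check(client_concurrency, total_requests=40, tolerance=0.3):
    """Compare the client's measurement against the server's known capacity."""
    # Created inside the running loop so the semaphore binds to it.
    endpoint = FakeEndpoint(max_concurrency=8, service_time_s=0.05)
    measured = await measure_throughput(
        endpoint, total_requests, client_concurrency
    )
    capacity = endpoint.capacity_rps
    relative_error = abs(measured - capacity) / capacity
    return {
        "measured_rps": measured,
        "capacity_rps": capacity,
        "within_tolerance": relative_error <= tolerance,
    }


if __name__ == "__main__":
    # A client with enough concurrency should saturate the server.
    print(asyncio.run(run_check(client_concurrency=32)))
    # A client limited to 2 in-flight requests should fail the check.
    print(asyncio.run(run_check(client_concurrency=2)))
```

Running the same check with a deliberately starved client (here, two in-flight requests against eight server slots) is a useful negative control: if that case passes, the test cannot detect a client-side bottleneck.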