Rework Backend to Native HTTP Requests and Enhance API Compatibility & Performance #91
…hey become serializable
…hey become serializable
…http/2 support. Additionally, add in support for multi modal requests (needs further enablement in the rest of the system for future TODO). Still needs testing and test fixes
24f753a to 9e4bc84
Thanks @sjmonson! I think that's likely. Also, provided we're not seeing a severe regression, I would say let's focus on optimizing the perf with #96
Prior to the `openai_server` -> `openai_http` refactor (#91), we were using the `extra_query` parameter [in the OpenAI client](https://github.com/openai/openai-python/blob/fad098ffad7982a5150306a3d17f51ffef574f2e/src/openai/resources/models.py#L50) to send custom query parameters to the OpenAI server in requests made by guidellm. This PR adds that parameter to the new `OpenAIHTTPBackend`, making it possible to add custom query parameters that are included in every request sent to the server.
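The behaviour described above can be sketched as follows. This is a minimal illustration, not the actual `OpenAIHTTPBackend` code: a backend object stores an `extra_query` mapping once and merges it into the URL of every request it builds. The class name `QueryParamBackend`, the `build_url` helper, the server address, and the parameter values are all hypothetical; only the `extra_query` name comes from the description above.

```python
from typing import Dict, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class QueryParamBackend:
    """Hypothetical stand-in for an HTTP backend that carries extra query params."""

    def __init__(self, target: str, extra_query: Optional[Dict[str, str]] = None):
        self.target = target.rstrip("/")
        # Copy so later changes to the caller's dict do not leak into requests.
        self.extra_query = dict(extra_query or {})

    def build_url(self, path: str, params: Optional[Dict[str, str]] = None) -> str:
        """Return the full URL for `path` with extra_query merged into the query string."""
        parts = urlsplit(f"{self.target}/{path.lstrip('/')}")
        # Start with any query already on the URL, then add the backend-wide
        # params, then per-request params (which win on key conflicts).
        query = dict(parse_qsl(parts.query))
        query.update(self.extra_query)
        query.update(params or {})
        return urlunsplit(parts._replace(query=urlencode(query)))


backend = QueryParamBackend(
    "http://localhost:8000",           # assumed local server address
    extra_query={"api-version": "1"},  # assumed example parameter
)
models_url = backend.build_url("/v1/models")
completions_url = backend.build_url("/v1/chat/completions", {"trace": "on"})
print(models_url)
print(completions_url)
```

Letting per-request parameters override the backend-wide ones is a design choice made for this sketch; the PR may resolve conflicts differently. In an `httpx`-based backend, a similar effect should be achievable by passing the mapping as `params` when constructing the client, though whether the PR does it that way is not stated here.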
Summary
This PR restructures the backend of `guidellm` to utilize native HTTP requests with `httpx` and HTTP/2 standards, significantly improving performance and scalability. It aligns the backend interface closely with the OpenAI API specifications, enabling smoother integration paths for future expansions, including multi-backend support.
Details
- … `httpx`, enabling HTTP/2 support for optimized performance.
- … `base.py`, `load_generator.py`.
- … `backend.py`, `scheduler/backend_worker.py`, and structured response handling in `response.py`.
- … (`config.py`) to better support environment-based settings.
Testing
Related Issues