chore(http-client): cleanup types, improve coverage, remove orjson by viraatc · Pull Request #159 · mlcommons/endpoints

viraatc · 2026-03-09T21:38:10Z

What does this PR do?

Continues #121.

github-actions · 2026-03-09T21:38:19Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist · 2026-03-09T21:38:39Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the robustness, performance, and maintainability of the HTTP client. It primarily achieves this by migrating from orjson to msgspec.json for JSON operations, which streamlines the serialization process. Additionally, the core HTTP client has been refactored to provide a more intuitive synchronous interface with improved connection management and comprehensive test coverage for its response retrieval and shutdown mechanisms. These changes contribute to a more stable and efficient client architecture.

Highlights

Dependency Migration: Replaced the orjson library with msgspec.json across the codebase for JSON serialization and deserialization, aiming for improved performance and consistency.
HTTP Client Refactoring: The AsyncHttpEndpointClient was renamed to HTTPEndpointClient and its interface was made synchronous. New methods (poll, drain) were introduced for non-blocking response retrieval, and the shutdown method was made synchronous and idempotent.
Type System and Assertions Cleanup: Improved type hints for configuration parameters and worker components, removing Optional types where defaults are guaranteed. Redundant assert statements were removed, simplifying the code and relying on the updated type guarantees.
Connection Management Robustness: The connection pool's release method was made idempotent, preventing issues if a connection is released multiple times. Logic around connection handling in workers was also streamlined.
Enhanced Test Coverage: Added new integration tests for the HTTPEndpointClient to cover poll, recv, drain methods, and shutdown behavior, significantly improving the client's test coverage.
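
The poll, recv, drain, and idempotent shutdown behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `SyncResponseClient`, `_deliver`, and `cleanup_runs` are hypothetical names, responses are plain strings, and a standard-library queue stands in for whatever transport the real client uses.

```python
import queue
import threading
from typing import Optional


class SyncResponseClient:
    """Hypothetical stand-in for a client exposing poll/recv/drain/shutdown."""

    def __init__(self) -> None:
        self._responses = queue.Queue()
        self._lock = threading.Lock()
        self._closed = False
        self.cleanup_runs = 0  # counts how often cleanup really ran

    def _deliver(self, response: str) -> None:
        # A worker would call this in a real client; here callers do it directly.
        self._responses.put(response)

    def poll(self) -> Optional[str]:
        """Return one response if one is ready, else None, without blocking."""
        try:
            return self._responses.get_nowait()
        except queue.Empty:
            return None

    def recv(self, timeout: Optional[float] = None) -> str:
        """Block until a response arrives; raises queue.Empty on timeout."""
        return self._responses.get(timeout=timeout)

    def drain(self) -> list:
        """Return every response that is ready right now, possibly none."""
        ready = []
        while (item := self.poll()) is not None:
            ready.append(item)
        return ready

    def shutdown(self) -> None:
        """Idempotent: only the first call performs cleanup."""
        with self._lock:
            if self._closed:
                return
            self._closed = True
        self.cleanup_runs += 1  # real resource cleanup would go here
```

In this sketch `poll()` and `drain()` never block, while `recv()` does; calling `shutdown()` a second time is a no-op.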

Changelog

pyproject.toml
- Removed 'orjson' from project dependencies.
src/inference_endpoint/endpoint_client/__init__.py
- Removed 'AsyncHttpEndpointClient' from exports.
- Updated module docstring to reflect the primary HTTP client implementation.
src/inference_endpoint/endpoint_client/config.py
- Updated type hints for 'adapter', 'accumulator', and 'worker_pool_transport' to reflect that they are always initialized in '__post_init__'.
src/inference_endpoint/endpoint_client/http.py
- Made the 'release' method in 'ConnectionPool' idempotent.
- Updated the type hint for 'InFlightRequest.connection' to remove 'Optional' and added a type ignore.
src/inference_endpoint/endpoint_client/http_client.py
- Renamed 'AsyncHttpEndpointClient' to 'HTTPEndpointClient'.
- Introduced synchronous 'poll()' and 'drain()' methods for non-blocking response retrieval.
- Refactored 'shutdown()' to be synchronous and call an internal async shutdown method.
- Removed redundant 'assert' statements for configuration attributes.
- Removed the separate 'HTTPEndpointClient' class that provided a sync wrapper, as the main client is now synchronous.
src/inference_endpoint/endpoint_client/http_sample_issuer.py
- Removed redundant 'assert' statement for 'self.http_client.loop'.
src/inference_endpoint/endpoint_client/worker.py
- Updated type hints for internal HTTP and IPC components to remove 'Optional' and added type ignores.
- Removed redundant 'assert' statements for various initialized components.
- Added a more specific assertion message for 'event_logs_dir' when event recording is enabled.
- Streamlined request preparation and firing logic.
- Simplified connection release logic in response handling methods by removing explicit 'req.connection = None'.
src/inference_endpoint/evaluation/scoring.py
- Replaced 'orjson' imports with 'msgspec.json'.
- Updated JSON loading and decoding calls to use 'msgspec.json'.
- Modified exception handling to catch 'msgspec.DecodeError'.
src/inference_endpoint/load_generator/session.py
- Replaced 'orjson' imports with 'msgspec.json'.
- Updated JSON encoding and formatting for runtime settings and sample index map files.
src/inference_endpoint/metrics/recorder.py
- Replaced 'orjson' imports with 'msgspec.json'.
- Updated JSON encoding calls to use 'msgspec.json'.
- Modified exception handling to catch 'msgspec.EncodeError'.
src/inference_endpoint/metrics/reporter.py
- Replaced 'orjson' imports with 'msgspec.json'.
- Updated comments regarding numpy dtypes and 'msgspec'.
- Updated JSON encoding and formatting calls to use 'msgspec.json'.
- Modified exception handling to catch 'msgspec.DecodeError'.
src/inference_endpoint/openai/openai_adapter.py
- Removed 'orjson' import.
- Updated JSON encoding and decoding methods to use 'msgspec.json'.
src/inference_endpoint/utils/benchmark_httpclient.py
- Replaced 'AsyncHttpEndpointClient' with 'HTTPEndpointClient' in client creation.
- Simplified the client shutdown call.
tests/conftest.py
- Replaced 'orjson' imports with 'msgspec.json'.
- Updated JSON encoding calls to use 'msgspec.json'.
tests/integration/endpoint_client/test_http_client.py
- Added helper functions '_create_client' and '_make_query'.
- Introduced new test classes 'TestPoll', 'TestRecv', 'TestDrain', and 'TestShutdown' to enhance coverage of client response retrieval and shutdown mechanisms.
tests/unit/metrics/test_recorder.py
- Replaced 'orjson' imports with 'msgspec.json'.
- Updated JSON encoding and decoding calls to use 'msgspec.json'.
tests/unit/metrics/test_reporter.py
- Replaced 'orjson' imports with 'msgspec.json'.
- Updated JSON encoding calls to use 'msgspec.json'.
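
The changelog entry for 'ConnectionPool.release' says the method became idempotent. A minimal sketch of that idea, assuming a pool that tracks which connections are checked out, is shown below. `ConnectionPoolSketch` and its attributes are hypothetical and do not reproduce the code in http.py; connections are plain strings for illustration.

```python
class ConnectionPoolSketch:
    """Hypothetical pool used only to illustrate an idempotent release()."""

    def __init__(self, connections):
        self._idle = list(connections)
        self._in_use = set()

    @property
    def idle_count(self) -> int:
        return len(self._idle)

    def acquire(self):
        # Raises IndexError when no idle connection is left.
        conn = self._idle.pop()
        self._in_use.add(conn)
        return conn

    def release(self, conn) -> None:
        # Idempotent: a connection that is not checked out is ignored,
        # so a second release() cannot add it to the idle list twice.
        if conn not in self._in_use:
            return
        self._in_use.remove(conn)
        self._idle.append(conn)
```

With a guard like this, callers no longer need to clear their own reference (for example by setting 'req.connection = None') just to protect against a double release.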

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

viraatc added 3 commits March 9, 2026 14:26

- drop orjson (cf3d101)
- update (e325f21)
- trigger CI (1afaa70)

viraatc requested a review from a team as a code owner March 9, 2026 21:38

Copilot AI review requested due to automatic review settings March 9, 2026 21:38

github-actions bot requested review from arekay-nv and nvzhihanj March 9, 2026 21:38

viraatc mentioned this pull request Mar 9, 2026

chore(http-client): cleanup types, improve coverage, remove orjson #121

Closed

10 tasks

Copilot started reviewing on behalf of viraatc March 9, 2026 21:38

arekay-nv approved these changes Mar 9, 2026

nv-alicheng approved these changes Mar 9, 2026

viraatc merged commit 0ceafb4 into main Mar 9, 2026
11 of 14 checks passed

github-actions bot locked and limited conversation to collaborators Mar 9, 2026

viraatc deleted the feat/viraatc-cleanup-asserts branch March 9, 2026 22:55

viraatc review requested due to automatic review settings March 23, 2026 22:53

viraatc merged 3 commits into main from feat/viraatc-cleanup-asserts

3 participants
