chore(http-client): cleanup types, improve coverage, remove orjson #159

Merged
viraatc merged 3 commits into main from feat/viraatc-cleanup-asserts on Mar 9, 2026

@viraatc (Collaborator) commented Mar 9, 2026

What does this PR do?

Continues #121.

@viraatc viraatc requested a review from a team as a code owner March 9, 2026 21:38
Copilot AI review requested due to automatic review settings March 9, 2026 21:38

github-actions bot commented Mar 9, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the robustness, performance, and maintainability of the HTTP client. It primarily achieves this by migrating from orjson to msgspec.json for JSON operations, which streamlines the serialization process. Additionally, the core HTTP client has been refactored to provide a more intuitive synchronous interface with improved connection management and comprehensive test coverage for its response retrieval and shutdown mechanisms. These changes contribute to a more stable and efficient client architecture.

Highlights

  • Dependency Migration: Replaced the orjson library with msgspec.json across the codebase for JSON serialization and deserialization, aiming for improved performance and consistency.
  • HTTP Client Refactoring: The AsyncHttpEndpointClient was renamed to HTTPEndpointClient and its interface was made synchronous. New methods (poll, drain) were introduced for non-blocking response retrieval, and the shutdown method was made synchronous and idempotent.
  • Type System and Assertions Cleanup: Improved type hints for configuration parameters and worker components, removing Optional types where defaults are guaranteed. Redundant assert statements were removed, simplifying the code and relying on the updated type guarantees.
  • Connection Management Robustness: The connection pool's release method was made idempotent, preventing issues if a connection is released multiple times. Logic around connection handling in workers was also streamlined.
  • Enhanced Test Coverage: Added new integration tests for the HTTPEndpointClient to cover poll, recv, drain methods, and shutdown behavior, significantly improving the client's test coverage.
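The poll, recv, drain, and idempotent shutdown behavior described in the highlights above can be sketched with the standard library alone. This is not the PR's `HTTPEndpointClient`; the `SketchClient` class, its `_deliver` hook, and the `teardown_count` attribute are assumptions made purely for illustration.

```python
import queue
import threading
from typing import Any, List, Optional


class SketchClient:
    """Hypothetical stand-in for a synchronous client facade."""

    def __init__(self) -> None:
        self._responses: "queue.Queue[Any]" = queue.Queue()
        self._lock = threading.Lock()
        self._closed = False
        self.teardown_count = 0  # counts how often real teardown ran

    def _deliver(self, response: Any) -> None:
        """Called by a worker (assumed) when a response arrives."""
        self._responses.put(response)

    def poll(self) -> Optional[Any]:
        """Non-blocking: return one response, or None if none is ready."""
        try:
            return self._responses.get_nowait()
        except queue.Empty:
            return None

    def recv(self, timeout: Optional[float] = None) -> Any:
        """Blocking: wait for one response; raises queue.Empty on timeout."""
        return self._responses.get(timeout=timeout)

    def drain(self) -> List[Any]:
        """Non-blocking: return every response that is ready right now."""
        items: List[Any] = []
        while True:
            try:
                items.append(self._responses.get_nowait())
            except queue.Empty:
                return items

    def shutdown(self) -> None:
        """Idempotent: only the first call performs teardown."""
        with self._lock:
            if self._closed:
                return
            self._closed = True
            self.teardown_count += 1
```

Under these assumptions, a caller can mix `poll()` in a hot loop with a final `drain()` before `shutdown()`, and calling `shutdown()` again from cleanup code is harmless.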

Changelog
  • pyproject.toml
    • Removed 'orjson' from project dependencies.
  • src/inference_endpoint/endpoint_client/__init__.py
    • Removed 'AsyncHttpEndpointClient' from exports.
    • Updated module docstring to reflect the primary HTTP client implementation.
  • src/inference_endpoint/endpoint_client/config.py
    • Updated type hints for 'adapter', 'accumulator', and 'worker_pool_transport' to reflect that they are always initialized in '__post_init__'.
  • src/inference_endpoint/endpoint_client/http.py
    • Made the 'release' method in 'ConnectionPool' idempotent.
    • Updated the type hint for 'InFlightRequest.connection' to remove 'Optional' and added a type ignore.
  • src/inference_endpoint/endpoint_client/http_client.py
    • Renamed 'AsyncHttpEndpointClient' to 'HTTPEndpointClient'.
    • Introduced synchronous 'poll()' and 'drain()' methods for non-blocking response retrieval.
    • Refactored 'shutdown()' to be synchronous and call an internal async shutdown method.
    • Removed redundant 'assert' statements for configuration attributes.
    • Removed the separate 'HTTPEndpointClient' class that provided a sync wrapper, as the main client is now synchronous.
  • src/inference_endpoint/endpoint_client/http_sample_issuer.py
    • Removed redundant 'assert' statement for 'self.http_client.loop'.
  • src/inference_endpoint/endpoint_client/worker.py
    • Updated type hints for internal HTTP and IPC components to remove 'Optional' and added type ignores.
    • Removed redundant 'assert' statements for various initialized components.
    • Added a more specific assertion message for 'event_logs_dir' when event recording is enabled.
    • Streamlined request preparation and firing logic.
    • Simplified connection release logic in response handling methods by removing explicit 'req.connection = None'.
  • src/inference_endpoint/evaluation/scoring.py
    • Replaced 'orjson' imports with 'msgspec.json'.
    • Updated JSON loading and decoding calls to use 'msgspec.json'.
    • Modified exception handling to catch 'msgspec.DecodeError'.
  • src/inference_endpoint/load_generator/session.py
    • Replaced 'orjson' imports with 'msgspec.json'.
    • Updated JSON encoding and formatting for runtime settings and sample index map files.
  • src/inference_endpoint/metrics/recorder.py
    • Replaced 'orjson' imports with 'msgspec.json'.
    • Updated JSON encoding calls to use 'msgspec.json'.
    • Modified exception handling to catch 'msgspec.EncodeError'.
  • src/inference_endpoint/metrics/reporter.py
    • Replaced 'orjson' imports with 'msgspec.json'.
    • Updated comments regarding numpy dtypes and 'msgspec'.
    • Updated JSON encoding and formatting calls to use 'msgspec.json'.
    • Modified exception handling to catch 'msgspec.DecodeError'.
  • src/inference_endpoint/openai/openai_adapter.py
    • Removed 'orjson' import.
    • Updated JSON encoding and decoding methods to use 'msgspec.json'.
  • src/inference_endpoint/utils/benchmark_httpclient.py
    • Replaced 'AsyncHttpEndpointClient' with 'HTTPEndpointClient' in client creation.
    • Simplified the client shutdown call.
  • tests/conftest.py
    • Replaced 'orjson' imports with 'msgspec.json'.
    • Updated JSON encoding calls to use 'msgspec.json'.
  • tests/integration/endpoint_client/test_http_client.py
    • Added helper functions '_create_client' and '_make_query'.
    • Introduced new test classes 'TestPoll', 'TestRecv', 'TestDrain', and 'TestShutdown' to enhance coverage of client response retrieval and shutdown mechanisms.
  • tests/unit/metrics/test_recorder.py
    • Replaced 'orjson' imports with 'msgspec.json'.
    • Updated JSON encoding and decoding calls to use 'msgspec.json'.
  • tests/unit/metrics/test_reporter.py
    • Replaced 'orjson' imports with 'msgspec.json'.
    • Updated JSON encoding calls to use 'msgspec.json'.
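
The changelog entry about making the 'release' method in 'ConnectionPool' idempotent can be pictured with the following minimal sketch. It is not the code from http.py; `SketchPool`, `SketchConnection`, and the `in_use` flag are hypothetical names, and the real pool may track state differently.

```python
from collections import deque
from typing import Deque


class SketchConnection:
    """Hypothetical connection handle with an in-use flag."""

    def __init__(self, conn_id: int) -> None:
        self.conn_id = conn_id
        self.in_use = False


class SketchPool:
    """Hypothetical pool whose release() is safe to call more than once."""

    def __init__(self, size: int) -> None:
        self._idle: Deque[SketchConnection] = deque(
            SketchConnection(i) for i in range(size)
        )

    @property
    def idle_count(self) -> int:
        return len(self._idle)

    def acquire(self) -> SketchConnection:
        # Raises IndexError if the pool is exhausted (kept simple on purpose).
        conn = self._idle.popleft()
        conn.in_use = True
        return conn

    def release(self, conn: SketchConnection) -> None:
        # A second release of the same connection is a no-op, so the
        # connection can never be queued as idle twice.
        if not conn.in_use:
            return
        conn.in_use = False
        self._idle.append(conn)
```

With a guard like this, response handlers do not need to clear their own reference to the connection just to avoid a double release, which is consistent with the changelog note about dropping the explicit 'req.connection = None'.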

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@viraatc viraatc merged commit 0ceafb4 into main Mar 9, 2026
11 of 14 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 9, 2026
@viraatc viraatc deleted the feat/viraatc-cleanup-asserts branch March 9, 2026 22:55
@viraatc viraatc review requested due to automatic review settings March 23, 2026 22:53

3 participants