Summary of Changes
Hello @viraatc, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a robust set of performance utilities designed to thoroughly benchmark the HTTP client. It provides a dedicated script for running single performance tests or complex parameter sweeps, complete with live statistics, memory tracking, and CPU affinity controls. A new mock server is also included, allowing for isolated and high-throughput client-side performance measurements, with results automatically visualized through generated plots.
Pull request overview
Adds a high-throughput stub server and a benchmarking script to measure HTTP client send/recv throughput (including sweep + plotting), plus a dependency update to support plotting.
Changes:
- Introduces `MaxThroughputServer`: a minimal OpenAI-compatible server returning pre-built responses for roofline-style client benchmarking.
- Adds `scripts/benchmark_httpclient.py` with single-run + sweep modes, live stats, optional memory tracking, and plot generation.
- Adds `matplotlib` to dependencies to support sweep plotting.
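To illustrate the "pre-built responses" idea in the first bullet, here is a minimal sketch of a stub server that serializes one HTTP response up front and writes the same bytes for every request. It is not the PR's `MaxThroughputServer`; the names `BODY`, `RESPONSE`, `handle`, and `serve`, the payload shape, and the default port are all assumptions made for illustration.

```python
import asyncio

# Hypothetical payload: the real server's response body is not shown in this thread.
BODY = b'{"id":"stub","object":"chat.completion","choices":[]}'

# Serialize the full HTTP response once, so the hot path is a single write.
RESPONSE = b"".join([
    b"HTTP/1.1 200 OK\r\n",
    b"Content-Type: application/json\r\n",
    b"Content-Length: " + str(len(BODY)).encode() + b"\r\n",
    b"Connection: keep-alive\r\n",
    b"\r\n",
    BODY,
])


async def handle(reader, writer):
    """Answer every request on a keep-alive connection with the same bytes."""
    try:
        while True:
            # Read the request line and headers up to the blank line.
            head = await reader.readuntil(b"\r\n\r\n")
            # Consume the request body, if any, so the next request parses cleanly.
            length = 0
            for line in head.split(b"\r\n")[1:]:
                name, _, value = line.partition(b":")
                if name.strip().lower() == b"content-length":
                    length = int(value.strip())
            if length:
                await reader.readexactly(length)
            writer.write(RESPONSE)
            await writer.drain()
    except (asyncio.IncompleteReadError, asyncio.LimitOverrunError, ConnectionResetError):
        pass  # client closed the connection or sent an oversized header block
    finally:
        writer.close()


async def serve(host="127.0.0.1", port=8000):
    """Run the stub until cancelled (host and port are illustrative defaults)."""
    server = await asyncio.start_server(handle, host, port)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(serve())` would start the stub. Because the server does no per-request serialization, measured throughput is dominated by the client, which is the point of a roofline-style benchmark.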
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| src/inference_endpoint/testing/max_throughput_server.py | New minimal high-throughput HTTP stub server for isolating client throughput. |
| scripts/benchmark_httpclient.py | New benchmark utility with sweep modes, live stats, restartable local server, and plotting. |
| pyproject.toml | Adds matplotlib dependency (currently under test extras) for plot output. |
Code Review
The pull request introduces a new performance testing utility for HTTP clients, along with a mock maximum throughput server. The utility supports single runs and parameter sweeps, including CPU affinity pinning and memory tracking. It also adds matplotlib as a dependency for plotting sweep results. The overall structure and functionality appear sound, providing a comprehensive tool for benchmarking. There are a couple of areas where maintainability and efficiency could be improved.
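As a rough illustration of the CPU affinity pinning and memory tracking mentioned above, the sketch below uses only the standard library. It assumes `os.sched_setaffinity` (Linux-only) for pinning and `tracemalloc` for peak memory; the PR may use different mechanisms, and the helper names `pin_to_cpus` and `run_with_memory_tracking` are hypothetical.

```python
import os
import tracemalloc


def pin_to_cpus(cpus):
    """Pin the current process to the given CPU ids.

    Returns False on platforms without os.sched_setaffinity (it is Linux-only).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, set(cpus))  # pid 0 means the calling process
    return True


def run_with_memory_tracking(fn, *args, **kwargs):
    """Call fn and return (result, peak_bytes) for Python-level allocations."""
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Note that `tracemalloc` only sees allocations made through Python's allocator and adds overhead of its own, so a benchmark would likely keep it optional, as the summary describes.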
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
88dc43d to d451922
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
d451922 to 723eea4
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
arekay-nv left a comment
Thanks for this - definitely useful. Would love to try it out.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/inference_endpoint/testing/max_throughput_server.py:1
- The `_restart_server` function accesses private attributes of the `MaxThroughputServer` class, creating tight coupling. Consider adding a public `restart` or `reconfigure` method to the server class instead.
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
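The suppressed suggestion above could look roughly like the following. This is a sketch with a hypothetical `StubServer` class standing in for the real server, whose internals are not shown in this thread; the point is only that callers use a public `restart` method instead of reaching into private attributes.

```python
import socket


class StubServer:
    """Hypothetical stand-in for the server class; not the PR's implementation."""

    def __init__(self, host="127.0.0.1", port=0):
        self._host = host
        self._port = port  # 0 asks the OS for a free port
        self._sock = None

    @property
    def port(self):
        """Port actually bound, or the configured port if not started."""
        return self._sock.getsockname()[1] if self._sock else self._port

    def start(self):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self._sock.bind((self._host, self._port))
        self._sock.listen()

    def stop(self):
        if self._sock is not None:
            self._sock.close()
            self._sock = None

    def restart(self, *, port=None):
        """Public entry point: stop, optionally reconfigure, then start again."""
        self.stop()
        if port is not None:
            self._port = port
        self.start()
```

A benchmark script would then call `server.restart()` between sweep points, and the server class stays free to change how it stores its socket or worker state.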
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
e77e711 to 8bbb5e6
8bbb5e6 to db397d0
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
db397d0 to 2cf3680
2cf3680 to e15536b
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
65efeb2 to 3465924
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
What does this PR do?
addresses: #9
example:
Type of change
Related issues
Testing
Checklist