
[Build] Add Benchmark dependencies to requirements/common.txt #19145

Closed · wants to merge 1 commit

Conversation

@simon-mo (Collaborator) commented Jun 4, 2025

Summary

  • Add pandas and datasets to the common requirements for benchmark support; otherwise a clean install will fail.

https://chatgpt.com/codex/tasks/task_e_68404dbc92f4832985a94abfa89e8fe9

@gemini-code-assist bot (Contributor) left a comment

Hello @simon-mo, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello! Gemini here, providing a summary of this pull request authored by simon-mo. The primary goal of this PR is to prepare the project for benchmarking by adding necessary dependencies. Specifically, it adds pandas and datasets to the common requirements file, requirements/common.txt. This change is intended to ensure that the environment has the required libraries when running benchmark-related code.

Highlights

  • Add Benchmark Dependencies: This pull request introduces pandas and datasets as dependencies by adding them to the requirements/common.txt file. These libraries are specified as being needed for the benchmarks module.

Changelog

  • requirements/common.txt
    • Added pandas to the list of common requirements (line 51).
    • Added datasets to the list of common requirements (line 52).
    • Both dependencies are noted as being required for the benchmarks module.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in issue comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Deps are added now,
For benchmarks, they are needed,
Code can run faster.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@mergify bot added the ci/build label Jun 4, 2025
@simon-mo changed the title from "Add benchmark deps" to "[Build] Add Benchmark dependencies to requirements/common.txt" Jun 4, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds pandas and datasets as dependencies to support the benchmark module, which is a useful addition. The changes are straightforward, but there are a couple of points regarding dependency management that could be addressed to improve stability and maintainability: consider version pinning for the new dependencies, and evaluate whether they should be optional dependencies rather than part of common.txt.

Summary of Findings

  • Dependency Version Pinning: The newly added dependencies pandas and datasets in requirements/common.txt are not version-pinned. This could lead to instability or build issues if future versions introduce breaking changes. It's recommended to specify versions (e.g., pandas==X.Y.Z or pandas>=X.Y.Z).
  • Placement of Benchmark-Specific Dependencies: pandas and datasets are noted as being for the 'benchmarks module'. Including them in common.txt makes them mandatory for all users. Consider making them optional dependencies (e.g., via extras_require in setup.py or a separate requirements-benchmark.txt) to keep the core installation lean.

Merge Readiness

The pull request aims to add necessary dependencies for benchmark support, which is valuable. However, there are a couple of concerns regarding dependency management:

  1. Version Pinning: The new dependencies (pandas, datasets) should have their versions specified to ensure stability and reproducibility.
  2. Optional Dependencies: Since these dependencies appear to be solely for benchmarks, they might be better suited as optional dependencies rather than being added to common.txt, which affects all users.

I recommend addressing these points before merging to maintain code quality and a lean core dependency set. As an AI reviewer, I am not authorized to approve pull requests; please ensure further review and approval from project maintainers after addressing the feedback.

Comment on lines +51 to +52
pandas # needed for benchmarks module
datasets # needed for benchmarks module
@gemini-code-assist bot (Contributor) commented (severity: high):

For consistency with other packages in this file (e.g., opentelemetry-sdk>=1.26.0, transformers >= 4.51.1) and to ensure reproducible builds and prevent unexpected breakages from upstream changes, would it be better to pin versions for pandas and datasets?

For example, you could specify the versions you tested with (e.g., pandas==X.Y.Z) or a minimum compatible version (e.g., pandas>=X.Y.Z). This helps maintain stability over time.

pandas==<version_tested_with>  # needed for benchmarks module
datasets==<version_tested_with>  # needed for benchmarks module

@simon-mo (Collaborator, Author) replied:

Both are relatively stable and very widely used, so I prefer not to pin them, to avoid conflicts.

Comment on lines +51 to +52
pandas # needed for benchmarks module
datasets # needed for benchmarks module
@gemini-code-assist bot (Contributor) commented (severity: medium):

Given that pandas and datasets are specifically for the 'benchmarks module' (as per the comments and their usage with PlaceholderModule in vllm/benchmarks/datasets.py), should these be included in common.txt? Adding them here makes them mandatory for all installations of vLLM, even for users who don't run benchmarks.

Could these be moved to an optional dependency group (e.g., defined in setup.py under extras_require={'benchmark': ['pandas', 'datasets']} so users can install with pip install vllm[benchmark]) or a separate requirements file like requirements/benchmark.txt? This would help keep the core dependency set lighter for general users.
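
A minimal sketch of the suggested optional dependency group, assuming setuptools packaging; the extra name benchmark and the exact arguments are illustrative, not vLLM's actual setup.py:

# setup.py (sketch): expose benchmark-only deps as an optional extra so
# the default install stays lean. Names here are hypothetical.
from setuptools import setup

setup(
    name="vllm",
    # ... other packaging arguments elided ...
    extras_require={
        # Installed only via `pip install vllm[benchmark]`
        "benchmark": ["pandas", "datasets"],
    },
)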

@simon-mo (Collaborator, Author) replied:

But they are currently imported in the CLI on the critical path.

@simon-mo requested a review from mgoin June 4, 2025 13:53

github-actions bot commented Jun 4, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mgoin (Member) commented Jun 4, 2025

@simon-mo I would prefer to keep these dataset deps separate and optional, as achieved in this PR by lazy-importing them: #19089.
The breaking change on main was not intentional.
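
For context, a minimal sketch of the lazy-import pattern being referred to, assuming a function-level deferred import; the function name and error message are illustrative, not vLLM's actual PlaceholderModule implementation:

# Sketch: defer the heavy optional import until the benchmark path runs,
# so a clean install without pandas can still import the CLI successfully.
# Names below are hypothetical.
def load_benchmark_dataframe(path: str):
    try:
        import pandas as pd  # imported only when benchmarks actually run
    except ImportError as exc:
        raise ImportError(
            "pandas is required for the benchmarks module; "
            "install it with `pip install pandas`."
        ) from exc
    return pd.read_csv(path)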

@simon-mo closed this Jun 4, 2025