doc: Documentation cleanup, refactor #204

Open
arekay-nv wants to merge 14 commits into main from arekay/doc-refactor

arekay-nv (Collaborator) commented Mar 24, 2026

What does this PR do?

Update the docs.

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

@arekay-nv arekay-nv requested a review from a team as a code owner March 24, 2026 23:35
Copilot AI review requested due to automatic review settings March 24, 2026 23:35
github-actions bot commented Mar 24, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request undertakes a major overhaul of the project's documentation, aiming to provide a more structured, comprehensive, and accessible knowledge base. The changes centralize development guidelines, introduce detailed design specifications for core components, and improve the overall navigation and clarity of the project's technical information. This refactor will significantly aid new contributors in onboarding and help all developers understand the system's architecture and component interactions more deeply.

Highlights

  • Comprehensive Documentation Refactor: Existing documentation files like AGENTS.md, CONTRIBUTING.md, README.md, CLI_QUICK_REFERENCE.md, DEVELOPMENT.md, GITHUB_SETUP.md, and LOCAL_TESTING.md have been significantly updated and streamlined for clarity and consistency.
  • Introduction of Component Design Specifications: A new set of detailed Design.md documents has been added for nearly every major component of the system, including async_utils, commands, config, core, dataset_manager, endpoint_client, evaluation, load_generator, metrics, openai, plugins, profiling, sglang, testing, and utils. These provide in-depth architectural and implementation details.
  • Enhanced Development Guide: The DEVELOPMENT.md file has been substantially expanded to cover development environment setup, project structure, testing, code quality standards (including pre-commit hooks), and a detailed pull request process, serving as a central resource for contributors.
  • Updated Examples and Architecture Overview: The examples/README.md now includes new example configurations, and the main README.md features a more structured documentation section with links to the new design specs and an updated architecture diagram, improving discoverability and understanding of the system's components.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist bot left a comment

Code Review

This pull request significantly updates and expands the project's documentation, introducing detailed design specifications for various components in new Design.md files, revising development setup and contributing guidelines, and clarifying CLI usage. Notable changes include a more comprehensive "Documentation" section in README.md with links to these new design specs, updated development environment setup in DEVELOPMENT.md to reflect new modules and pre-commit hook details, and refined CLI examples in CLI_QUICK_REFERENCE.md and LOCAL_TESTING.md. Review feedback indicates a potential documentation duplication for the "Endpoint client" component, suggesting a need to clarify or deprecate the older document, and also highlights an inconsistency in the CLI_QUICK_REFERENCE.md regarding the --report-dir option for the from-config subcommand, requiring clarification.

Copilot AI left a comment

Pull request overview

This PR performs a broad documentation cleanup and restructuring, adding per-component “Design Spec” documents and refreshing contributor and usage guides to reflect the current repository structure and CLI workflows.

Changes:

  • Add new component-level design specs under docs/<component>/Design.md to describe architecture, responsibilities, and integration points.
  • Refresh contributor and usage documentation (local testing, development workflow, CLI reference, GitHub setup) to match current commands and repository links.
  • Expand/curate example listings and cross-link documentation from the root README.md.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 10 comments.

Show a summary per file
| File | Description |
|---|---|
| examples/README.md | Adds links/descriptions for new example directories. |
| docs/utils/Design.md | New utils design spec (currently contains several API/behavior mismatches vs code). |
| docs/testing/Design.md | New testing utilities design spec (echo/max/variable throughput servers). |
| docs/sglang/Design.md | New SGLang adapter design spec (currently documents non-existent adapter/accumulator methods). |
| docs/profiling/Design.md | New profiling design spec (line_profiler + pytest flag). |
| docs/plugins/Design.md | New plugin namespace design spec (currently overstates existing plugin registration API). |
| docs/openai/Design.md | New OpenAI adapter design spec (currently documents non-existent adapter/accumulator methods). |
| docs/metrics/Design.md | New metrics design spec (EventRecorder API signature section currently mismatches implementation). |
| docs/load_generator/Design.md | New load generator design spec (session/scheduler architecture). |
| docs/evaluation/Design.md | New evaluation design spec (accuracy scoring + LiveCodeBench). |
| docs/endpoint_client/Design.md | New endpoint client design spec (worker pool, ZMQ IPC, adapters). |
| docs/dataset_manager/Design.md | New dataset manager design spec (loader/transforms/presets; format inference list currently incomplete). |
| docs/core/Design.md | New core types design spec (currently mismatches actual Query/QueryResult struct fields). |
| docs/config/Design.md | New config design spec (YAML/CLI → RuntimeSettings, templates, rulesets). |
| docs/commands/Design.md | New commands layer design spec (CLI layout and command flow). |
| docs/async_utils/services/metrics_aggregator/Design.md | Adds a clearer one-line summary of the metrics aggregator service. |
| docs/async_utils/services/event_logger/Design.md | Adds a clearer one-line summary of the event logger service. |
| docs/async_utils/services/Design.md | Adds a clearer one-line summary of the pub/sub system design doc. |
| docs/async_utils/Design.md | New async_utils design spec (loop manager, ZMQ transport, pub/sub services). |
| docs/LOCAL_TESTING.md | Updates local testing instructions (dataset path, init syntax, default duration, supported formats list). |
| docs/GITHUB_SETUP.md | Updates GitHub workflow descriptions and branch protection checklist. |
| docs/ENDPOINT_CLIENT.md | Adds a short introductory summary line for the document. |
| docs/DEVELOPMENT.md | Major rewrite of development workflow guide (fork/upstream flow, tooling, formatting, links). |
| docs/CLI_QUICK_REFERENCE.md | Replaces intro text and adjusts examples/wording for config-driven usage. |
| README.md | Updates docs links, architecture diagram, LiveCodeBench link, and minor command snippets. |
| CONTRIBUTING.md | Adds pointer to docs/DEVELOPMENT.md for standards. |
| AGENTS.md | Simplifies setup section and updates repo structure/tooling references. |

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 4 comments.


nvzhihanj (Collaborator) left a comment

Review Council — Multi-AI Code Review

Reviewed by: Claude (Codex unavailable) | Depth: thorough

Found 9 issues across 5 files.

Note: 17 existing inline comments were already present. Overlapping issues excluded.

@nvzhihanj (Collaborator)

Review Council — Multi-AI Code Review

Reviewed by: Claude (Codex unavailable — branch checkout failed) | Depth: thorough

Found 9 issues across 5 files. All verified against actual source code.

17 existing inline comments from a prior review were already present. Overlapping issues have been excluded from this review.

🔴 Must Fix (high) — 3 issues

Issues where the documentation will actively mislead readers about how the code works.

| # | File | Line | Category | Summary |
|---|---|---|---|---|
| 1 | docs/load_generator/Design.md | 130 | bug | Event types table uses SESSION_STARTED/SESSION_ENDED; actual enums are TEST_STARTED/TEST_ENDED |
| 2 | docs/metrics/Design.md | 131 | bug | QPS formula references non-existent SESSION_ENDED/SESSION_STARTED events; actual uses TEST_STARTED and STOP_PERFORMANCE_TRACKING |
| 3 | docs/metrics/Design.md | 72 | api-contract | `MetricsReporter.__init__` signature wrong; actual params are (connection_name, client_type), not (db_path, runtime_settings, tokenizer) |
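
To make issue 2 concrete, here is a minimal sketch of a QPS calculation bounded by the two events the review names. Only the names TEST_STARTED and STOP_PERFORMANCE_TRACKING come from the review; the `Event` enum, the `SAMPLE_COMPLETED` member, the `(event, timestamp_ns)` record shape, the choice to exclude completions outside the window, and `compute_qps` itself are assumptions made for illustration and are not the project's API.

```python
from enum import Enum, auto


class Event(Enum):
    # Only the first two member names appear in the review; the enum is hypothetical.
    TEST_STARTED = auto()
    STOP_PERFORMANCE_TRACKING = auto()
    SAMPLE_COMPLETED = auto()  # hypothetical name for a per-query completion event


def compute_qps(records: list[tuple[Event, int]]) -> float:
    """Completed queries divided by the tracked window length in seconds."""
    start_ns = next(ts for ev, ts in records if ev is Event.TEST_STARTED)
    stop_ns = next(ts for ev, ts in records if ev is Event.STOP_PERFORMANCE_TRACKING)
    # Assumption: only completions inside the tracked window are counted.
    completed = sum(
        1
        for ev, ts in records
        if ev is Event.SAMPLE_COMPLETED and start_ns <= ts <= stop_ns
    )
    return completed / ((stop_ns - start_ns) / 1e9)


records = [
    (Event.TEST_STARTED, 0),
    (Event.SAMPLE_COMPLETED, 500_000_000),
    (Event.SAMPLE_COMPLETED, 1_500_000_000),
    (Event.STOP_PERFORMANCE_TRACKING, 2_000_000_000),
    (Event.SAMPLE_COMPLETED, 2_500_000_000),  # after tracking stopped
]
qps = compute_qps(records)
```

A formula written against SESSION_STARTED/SESSION_ENDED would have no matching records in this sketch, which is the kind of confusion the review flags.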

🟡 Should Fix (medium) — 4 issues

Real inaccuracies that could cause confusion under specific circumstances.

| # | File | Line | Category | Summary |
|---|---|---|---|---|
| 4 | docs/metrics/Design.md | 79 | api-contract | create_report() returns Report not BenchmarkReport; takes tokenizer and tpot_reporting_mode params |
| 5 | docs/metrics/Design.md | 91 | api-contract | Metric classes use REL_TOL class var and self.target, not tolerance field or target_qps/target_ms/max_ms attributes |
| 6 | docs/async_utils/Design.md | 72 | api-contract | EventPublisherService requires managed_zmq_context arg on first construction; doc shows no-arg usage |
| 7 | docs/load_generator/Design.md | 63 | api-contract | SampleIssuer is an ABC (requires subclassing), not a Protocol (structural subtyping); doc says "protocol" |
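
The distinction behind issue 7 can be shown with plain standard-library Python. The class names below (`IssuerABC`, `IssuerProtocol`, `DuckIssuer`, `NominalIssuer`) and the `issue` method are hypothetical and only illustrate the general difference; they are not the project's `SampleIssuer`.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class IssuerABC(ABC):
    """Nominal typing: implementations must inherit from this class."""

    @abstractmethod
    def issue(self, sample: str) -> None: ...


@runtime_checkable
class IssuerProtocol(Protocol):
    """Structural typing: any object with a matching method qualifies."""

    def issue(self, sample: str) -> None: ...


class DuckIssuer:
    """Has the right method but inherits from neither base."""

    def issue(self, sample: str) -> None:
        pass


class NominalIssuer(IssuerABC):
    def issue(self, sample: str) -> None:
        pass


duck = DuckIssuer()
matches_protocol = isinstance(duck, IssuerProtocol)  # structural match
matches_abc = isinstance(duck, IssuerABC)            # no inheritance, no match
nominal_ok = isinstance(NominalIssuer(), IssuerABC)

try:
    IssuerABC()  # abstract classes cannot be instantiated directly
    abc_instantiable = True
except TypeError:
    abc_instantiable = False
```

A reader told that an interface is a "protocol" might write something like `DuckIssuer` and expect it to be accepted, whereas an ABC requires the explicit subclass.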

🔵 Consider (low) — 2 issues

Valid improvements that could be follow-ups.

| # | File | Line | Category | Summary |
|---|---|---|---|---|
| 8 | docs/config/Design.md | 30 | bug | RuntimeSettings table missing metric_target: Metric field |
| 9 | AGENTS.md | 137 | design | CLI row says argparse-based but project migrated to cyclopts in PR #193; appears based on stale branch |

🤖 Generated with Claude Code — Review Council

Copilot AI review requested due to automatic review settings March 30, 2026 18:18
@arekay-nv arekay-nv force-pushed the arekay/doc-refactor branch from 0db4683 to 7196894 Compare March 30, 2026 18:18
Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.


Copilot AI review requested due to automatic review settings March 31, 2026 14:50
Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 1 comment.


- Add Design.md specs for all 15 top-level components under src/inference_endpoint/
- Restructure AGENTS.md: move code style details to DEVELOPMENT.md, update
  component table with runner.py and async_utils services
- Update README.md: add Component Design Specs table, use python3 in examples
- Reformat DEVELOPMENT.md: remove emojis, add commit type list, exact-version
  pinning guidance
- Update CLI_QUICK_REFERENCE.md, LOCAL_TESTING.md, ENDPOINT_CLIENT.md,
  GITHUB_SETUP.md for consistency
- Fix stale references: pkl→jsonl throughout, CLIError for eval mode,
  dataset_manager Design.md reflects current supported formats

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@arekay-nv arekay-nv force-pushed the arekay/doc-refactor branch from 465f559 to 9a7697b Compare April 2, 2026 00:37
Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated no new comments.


Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 34 out of 35 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (2)

docs/async_utils/services/Design.md:21

  • This section says in-process subscribers “connect to the publisher’s bind_address”. With the current ZMQ helpers, ZmqEventRecordSubscriber expects the socket name/path (e.g. publisher.bind_path / ev_pub_<uuid>), not the full ipc://... bind address; passing bind_address would result in an invalid constructed address because ManagedZMQContext.connect() prepends socket_dir again. Please update the docs to reference the socket name (bind_path) for in-process subscribers, and reserve bind_address for cases where subscribers connect without ManagedZMQContext address construction.

docs/async_utils/services/Design.md:306

  • The event_logger lifecycle description claims error events after session.ended are still written. The current EventLoggerService.process() stops writing all subsequent records once _shutdown_received is set, regardless of event type (src/inference_endpoint/async_utils/services/event_logger/__main__.py, process() method). Either update this doc to match current behavior, or adjust EventLoggerService to continue writing ErrorEventType.* records after ENDED if that behavior is intended.
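
The two behaviors contrasted in the event_logger comment can be sketched side by side. This is a toy model under stated assumptions, not the real EventLoggerService: `Record`, `LoggerSketch`, the string labels for record kinds, and the `keep_errors_after_end` switch are all hypothetical, and the sketch assumes the ENDED record itself is written before the shutdown flag takes effect.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    kind: str  # hypothetical labels: "sample", "ended", "error"


@dataclass
class LoggerSketch:
    keep_errors_after_end: bool
    written: list[str] = field(default_factory=list)
    shutdown_received: bool = False

    def process(self, record: Record) -> None:
        if self.shutdown_received:
            # Behavior described in the review: drop everything after ENDED.
            # Alternative: let error records through when the switch is on.
            if not (self.keep_errors_after_end and record.kind == "error"):
                return
        self.written.append(record.kind)
        if record.kind == "ended":
            self.shutdown_received = True


stream = [Record("sample"), Record("ended"), Record("error"), Record("sample")]

current = LoggerSketch(keep_errors_after_end=False)
alternative = LoggerSketch(keep_errors_after_end=True)
for rec in stream:
    current.process(rec)
    alternative.process(rec)
```

With the switch off, the late error record is dropped; with it on, the error is kept while the late sample is still dropped. The doc and the implementation need to agree on one of these.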

Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 3, 2026 03:34
Copilot AI left a comment

Pull request overview

Copilot reviewed 34 out of 35 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

docs/async_utils/services/Design.md:22

  • This section says in-process subscribers connect using the publisher's bind_address, but the ZMQ APIs take the socket name (publisher bind_path, e.g. ev_pub_<uuid>) and ManagedZMQContext.connect() constructs the full ipc://... address. Using bind_address directly would not work with connect() as implemented. Suggest updating wording to match the actual API (path=publisher.bind_path).

docs/async_utils/services/Design.md:306

  • Doc/code mismatch: this claims the event logger continues writing error events after session.ended, but EventLoggerService.process() currently drops all subsequent records once _shutdown_received is set (see src/inference_endpoint/async_utils/services/event_logger/__main__.py, where it continues when _shutdown_received). Either update the implementation to allow ErrorEventType through after ENDED, or adjust this doc to reflect current behavior.

@arekay-nv (Collaborator, Author)

/gemini review

gemini-code-assist bot left a comment

Code Review

This pull request significantly expands the project's documentation by adding detailed design specifications for all core components, including the load generator, metrics aggregator, and endpoint client. It also introduces new AI agent skills for msgspec optimization and garbage collection safety. Review feedback identified a missing required argument in a documentation code example and a missing field in the metrics aggregator's data model description.

@arekay-nv arekay-nv requested review from nvzhihanj and viraatc April 3, 2026 21:42
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 6, 2026 19:52
Copilot AI left a comment

Pull request overview

Copilot reviewed 35 out of 36 changed files in this pull request and generated 2 comments.


Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
@nvzhihanj nvzhihanj added type: documentation Documentation only priority: P3 Low — backlog, nice to have labels Apr 7, 2026
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 7, 2026 22:48
Copilot AI left a comment

Pull request overview

Copilot reviewed 34 out of 35 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

docs/async_utils/services/Design.md:22

  • The out-of-process subscriber description suggests using a “shared socket directory” like socket_dir=log_dir.parent, but IPC subscribers must use the publisher’s ManagedZMQContext.socket_dir (the directory where the PUB socket was actually bound). If the directory differs, ctx.connect(..., socket_name) will point at a non-existent IPC path. Recommend rewording this to emphasize that the parent process must pass the publisher’s socket_dir to child processes via --socket-dir.
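
The address-construction concern raised in these suppressed comments can be illustrated with a small sketch. `build_ipc_address` is a hypothetical stand-in for the behavior the comments attribute to ManagedZMQContext.connect() (joining socket_dir and the socket name into an ipc:// address); the directories and the socket name are made-up values, and no ZMQ calls are involved.

```python
def build_ipc_address(socket_dir: str, socket_name: str) -> str:
    """Hypothetical stand-in: join the context's directory and the socket name."""
    return f"ipc://{socket_dir.rstrip('/')}/{socket_name}"


publisher_dir = "/tmp/run_42/sockets"  # illustrative: where the PUB socket is bound
socket_name = "ev_pub_1234"            # illustrative name in the ev_pub_<uuid> style

bound = build_ipc_address(publisher_dir, socket_name)

# Child process given the publisher's directory (e.g. via --socket-dir): paths agree.
child_ok = build_ipc_address(publisher_dir, socket_name)

# Child process that derives its own directory: the path points somewhere else.
child_wrong = build_ipc_address("/tmp/run_42/logs", socket_name)

# Passing a full address where a socket name is expected double-prefixes it.
double_prefixed = build_ipc_address(publisher_dir, bound)
```

Under these assumptions, a subscriber only reaches the publisher when it is handed both the same directory and the bare socket name, which is why the comments ask the docs to be explicit about each.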

@nv-alicheng (Collaborator)

nit: Can we use all-caps or all-lowercase for design.md? Design.md seems weird to me.

@arekay-nv arekay-nv requested a review from a team April 8, 2026 19:50
@@ -0,0 +1,49 @@
# Utils — Design Spec

> Shared helpers (logging setup, version, tokenizer utilities) and a standalone HTTP benchmarking tool. The core helper modules have no dependencies on other project subpackages.

Collaborator

The core helper modules have no dependencies on other project subpackages.

This sounds misleading or misplaced - this sounds like it should be in core/DESIGN.md. Maybe it should say 'Utility classes and functions have no dependencies on other project components'?


> Async infrastructure shared across the system: uvloop event loop lifecycle management, ZMQ-based IPC transport between processes, and a pub/sub event bus for real-time metric streaming.

**Component specs:** **async_utils** · [commands](../commands/Design.md) · [config](../config/Design.md) · [core](../core/Design.md) · [dataset_manager](../dataset_manager/Design.md) · [endpoint_client](../endpoint_client/Design.md) · [evaluation](../evaluation/Design.md) · [load_generator](../load_generator/Design.md) · [metrics](../metrics/Design.md) · [openai](../openai/Design.md) · [plugins](../plugins/Design.md) · [profiling](../profiling/Design.md) · [sglang](../sglang/Design.md) · [testing](../testing/Design.md) · [utils](../utils/Design.md)

Collaborator

I think this ToC header is cool, but how necessary is this in every file? It's not like a website side-bar where you can click it, if you're reading in a text editor or on GitHub webview you'd still need to scroll up to the top.

Collaborator

Also makes it hard to maintain and inflates diffs for any changes.

DataLoaderFactory
|
+-- format -> DatafileLoader subclass
| (jsonl / json / csv / parquet / hf)

Collaborator

Alignment here is weird

```python
class DataLoaderFactory:
    @staticmethod
    def create_loader(config: DatasetConfig, num_repeats: int = 1, **kwargs) -> Dataset
```

Collaborator

Inconsistent formatting compared to everywhere else (no ...)

┌─────────────────────────────────────────┐
│ HTTPEndpointClient │
│ ├── uvloop event loop │
│ └── WorkerManager │

Collaborator

Misaligned ASCII art

Collaborator

Can you delete this file? Shouldn't be used anymore.

Dataset Manager ──► Load Generator ──► Endpoint Client ──► External Endpoint
Metrics Collector
(EventRecorder + MetricsReporter)

Collaborator

Let's remove class names from the top-level README (EventRecorder, MetricsReporter); the additional detail is not required here.

--report-dir official_results
--config submission_template.yaml
# Note: from-config only accepts --config, --timeout, and --mode via CLI.
# Set report_dir in the YAML if you need a specific output location.

Collaborator

Is this a bug?
Please assign the issue to me if so, and I can patch it quickly.


5 participants