fix: CLI crash + auto-generate YAML config templates from schema#237
fix: CLI crash + auto-generate YAML config templates from schema#237
Conversation
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
There was a problem hiding this comment.
Pull request overview
Fixes a cyclopts CLI parsing crash triggered when --load-pattern is combined with load-pattern subfields like --target-qps / --concurrency in the online benchmark command.
Changes:
- Annotates
LoadPatternto adjust how cyclopts maps nested parameters (@cyclopts.Parameter(name="*")). - Updates the CLI parameter definition for
LoadPattern.typeto avoid the prior name collision.
Comments suppressed due to low confidence (1)
src/inference_endpoint/config/schema.py:360
- This change is a regression fix for a CLI crash when combining
--load-patternwith nested load-pattern fields (e.g.--target-qps). There’s existing automated test coverage for config validation intests/unit/commands/test_benchmark.py, but no test currently exercises cyclopts parsing for this flag combination.
Add a regression test that parses benchmark online ... --load-pattern poisson --target-qps 100 (or directly parses OnlineBenchmarkConfig via cyclopts) and asserts it no longer raises and that config.settings.load_pattern.type/target_qps are set as expected.
@cyclopts.Parameter(name="*")
class LoadPattern(BaseModel):
"""Load pattern configuration.
Different patterns use target_qps differently:
- max_throughput: target_qps used for calculating total queries (offline, optional with default)
- poisson: target_qps sets scheduler rate (online, required - validated)
- concurrency: issue at fixed target_concurrency (online, required - validated)
"""
model_config = ConfigDict(extra="forbid", frozen=True)
type: Annotated[
LoadPatternType,
cyclopts.Parameter(name="--load-pattern", help="Load pattern type"),
] = LoadPatternType.MAX_THROUGHPUT
target_qps: Annotated[
float | None, cyclopts.Parameter(alias="--target-qps", help="Target QPS")
] = Field(None, gt=0)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Code Review
This pull request modifies the LoadPattern class in the configuration schema by applying a class-level cyclopts.Parameter decorator and updating the type field's parameter definition to use the name argument instead of alias. I have no feedback to provide.
arekay-nv
left a comment
There was a problem hiding this comment.
Can we also add a test for this - seems like a change that shouldn't have gone in.
90fe9c8 to
80a79ef
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Review Council — Multi-AI Code ReviewReviewed by: Claude (Codex ran but produced investigation output, not structured findings) | Depth: standard Found 3 issues across 3 files:
Also addressed all Copilot review comments (pinned SHAs, quoted pip install, heredoc for inline Python, expanded pre-commit |
8915750 to
ffb87d9
Compare
| type: offline | ||
| model_params: | ||
| name: "meta-llama/Llama-3.1-8B-Instruct" | ||
| name: '<MODEL_NAME eg: meta-llama/Llama-3.1-8B-Instruct>' |
There was a problem hiding this comment.
Templates auto-generated from schema defaults by scripts/regenerate_templates.py.
Full YAML spec with placeholder overrides (model name, dataset)
Pre-commit validates templates are valid locally.
CI checks if they're up to date — if stale it will suggest to, run python scripts/regenerate_templates.py.
Is this overkill? Should we drop?
There was a problem hiding this comment.
This creates more burden to the end user to understand all the flags, and we should keep the template simple. cc: @arekay-nv to review as well
There was a problem hiding this comment.
I think this makes reproduction easier. Only the flags with values in <> need to be specified, all others have defaults.
We can even make this part of a pre-commit to have an integration test that takes in these templates, substitutes for an echo server just to make sure that the minimal version runs.
For beginners, they can always use the minimal command line which already has the defaults baked in.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/inference_endpoint/config/templates/concurrency_template.yaml
Outdated
Show resolved
Hide resolved
ffb87d9 to
b781ff7
Compare
Falls back to handwritten _template.yaml when _template_full.yaml doesn't exist (eval, submission). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def _build_minimal_template(test_type: TestType, overrides: dict) -> dict: | ||
| """Only the fields a user must provide. 1 dataset example.""" | ||
| name = overrides.pop("name", None) or f"{test_type.value}_benchmark" | ||
| data: dict[str, object] = { | ||
| "name": name, | ||
| "type": test_type.value, | ||
| "model_params": {"name": _PLACEHOLDER_MODEL}, | ||
| "datasets": [_PERF_DATASET], | ||
| "endpoint_config": {"endpoints": _PLACEHOLDER_ENDPOINTS_MINIMAL}, | ||
| } | ||
|
|
||
| if overrides: | ||
| data = _deep_merge(data, overrides) | ||
|
|
||
| return data | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # Main | ||
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| def main(check_only: bool = False): | ||
| """Regenerate templates, or check they're up to date (--check flag).""" | ||
| comments = _collect_options(BenchmarkConfig) | _EXTRA_COMMENTS | ||
| stale = False | ||
|
|
||
| for name, (test_type, overrides) in _TEMPLATES.items(): | ||
| model_cls = _MODEL_FOR_TYPE[test_type] | ||
|
|
||
| variants = { | ||
| f"{name}_template.yaml": _build_minimal_template(test_type, overrides), | ||
| f"{name}_template_full.yaml": _build_full_template(model_cls, overrides), | ||
| } |
There was a problem hiding this comment.
In _build_minimal_template(), overrides is mutated via overrides.pop("name", ...). Since main() passes the same overrides dict to both minimal and full builders, this side effect drops the "name" override for the full template (and also mutates the module-level _TEMPLATES mapping). Avoid mutating overrides (e.g., use overrides.get("name") and merge without popping) and/or pass a copy/deepcopy of overrides into each builder so full/minimal variants are independent.
| name: online_benchmark | ||
| version: '1.0' | ||
| type: online |
There was a problem hiding this comment.
This full concurrency template is named "online_benchmark" even though the minimal concurrency template and CLI type imply it should be a concurrency-specific template (e.g., "concurrency_benchmark"). Regenerating after fixing the overrides mutation in scripts/regenerate_templates.py should produce a consistent name here.
| name: online_benchmark | |
| version: '1.0' | |
| type: online | |
| name: concurrency_benchmark | |
| version: '1.0' | |
| type: concurrency |
|
|
||
| # Cap total samples so test finishes in seconds | ||
| data.setdefault("settings", {}) | ||
| data["settings"].setdefault("runtime", {}) | ||
| data["settings"]["runtime"]["n_samples_to_issue"] = 10 | ||
|
|
There was a problem hiding this comment.
_resolve_template() caps n_samples_to_issue, but it doesn't override settings.client defaults. Because HTTPClientConfig resolves num_workers=-1 to an auto value (often 10+) during validation, these template E2E runs can spawn many worker processes and become slow/flaky in CI. Consider forcing small client settings here (e.g., num_workers=1, warmup_connections=0, max_connections=10) and/or disabling enable_cpu_affinity for the generated config used in this test.
| Config templates in `src/inference_endpoint/config/templates/` are auto-generated from schema defaults. When you change `config/schema.py`, regenerate them: | ||
|
|
||
| ```bash | ||
| python scripts/regenerate_templates.py | ||
| ``` | ||
|
|
||
| The pre-commit hook auto-regenerates templates when `schema.py`, `config.py`, or `regenerate_templates.py` change. CI validates templates are up to date via `--check` mode. | ||
|
|
There was a problem hiding this comment.
Docs mention a pre-commit trigger on config.py, but the actual hook targets src/inference_endpoint/endpoint_client/config.py (and other files). Please update this list so contributors know which changes will regenerate templates.
- Reverted eval/submission from auto-generation (schema doesn't support them yet). Handwritten templates kept, init falls back to them. - regenerate_templates.py auto-detects CI env var → check-only mode - Pre-commit files trigger now includes templates/ dir Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def _build_minimal_template(test_type: TestType, overrides: dict) -> dict: | ||
| """Only the fields a user must provide. 1 dataset example.""" | ||
| name = overrides.pop("name", None) or f"{test_type.value}_benchmark" | ||
| data: dict[str, object] = { | ||
| "name": name, | ||
| "type": test_type.value, | ||
| "model_params": {"name": _PLACEHOLDER_MODEL}, | ||
| "datasets": [_PERF_DATASET], | ||
| "endpoint_config": {"endpoints": _PLACEHOLDER_ENDPOINTS_MINIMAL}, | ||
| } | ||
|
|
||
| if overrides: | ||
| data = _deep_merge(data, overrides) | ||
|
|
||
| return data | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # Main | ||
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| def main(check_only: bool = False): | ||
| """Regenerate templates, or check they're up to date. | ||
|
|
||
| Locally (pre-commit): regenerates files, pre-commit detects the diff. | ||
| CI: auto-detects ``CI`` env var and switches to check-only mode. | ||
| Explicit: ``--check`` flag forces check-only mode. | ||
| """ | ||
| if os.environ.get("CI"): | ||
| check_only = True | ||
| comments = _collect_options(BenchmarkConfig) | _EXTRA_COMMENTS | ||
| stale = False | ||
|
|
||
| for name, (test_type, overrides) in _TEMPLATES.items(): | ||
| model_cls = _MODEL_FOR_TYPE[test_type] | ||
|
|
||
| variants = { | ||
| f"{name}_template.yaml": _build_minimal_template(test_type, overrides), | ||
| f"{name}_template_full.yaml": _build_full_template(model_cls, overrides), | ||
| } |
There was a problem hiding this comment.
_build_minimal_template mutates the shared overrides dict via overrides.pop("name", ...). In main(), the same overrides instance is then reused to build the full template, so the full template can silently lose overrides (e.g., concurrency template loses its name override and ends up as online_benchmark). Avoid mutating the input overrides: pass a copy per variant (deepcopy if nested) and/or change _build_minimal_template to read overrides.get("name") without popping.
| @@ -0,0 +1,86 @@ | |||
| name: online_benchmark | |||
There was a problem hiding this comment.
name is online_benchmark here, but this is the concurrency full template. It should match the concurrency variant (e.g. concurrency_benchmark) so users don’t start from a misleading config name.
| name: online_benchmark | |
| name: concurrency_benchmark |
templates/ → templates/.* so the regex matches actual file paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…plates Generates cleaner YAML — all defaults present, no null noise. eval/submission still copy handwritten templates (schema not ready). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
scripts/regenerate_templates.py
Outdated
|
|
||
| def _build_minimal_template(test_type: TestType, overrides: dict) -> dict: | ||
| """Only the fields a user must provide. 1 dataset example.""" | ||
| name = overrides.pop("name", None) or f"{test_type.value}_benchmark" |
There was a problem hiding this comment.
_build_minimal_template() mutates the shared overrides dict via pop("name", ...). Since main() passes the same overrides instance to both the minimal and full variants, the full template can silently lose overrides (e.g., the concurrency full template’s name override). Treat overrides as immutable here (don’t pop), or pass a deep copy into each builder.
| name = overrides.pop("name", None) or f"{test_type.value}_benchmark" | |
| name = overrides.get("name", None) or f"{test_type.value}_benchmark" |
| output_path = f"{template_type}_template.yaml" | ||
| output_file = Path(output_path) | ||
|
|
There was a problem hiding this comment.
The init command currently always writes <mode>_template.yaml (and for offline/online/concurrency it generates a model_dump(exclude_none=True)), which doesn’t match the repo’s auto-generated “full” templates (*_template_full.yaml with placeholders/options comments) described in the PR/docs. If init is intended to generate the full template, consider copying <mode>_template_full.yaml from TEMPLATES_DIR (or reusing the regeneration logic) instead of emitting this minimal/partial variant.
Walks BenchmarkConfig and all nested models, renders markdown tables with field name, type, default, and description. Pre-commit hook auto-regenerates on schema changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for name, (test_type, overrides) in _TEMPLATES.items(): | ||
| model_cls = _MODEL_FOR_TYPE[test_type] | ||
|
|
||
| variants = { | ||
| f"{name}_template.yaml": _build_minimal_template(test_type, overrides), | ||
| f"{name}_template_full.yaml": _build_full_template(model_cls, overrides), | ||
| } |
There was a problem hiding this comment.
overrides from _TEMPLATES is passed into both _build_minimal_template(...) and _build_full_template(...) but is the same dict object. Because the minimal builder mutates it, the full builder can produce incorrect output. Consider copying overrides (or otherwise ensuring immutability) per variant before calling the builders.
| """Generate config template. | ||
| Args: | ||
| template: Template type (offline, online, concurrency). | ||
| """ |
There was a problem hiding this comment.
The init command docstring claims valid template types are only (offline, online, concurrency), but execute_init() also accepts eval and submission. Either include eval/submission in the docstring (and CLI help) or remove them from the supported types to avoid misleading users.
| """Generate YAML config template. | ||
|
|
||
| For offline/online/concurrency: generates via model_dump(exclude_none=True). | ||
| For eval/submission: copies handwritten template files. | ||
|
|
||
| if template_type not in TEMPLATE_FILES: | ||
| Args: | ||
| template_type: One of "offline", "online", "concurrency", "eval", "submission". | ||
| """ | ||
| if template_type not in VALID_TYPES: | ||
| raise InputValidationError( | ||
| f"Unknown template type: {template_type}. " | ||
| f"Available: {', '.join(TEMPLATE_FILES.keys())}" | ||
| f"Available: {', '.join(sorted(VALID_TYPES))}" | ||
| ) | ||
|
|
||
| template_file = TEMPLATES_DIR / TEMPLATE_FILES[template_type] | ||
| output_path = f"{template_type}_template.yaml" | ||
| output_file = Path(output_path) | ||
|
|
||
| if output_file.exists(): | ||
| logger.warning(f"File exists: {output_path} (will be overwritten)") | ||
|
|
||
| try: | ||
| if not template_file.exists(): | ||
| logger.info("Generating from BenchmarkConfig.create_default_config...") | ||
| config = BenchmarkConfig.create_default_config( | ||
| TEMPLATE_TYPE_MAP[template_type] | ||
| ) | ||
| config_dict = config.model_dump(exclude_none=True) | ||
| with open(output_path, "w") as f: | ||
| yaml.dump(config_dict, f, default_flow_style=False, sort_keys=False) | ||
| logger.info(f"Generated: {output_path}") | ||
| else: | ||
| # TODO(vir): | ||
| # generate these automatically when support is added | ||
| # for now just copy over hand-written templates | ||
| if template_type in _HANDWRITTEN: | ||
| template_file = TEMPLATES_DIR / _HANDWRITTEN[template_type] | ||
| if not template_file.exists(): | ||
| raise SetupError(f"Template file not found: {template_file}") | ||
| shutil.copy(template_file, output_path) | ||
| logger.info(f"Created from template: {output_path}") | ||
|
|
||
| except NotImplementedError as e: | ||
| logger.error(str(e)) | ||
| if template_file.exists(): | ||
| shutil.copy(template_file, output_path) | ||
| logger.info(f"Created from template: {output_path}") | ||
| else: | ||
| raise SetupError(f"Template file not found: {template_file}") from e | ||
| config = BenchmarkConfig.create_default_config(_TYPE_MAP[template_type]) | ||
| if template_type == "concurrency": | ||
| config = config.with_updates( | ||
| name="concurrency_benchmark", | ||
| settings=OnlineSettings( | ||
| load_pattern=LoadPattern( | ||
| type=LoadPatternType.CONCURRENCY, | ||
| target_concurrency=32, | ||
| ), | ||
| ), | ||
| ) | ||
| data = config.model_dump(mode="json", exclude_none=True) | ||
| with open(output_path, "w") as f: | ||
| yaml.dump(data, f, default_flow_style=False, sort_keys=False) | ||
| logger.info(f"Created: {output_path}") |
There was a problem hiding this comment.
execute_init() currently writes <type>_template.yaml by dumping BenchmarkConfig.create_default_config(...) with exclude_none=True, which omits optional fields that default to None. This does not match the PR description/docs that say init “always generates the full template” / “all fields”, nor does it match the committed _template_full.yaml variants (and placeholder format differs: <MODEL_NAME> vs <MODEL_NAME eg: ...>). Consider either generating the full variant (e.g., exclude_none=False and/or writing <type>_template_full.yaml) or updating the docs/PR summary to reflect the actual behavior.
- Pin actions/checkout and setup-python to SHAs in CI - Quote pip install extras to avoid shell globbing - Add comment to empty except in fuzz test - Fix overrides.pop() → .get() to prevent dict mutation across minimal/full template builders (concurrency_template_full.yaml was getting wrong name) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Full templates now show CLI help text as inline comments for every field that has a description (from Field(description=) or cyclopts.Parameter(help=)) - Minimal templates include settings.runtime (min_duration_ms, max_duration_ms, n_samples_to_issue) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """Generate config template. | ||
| Args: | ||
| template: Template type (offline, online, concurrency). |
There was a problem hiding this comment.
The init command docstring lists only (offline, online, concurrency), but execute_init() accepts eval and submission too (and the CLI error message advertises them). Please update the docstring to reflect the full set of supported template types, or remove support if it’s intentionally hidden.
| template: Template type (offline, online, concurrency). | |
| template: Template type (offline, online, concurrency, eval, submission). |
| @pytest.mark.parametrize( | ||
| "template", | ||
| [ | ||
| "concurrency_template.yaml", | ||
| "eval_template.yaml", | ||
| "offline_template.yaml", | ||
| "online_template.yaml", | ||
| "submission_template.yaml", | ||
| ], | ||
| sorted( | ||
| p.name | ||
| for p in ( | ||
| Path(__file__).parent.parent.parent.parent | ||
| / "src" | ||
| / "inference_endpoint" | ||
| / "config" | ||
| / "templates" | ||
| ).glob("*_template*.yaml") | ||
| ), |
There was a problem hiding this comment.
This parametrization becomes a no-op if the glob matches zero files (pytest will collect 0 cases and the template validation won’t run). To avoid silently skipping template validation, consider asserting the discovered list is non-empty (and/or using the existing TEMPLATE_DIR.glob(...) to avoid duplicating the templates path).
| # Generate config templates | ||
| inference-endpoint init offline # or: online, eval, submission | ||
| inference-endpoint init submission | ||
| # Generate config template (all fields with defaults) |
There was a problem hiding this comment.
This line claims init generates a template with “all fields with defaults”, but the implementation currently dumps with exclude_none=True, which omits fields whose defaults are None. Either adjust the wording (e.g., “common defaults”) or update init to emit the full template including null defaults.
| # Generate config template (all fields with defaults) | |
| # Generate config template (common defaults; fields with null/None defaults may be omitted) |
Summary
--load-pattern+--target-qps(IndexError) —LoadPattern.typeusedalias=instead ofname=oncyclopts.Parameterregenerate_templates.pyauto-generates minimal + full YAML templates from Pydantic schema defaults--checkmodeTemplates
Two variants per mode (offline, online, concurrency):
_template.yaml— minimal: only required fields +<PLACEHOLDER eg: example>values_template_full.yaml— all fields with schema defaults, inline# options:comments auto-discovered from Enum/Literal annotations, 2 dataset examples (perf + accuracy with parser columns), multiple endpointsInit command
inference-endpoint init offlinealways generates the full template. Available types:offline,online,concurrency.Tests
from_yaml_file(auto-discovered with glob)<>placeholders stripped to real values, and run end-to-end against the echo server — verifying templates produce functional configsDocs updated
Test plan
python scripts/regenerate_templates.py --checkpassespytest tests/unit/— 484 passedpytest tests/integration/commands/test_benchmark_command.py::TestTemplateIntegration— 6 templates run e2e🤖 Generated with Claude Code