Summary
Replace the argparse CLI with cyclopts. The Pydantic models in config/schema.py become the single source of truth: CLI flags are auto-generated from schema fields.
Key Changes
- Discriminated union: OfflineBenchmarkConfig/OnlineBenchmarkConfig subclasses in config/schema.py. Both the CLI and YAML auto-select the right subclass via a Pydantic TypeAdapter.
- Dataset string format: --dataset [perf|acc:]<path>[,key=value...] with TOML-style dotted paths (e.g. parser.prompt=article, accuracy_config.eval_method=pass_at_1).
- Sub-model validation: RuntimeConfig, LoadPattern, and ColumnRemap self-validate. BenchmarkConfig handles only cross-model checks.
- Error formatting: Consistent "Required: --full.path [--alias]" format; Pydantic errors cleaned up.
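
To make the dataset string format above concrete, here is a minimal stdlib-only sketch of how a value such as acc:<path>,parser.prompt=article could be split into a nested mapping. It is not the code from this PR: the function name parse_dataset_spec, the returned dict layout, and the choice to leave the kind as None when no prefix is given are all assumptions made for illustration. Values are kept as strings on the assumption that type coercion happens later, in the schema models.

```python
from typing import Any, Optional


def parse_dataset_spec(spec: str) -> dict[str, Any]:
    """Parse '[perf|acc:]<path>[,key=value...]' into a nested dict.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    head, *pairs = spec.split(",")

    # Optional kind prefix. Only the two documented prefixes are recognised,
    # so a path that merely contains ':' is left untouched.
    kind: Optional[str] = None
    path = head
    for prefix in ("perf", "acc"):
        if head.startswith(prefix + ":"):
            kind, path = prefix, head[len(prefix) + 1:]
            break
    if not path:
        raise ValueError(f"missing dataset path in {spec!r}")

    options: dict[str, Any] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        # Walk the dotted key, creating intermediate dicts as needed.
        *parents, leaf = key.split(".")
        node = options
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{key!r} conflicts with an earlier scalar key")
        node[leaf] = value  # kept as a string; coercion is assumed to happen later

    return {"kind": kind, "path": path, "options": options}
```

With a made-up path, parse_dataset_spec("acc:data/cnn.jsonl,parser.prompt=article") yields kind "acc", path "data/cnn.jsonl", and options {"parser": {"prompt": "article"}}. A nested dict of this shape can be handed directly to a Pydantic model for validation, which is one reason dotted keys are convenient here.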
PR
#193