CLI Reference
Abhishek Gahlot edited this page Mar 27, 2026 · 1 revision
All commands are available through the `deepgym` executable after installation.
Run a single solution.

    deepgym run \
      --task "Write a function coin_change(coins, amount)..." \
      --verifier verifier.py \
      --solution solution.py \
      --timeout 30 \
      --snapshot python

| Flag | Required | Default |
|---|---|---|
| `--task` | Yes | |
| `--verifier` | Yes | |
| `--solution` | Yes | |
| `--timeout` | No | 30 |
| `--snapshot` | No | |
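
When `deepgym run` is called from scripts, a small wrapper can keep the required flags in one place. The following is a minimal sketch that uses only the flags listed above; the `run_one` function and the `DEEPGYM` override variable are hypothetical names introduced for illustration and are not part of deepgym.

```shell
# DEEPGYM is a hypothetical override variable (not a documented deepgym setting);
# it lets the wrapper be pointed at a stub when testing the script itself.
DEEPGYM="${DEEPGYM:-deepgym}"

# run_one TASK VERIFIER SOLUTION [TIMEOUT]
# Forwards its arguments to `deepgym run`; falls back to 30 when no timeout is given,
# matching the default shown in the flag table.
run_one() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_one TASK VERIFIER SOLUTION [TIMEOUT]" >&2
        return 2
    fi
    "$DEEPGYM" run \
        --task "$1" \
        --verifier "$2" \
        --solution "$3" \
        --timeout "${4:-30}"
}
```

A call such as `run_one "Write two_sum..." verifier.py solution.py 60` would then expand to the full command with a 60 second timeout.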
Run multiple solutions from a directory.

    deepgym run-batch \
      --task "Write two_sum..." \
      --verifier verifier.py \
      --solutions-dir ./solutions/ \
      --max-parallel 10 \
      --timeout 30

| Flag | Required | Default |
|---|---|---|
| `--solutions-dir` | Yes | |
| `--max-parallel` | No | 10 |
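
Before launching a batch, it can help to confirm that the solutions directory actually contains candidates. The sketch below assumes that solutions are `.py` files at the top level of the directory, which this page does not state. It also caps `--max-parallel` at the number of files found; whether deepgym already does this internally is unknown. `batch_run` and `DEEPGYM` are hypothetical names.

```shell
# DEEPGYM is a hypothetical override variable used to swap in a stub for testing.
DEEPGYM="${DEEPGYM:-deepgym}"

# batch_run TASK VERIFIER SOLUTIONS_DIR
# Assumption: solutions are *.py files directly inside SOLUTIONS_DIR.
# Note: POSIX sh has no local variables, so dir/count/parallel are global.
batch_run() {
    if [ "$#" -lt 3 ]; then
        echo "usage: batch_run TASK VERIFIER SOLUTIONS_DIR" >&2
        return 2
    fi
    dir=$3
    # Count candidate files; tr strips the padding some wc implementations add.
    count=$(find "$dir" -maxdepth 1 -type f -name '*.py' | wc -l | tr -d ' ')
    if [ "$count" -eq 0 ]; then
        echo "no .py files found in $dir" >&2
        return 1
    fi
    # Use at most 10 workers (the documented default), fewer for small batches.
    parallel=$count
    if [ "$parallel" -gt 10 ]; then
        parallel=10
    fi
    "$DEEPGYM" run-batch \
        --task "$1" \
        --verifier "$2" \
        --solutions-dir "$dir" \
        --max-parallel "$parallel" \
        --timeout 30
}
```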
Evaluate across a suite.

    deepgym eval \
      --suite medium \
      --solutions-dir ./solutions/ \
      --max-parallel 100

Suite names: `easy`, `medium`, `hard`, `all`, `coding`, `computer-use`, `tool-use`, or family names.
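
To compare several suites, `deepgym eval` can be run once per suite name. The sketch below loops over the names passed to it and saves each run's standard output to its own file. It assumes that results are written to standard output and that a non-zero exit status signals failure; neither is stated on this page. `eval_suites` and `DEEPGYM` are hypothetical names.

```shell
# DEEPGYM is a hypothetical override variable used to swap in a stub for testing.
DEEPGYM="${DEEPGYM:-deepgym}"

# eval_suites SOLUTIONS_DIR OUT_DIR SUITE...
# Runs `deepgym eval` once per suite and writes stdout to OUT_DIR/<suite>.out.
eval_suites() {
    if [ "$#" -lt 3 ]; then
        echo "usage: eval_suites SOLUTIONS_DIR OUT_DIR SUITE..." >&2
        return 2
    fi
    solutions_dir=$1
    out_dir=$2
    shift 2
    mkdir -p "$out_dir" || return 1
    for suite in "$@"; do
        # Stop at the first failing suite (assumes non-zero exit means failure).
        "$DEEPGYM" eval \
            --suite "$suite" \
            --solutions-dir "$solutions_dir" \
            --max-parallel 100 \
            > "$out_dir/$suite.out" || return 1
    done
}
```

For example, `eval_suites ./solutions/ ./results easy medium hard` would leave one output file per suite in `./results`.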
Create and register a new environment (outputs JSON).

    deepgym create \
      --name my_problem \
      --task "Implement binary search" \
      --verifier verifier.py \
      --difficulty medium \
      --domain coding \
      --tags search algorithm

Check a verifier for reward hacking vulnerabilities.

    deepgym audit \
      --task "Write coin_change..." \
      --verifier verifier.py \
      --benchmark my-benchmark \
      --verifier-id v1 \
      --strategies empty hardcoded trivial overflow pattern llm_attack \
      --persist \
      --db-path ~/.deepgym/exploits.db \
      --json

| Flag | Required | Default |
|---|---|---|
| `--strategies` | No | all |
| `--persist` | No | false |
| `--db-path` | No | ~/.deepgym/exploits.db |
| `--json` | No | false |

See Adversarial Testing for what each strategy does.
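
Because `--strategies` defaults to all strategies, a wrapper can pass the flag only when a subset is requested. The following sketch does that and always requests JSON output. `audit_json` and `DEEPGYM` are hypothetical names, and the structure of the JSON that `deepgym audit` emits is not described here, so the sketch does not parse it.

```shell
# DEEPGYM is a hypothetical override variable used to swap in a stub for testing.
DEEPGYM="${DEEPGYM:-deepgym}"

# audit_json TASK VERIFIER [STRATEGY...]
# With no strategies given, --strategies is omitted so the documented
# default (all) applies.
audit_json() {
    if [ "$#" -lt 2 ]; then
        echo "usage: audit_json TASK VERIFIER [STRATEGY...]" >&2
        return 2
    fi
    task=$1
    verifier=$2
    shift 2
    if [ "$#" -gt 0 ]; then
        # Remaining arguments are passed as space-separated strategy names.
        "$DEEPGYM" audit --task "$task" --verifier "$verifier" \
            --strategies "$@" --json
    else
        "$DEEPGYM" audit --task "$task" --verifier "$verifier" --json
    fi
}
```

For instance, `audit_json "Write coin_change..." verifier.py empty hardcoded` limits the audit to two strategies, while omitting the trailing names runs all of them.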
Audit benchmark splits for contamination.

    deepgym benchmark-audit \
      --env-dir ./environments/ \
      --benchmark mydata \
      --seed 0 \
      --public-eval-ratio 0.2 \
      --holdout-ratio 0.1 \
      --canary-ratio 0.05 \
      --json

Start the REST API server.

    # dev
    DEEPGYM_NO_AUTH=true deepgym serve \
      --host 127.0.0.1 \
      --port 8000 \
      --reload \
      --allow-local-exec

    # production
    DEEPGYM_API_KEY=your-key DAYTONA_API_KEY=your-key \
      deepgym serve --port 8000

| Flag | Required | Default |
|---|---|---|
| `--host` | No | 127.0.0.1 |
| `--port` | No | 8000 |
| `--reload` | No | false |
| `--allow-local-exec` | No | false |
| `--no-auth` | No | false |

See API Server for endpoint docs.
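
The production example sets both `DEEPGYM_API_KEY` and `DAYTONA_API_KEY`. Whether the server itself refuses to start without them is not stated here, so a launch script may want to check first. The sketch below is one way to do that; `serve_prod` and `DEEPGYM` are hypothetical names.

```shell
# DEEPGYM is a hypothetical override variable used to swap in a stub for testing.
DEEPGYM="${DEEPGYM:-deepgym}"

# serve_prod [PORT]
# Refuses to start unless both keys from the production example are set.
# The keys must also be exported so the server process can see them.
serve_prod() {
    if [ -z "${DEEPGYM_API_KEY:-}" ]; then
        echo "DEEPGYM_API_KEY is not set" >&2
        return 1
    fi
    if [ -z "${DAYTONA_API_KEY:-}" ]; then
        echo "DAYTONA_API_KEY is not set" >&2
        return 1
    fi
    # Default to port 8000, as in the flag table.
    "$DEEPGYM" serve --port "${1:-8000}"
}
```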
Start the web debugging UI.

    deepgym web --host 127.0.0.1 --port 8080 --reload --allow-local-exec