| 1 | +--- |
| 2 | +name: sa-benchmark |
| 3 | +description: Run the SA-Benchmark (Smart Account / OKX Pay ERC-4337) stress test using the "Old Method" — `./1-setup.sh` (TypeScript deploy + clone/build polycli) followed by `./2-bench.sh` (polycli loadtest erc4337). Use when the user asks to run SA-Benchmark, benchmark OKX Pay / Smart Account / ERC-4337, or run polycli erc4337 loadtest. The SA-Benchmark project lives outside this repo; its absolute path is stored in a per-skill config file and can be changed at any time. |
| 4 | +--- |
| 5 | + |
| 6 | +# SA-Benchmark — Smart Account / OKX Pay ERC-4337 loadtest |
| 7 | + |
| 8 | +This skill runs the **Old Method** from the SA-Benchmark project README: TypeScript contract deploy + initial UserOperation, followed by a Go `polycli loadtest erc4337` run. The SA-Benchmark repo is external to this toolkit, so its absolute path must be configured before the skill can run. |
| 9 | + |
| 10 | +See `<SA_BENCHMARK_DIR>/README.md` in the target repo for full background. The relevant scripts are `1-setup.sh` and `2-bench.sh` in that directory's root. |
| 11 | + |
| 12 | +## 0. Locate / configure the SA-Benchmark directory |
| 13 | + |
| 14 | +The SA-Benchmark directory is **not** inside this repo. The skill stores its absolute path in: |
| 15 | + |
| 16 | +``` |
| 17 | +<repo-root>/.claude/skills/sa-benchmark/config.env |
| 18 | +``` |
| 19 | + |
| 20 | +Contents (single line): |
| 21 | + |
| 22 | +``` |
| 23 | +SA_BENCHMARK_DIR=/absolute/path/to/SA-Benchmark |
| 24 | +``` |
| 25 | + |
| 26 | +### First-time initialization |
| 27 | + |
| 28 | +1. Resolve the config path: `CONFIG_FILE="$(git rev-parse --show-toplevel)/.claude/skills/sa-benchmark/config.env"`. |
| 29 | +2. If `CONFIG_FILE` does **not** exist, or exists but `SA_BENCHMARK_DIR` is empty / missing: |
| 30 | + - **Stop and prompt the user**. Ask for the absolute path to their local SA-Benchmark clone (e.g. `/Users/oker/go/bin/code/ethereum/SA-Benchmark`). Do not guess, do not `find`, do not auto-clone. |
| 31 | + - Once the user replies, validate that the path is absolute, exists, is a directory, and contains `1-setup.sh`, `2-bench.sh`, and `example.env`. If any check fails, report the exact problem and re-prompt. |
| 32 | + - Write the config file with `SA_BENCHMARK_DIR=<path>` and confirm. |
| 33 | +3. If `CONFIG_FILE` exists and `SA_BENCHMARK_DIR` points at a valid directory, proceed. |
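The validation and write steps above could be sketched roughly as follows. The function names `validate_sa_dir` and `write_sa_config` are hypothetical and not part of the SA-Benchmark project or this skill; the checks simply mirror the list above (absolute path, directory exists, three required files present).

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate a candidate SA-Benchmark path.
# Prints the first problem found and returns non-zero; prints nothing on success.
validate_sa_dir() {
  local dir="$1" f
  case "$dir" in
    /*) ;;                                            # absolute path, keep going
    *) echo "not an absolute path: $dir"; return 1 ;;
  esac
  [ -d "$dir" ] || { echo "not a directory: $dir"; return 1; }
  for f in 1-setup.sh 2-bench.sh example.env; do
    [ -f "$dir/$f" ] || { echo "missing file: $dir/$f"; return 1; }
  done
}

# Hypothetical helper: write the single-line config file after validation.
write_sa_config() {
  local config_file="$1" dir="$2"
  validate_sa_dir "$dir" || return 1
  mkdir -p "$(dirname "$config_file")"
  printf 'SA_BENCHMARK_DIR=%s\n' "$dir" > "$config_file"
}
```

A call would pass the resolved `CONFIG_FILE` and the user-supplied path, for example `write_sa_config "$CONFIG_FILE" "/absolute/path/to/SA-Benchmark"`. The same pair of functions would also cover the "change the directory later" case, since overwriting is just another write.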
| 34 | + |
| 35 | +### Changing the directory later |
| 36 | + |
| 37 | +The user can change the path at any time by saying things like "change SA-Benchmark dir to X", "point SA-Benchmark at /new/path", or "reconfigure SA-Benchmark". When that happens: |
| 38 | +1. Validate the new path (same checks as first-time init). |
| 39 | +2. Overwrite `config.env` with the new `SA_BENCHMARK_DIR=...`. |
| 40 | +3. Confirm the updated value back to the user. Do **not** delete or clean up the previous directory. |
| 41 | + |
| 42 | +Never edit the SA-Benchmark project's `.env` or scripts to change its location — the path is a Claude-side config only. |
| 43 | + |
| 44 | +## 1. Prepare `.env` in the SA-Benchmark directory |
| 45 | + |
| 46 | +All remaining steps run with `cd "$SA_BENCHMARK_DIR"`. |
| 47 | + |
| 48 | +1. If `.env` does not exist, copy it from the template: `cp example.env .env`. Do not overwrite an existing `.env`. |
| 49 | +2. Verify the required fields for the Old Method are populated in `.env`: |
| 50 | + - `PRIVATE_KEY` — funded L2 account (must not be empty / default) |
| 51 | + - `LOCAL_RPC_URL` — L2 RPC, typically `http://127.0.0.1:8123` when running against the local devnet |
| 52 | + - `CARD_ENABLE` — must be `false` (or unset). The Old Method is the ERC4337 + polycli path; `CARD_ENABLE=true` switches `1-setup.sh` / `2-bench.sh` to the Card Claim path, which is **not** what this skill runs. |
| 53 | + - `POLYCLI_REPO`, `POLYCLI_BRANCH` — used by `1-setup.sh` to clone and build polycli |
| 54 | + - `GAS_PRICE`, `CALLDATA_FILE`, `CALLDATA_SIZE`, `TOTAL_UOP`, `BATCH_SIZE`, `CONCURRENCY`, `RATE_LIMIT`, `CALLDATA_TYPE`, `VERIFIER_TYPE` — required by `1-setup.sh`'s pre-flight check; empty values will make the script exit. |
| 55 | +3. If any required field is missing, **stop and list them** for the user. Only print each variable's *name*, never its value — secrets like `PRIVATE_KEY` must not be echoed. |
| 56 | + |
| 57 | +Quick check (safe — prints only variable *names* whose values are empty): |
| 58 | + |
| 59 | +```bash |
| 60 | +cd "$SA_BENCHMARK_DIR" |
| 61 | +[ -f .env ] || cp example.env .env |
| 62 | +set -a; . ./.env; set +a |
| 63 | +for v in PRIVATE_KEY LOCAL_RPC_URL POLYCLI_REPO POLYCLI_BRANCH GAS_PRICE \ |
| 64 | + CALLDATA_FILE CALLDATA_SIZE TOTAL_UOP BATCH_SIZE CONCURRENCY \ |
| 65 | + RATE_LIMIT CALLDATA_TYPE VERIFIER_TYPE; do |
| 66 | + [ -z "${!v}" ] && echo "MISSING: $v" |
| 67 | +done |
| 68 | +[ "${CARD_ENABLE:-false}" = "true" ] && echo "WARNING: CARD_ENABLE=true switches to the Card path, not the Old Method" |
| 69 | +``` |
| 70 | + |
| 71 | +If the user wants the benchmark to target this toolkit's local devnet, `LOCAL_RPC_URL` should be `http://127.0.0.1:8123` and the devnet must already be up (see the `devnet` skill). Do not start the devnet from inside this skill — if the user wants both, suggest running the `devnet` skill first. |
| 72 | + |
| 73 | +## 2. Setup — `./1-setup.sh` |
| 74 | + |
| 75 | +```bash |
| 76 | +cd "$SA_BENCHMARK_DIR" |
| 77 | +./1-setup.sh |
| 78 | +``` |
| 79 | + |
| 80 | +What it does (Old Method branch, i.e. `CARD_ENABLE=false`): |
| 81 | +- Clones or updates `POLYCLI_REPO` at `POLYCLI_BRANCH`, then runs `make install` so the `polycli` binary lands in `~/go/bin`. |
| 82 | +- Runs `yarn` to install Node dependencies. |
| 83 | +- Runs `yarn run deploy` to deploy / attach to the Smart Account + OKX Pay contracts. |
| 84 | +- Runs `yarn run senduop:local` to send the initial UserOperation that counterfactually deploys sender accounts and funds bundlers. |
| 85 | + |
| 86 | +Notes: |
| 87 | +- The setup can take several minutes (Go build + Node install + on-chain deploys). Stream output; don't background it. |
| 88 | +- If `1-setup.sh` fails, surface the failing step and **stop** — do not proceed to `2-bench.sh`. Re-running setup after the user fixes the underlying issue is safe. |
| 89 | +- The script populates contract addresses (`ENTRYPOINT`, `ACCOUNT_FACTORY`, `PAY`, `TEST_ERC20`, `WEBAUTHN_VALIDATOR`, …) in `.env`; `2-bench.sh` reads them. If any of those are empty after setup, setup did not finish — investigate before benching. |
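One way to check that setup populated the contract addresses, without echoing any value, is sketched below. `empty_env_vars` is a hypothetical helper, and it assumes `.env` uses plain `KEY=value` lines (no `export` prefix); it only reports the names of variables that are empty or absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the names of variables that are empty or absent
# in an env file. Only names are printed, never values.
# Assumes plain KEY=value lines (no "export" prefix).
empty_env_vars() {
  local env_file="$1" name value
  shift
  for name in "$@"; do
    # Take the last assignment of the name, keep what follows the first '=',
    # and drop quotes/whitespace so FOO="" counts as empty.
    # "|| true" keeps a no-match grep from aborting under errexit/pipefail.
    value="$(grep -E "^${name}=" "$env_file" | tail -n 1 | cut -d= -f2- | tr -d "\"' \r" || true)"
    [ -n "$value" ] || echo "$name"
  done
}
```

For example, `empty_env_vars "$SA_BENCHMARK_DIR/.env" ENTRYPOINT ACCOUNT_FACTORY PAY TEST_ERC20 WEBAUTHN_VALIDATOR` prints nothing when all five listed addresses are set; any name it prints means setup likely did not finish.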
| 90 | + |
| 91 | +## 3. Benchmark — `./2-bench.sh` |
| 92 | + |
| 93 | +```bash |
| 94 | +cd "$SA_BENCHMARK_DIR" |
| 95 | +./2-bench.sh |
| 96 | +``` |
| 97 | + |
| 98 | +Behavior (Old Method branch): |
| 99 | +- Prints the test parameters (`TOTAL_UOP`, `BATCH_SIZE`, `CONCURRENCY`, `RATE_LIMIT`, `CALLDATA_TYPE`). |
| 100 | +- Writes raw output to `result_<YYYYMMDD_HHMM>.out` in the SA-Benchmark directory. |
| 101 | +- Runs `polycli loadtest erc4337` with the contract addresses set up in step 2. |
| 102 | +- At the end, strips ANSI escapes, greps the last `tps=...` line, and prints `Final TPS: <value>`. |
| 103 | + |
| 104 | +Heads-up to the user (optional, only mention if the benchmark is expected to run long): |
| 105 | +- Perf profile capture: `curl 'http://localhost:6060/debug/pprof/profile?seconds=120' > prof_<timestamp>.bin` (the URL is quoted so the shell does not treat `?` as a glob character) |
| 106 | +- Sequencer log monitor: `docker logs op-reth-seq --tail 10 -f 2>&1 | grep TotalDuration-batch` |
| 107 | + |
| 108 | +## 4. Reporting |
| 109 | + |
| 110 | +After `2-bench.sh` exits: |
| 111 | +1. Identify the newest `result_*.out` in `$SA_BENCHMARK_DIR` (the one the script just wrote). |
| 112 | +2. Extract the final TPS. Prefer the `Final TPS: <value>` line appended at the end of the file; fall back to the last `tps=<value>` match. |
| 113 | +3. Report concisely to the user, e.g.: |
| 114 | + > SA-Benchmark finished. Final TPS: `<value>`. Full log: `<SA_BENCHMARK_DIR>/result_<timestamp>.out`. |
| 115 | + |
| 116 | +If `Final TPS` is missing or empty, the loadtest did not complete cleanly — do not claim success. Summarize the tail of the result file (last ~20 lines) and any error lines you can see. |
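The reporting steps above could be sketched as below. Both function names are hypothetical, and the sketch assumes the TPS value runs up to the next whitespace after `TPS: ` or `tps=`; the exact log format is defined by `2-bench.sh` and polycli, so treat the patterns as a starting point.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the final TPS from a result file, or return
# non-zero if none is found. Assumes the value follows "TPS: " (which also
# matches "Final TPS: ") or "tps=" and ends at the next whitespace.
final_tps() {
  local file="$1" clean value
  # Strip ANSI CSI escapes (colours etc.) so they cannot pollute a match.
  clean="$(sed $'s/\x1b\\[[0-9;]*[A-Za-z]//g' "$file")"
  # Preferred: the summary line.
  value="$(printf '%s\n' "$clean" | grep -oE 'TPS: *[^[:space:]]+' | tail -n 1 | sed 's/^TPS: *//' || true)"
  # Fallback: the last tps=<value> occurrence.
  if [ -z "$value" ]; then
    value="$(printf '%s\n' "$clean" | grep -oE 'tps=[^[:space:]]+' | tail -n 1 | cut -d= -f2- || true)"
  fi
  [ -n "$value" ] && printf '%s\n' "$value"
}

# Hypothetical helper: newest result file by modification time.
# Prints nothing if no result_*.out file exists (*.out.bak is not matched).
newest_result() {
  ls -t "$1"/result_*.out 2>/dev/null | head -n 1
}
```

A typical call would be `final_tps "$(newest_result "$SA_BENCHMARK_DIR")"`. A non-zero exit corresponds to the "did not complete cleanly" case, where `tail -n 20` on the same file gives the lines to summarize.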
| 117 | + |
| 118 | +## 5. Things to avoid |
| 119 | + |
| 120 | +- **Never** print, log, or commit `PRIVATE_KEY`, derived wallets, bundler private keys, or any value from `.env`. Reference fields by name only. |
| 121 | +- Never clone the SA-Benchmark repo automatically. If the configured path is missing, prompt the user. |
| 122 | +- Never run `./loadtest.sh` or `yarn deploy` directly from this skill — that is the "New Method" and is explicitly out of scope here. |
| 123 | +- Never set `CARD_ENABLE=true` or run the card path; that is a different benchmark and not what this skill targets. |
| 124 | +- Never run `2-bench.sh` before `1-setup.sh` succeeds; the contract addresses it reads from `.env` won't be populated. |
| 125 | +- Do not delete or truncate previous `result_*.out` / `result_*.out.bak` files unless the user asks. |
| 126 | +- Do not edit the SA-Benchmark project's files to hard-code paths or secrets — the directory path lives only in this skill's `config.env`. |