Merge branch 'wesley/claude-skills' into 'main'

KamiD · KamiD · commit 4e0d20c91ff1 · 2026-05-13T08:03:42.000Z
feat: benchmark and devnet skills

See merge request github/xlayer-toolkit!3
diff --git a/.claude/skills/adventure/SKILL.md b/.claude/skills/adventure/SKILL.md
@@ -0,0 +1,132 @@
+---
+name: adventure
+description: Run the X Layer "adventure" benchmark tool (ERC20 / native / io / fib / create stress tests on an Optimism L2 using 20k accounts). Use when the user asks to run adventure, benchmark X Layer, or run erc20-bench / native-bench / io-bench / fib-bench / create-bench.
+---
+
+# Adventure — X Layer Benchmark Tool
+
+Adventure is a stress-testing tool for X Layer (Optimism L2). It uses ~20,000 accounts to drive concurrent ERC20 / native / io / fib / create workloads against a node and measure performance and stability.
+
+## 0. Adventure directory
+
+The adventure project directory defaults to `tools/adventure` under the current repository root (i.e. `<repo-root>/tools/adventure`). No configuration file is needed — resolve the repo root with `git rev-parse --show-toplevel` and use `$REPO_ROOT/tools/adventure` as the working directory for all commands below.
+
+## 1. Commands
+
+All commands must be executed from the adventure directory (`cd "$(git rev-parse --show-toplevel)/tools/adventure" && ...`).
+
+### ERC20 stress test
+
+```bash
+make erc20
+# equivalent to:
+adventure erc20-init 10000ETH -f ./testdata/config.json
+adventure erc20-bench -f ./testdata/config.json --contract 0xContractAddress
+```
+
+### Native token stress test
+
+```bash
+make native
+# equivalent to:
+adventure native-init 10000ETH -f ./testdata/config.json
+adventure native-bench -f ./testdata/config.json
+```
+
+### IO bench
+
+```bash
+make io
+# equivalent to:
+adventure simulator-init 10000ETH -f ./testdata/config.json
+adventure io-bench -f ./testdata/config.json
+```
+
+### Fib bench
+
+```bash
+make fib
+# equivalent to:
+adventure simulator-init 10000ETH -f ./testdata/config.json
+adventure fib-bench -f ./testdata/config.json
+```
+
+### Create bench
+
+```bash
+make create
+# equivalent to:
+adventure native-init 10000ETH -f ./testdata/config.json
+adventure create-bench -f ./testdata/config.json
+```
+
+### Preconditions to remind the user about
+
+- RPC node reachable (default `http://127.0.0.1:8123`).
+- `testdata/config.json` exists and is tuned for the run.
+- `testdata/accounts/accounts-<N>.txt` is present, where `<N>` matches `accounts` in the config (`accounts-20000.txt` for the default 20,000 accounts).
+- `senderPrivateKey` in `testdata/config.json` has enough balance to fund init.
+
+## 2. Configuration file (`testdata/config.json`)
+
+Editable parameters:
+
+```json
+{
+  "rpc": ["http://127.0.0.1:8123"],
+  "accounts": 20000,
+  "senderPrivateKey": "0x...",
+  "concurrency": 20,
+  "mempoolPauseThreshold": 50000,
+  "targetTPS": 0,
+  "maxBatchSize": 100,
+  "gasPriceGwei": 100,
+  "saveTxHashes": false,
+  "simulatorParams": {
+    "simulatorConfig": {
+      "load_accounts": 12,
+      "update_accounts": 5,
+      "create_accounts": 0,
+      "load_storage": 49,
+      "update_storage": 9,
+      "delete_storage": 0,
+      "create_storage": 2,
+      "fib": 100
+    }
+  }
+}
+```
+
+### Top-level fields
+
+- `rpc` — RPC endpoint URLs (list; transactions are distributed across them).
+- `accounts` — number of benchmark accounts used as both senders and receivers. On first run, `testdata/accounts/accounts-<N>.txt` is auto-generated with this many private keys and reused on subsequent runs.
+- `senderPrivateKey` — funds contract deployment and token distribution during `*-init` commands.
+- `concurrency` — number of concurrent senders.
+- `mempoolPauseThreshold` — pause sending when mempool exceeds this size.
+- `targetTPS` — target transactions per second; `0` means no rate limit.
+- `maxBatchSize` — maximum transactions per batch (default `100`).
+- `gasPriceGwei` — gas price in Gwei.
+- `saveTxHashes` — when true, writes tx hashes to `./txhashes.log`.
+
+### `simulatorParams.simulatorConfig` (used by `io-bench` / `fib-bench` / `create-bench`)
+
+Parameters passed to the on-chain simulator contract to shape each transaction's workload:
+
+- `load_accounts` — account loads per tx (reads of existing accounts).
+- `update_accounts` — account updates per tx.
+- `create_accounts` — new accounts created per tx.
+- `load_storage` — storage slot reads per tx.
+- `update_storage` — storage slot updates per tx.
+- `delete_storage` — storage slot deletions per tx.
+- `create_storage` — new storage slots written per tx.
+- `fib` — Fibonacci iteration count; controls CPU/compute load per tx (used by `fib-bench`).
+
+Treat `senderPrivateKey` as sensitive: never echo or commit it, mask when showing values from the config, and never send it off-machine.
+
+## 3. Behavior
+
+- If the user's request is ambiguous ("run adventure"), ask which workload (erc20 / native / io / fib / create) and whether they want the full `make <target>` flow or a specific step.
+- Before running any `make` target, `cd` into `<repo-root>/tools/adventure`.
+- Do not modify `testdata/config.json` unless the user asks for a specific change.
+- Stream output so the user can watch benchmark progress.
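Note on the adventure skill's `senderPrivateKey` rule: masking the key before showing values from `testdata/config.json` could look roughly like the sketch below. `mask_sender_key` is a hypothetical helper name, not part of adventure, and the sketch assumes the key and its value sit on a single line, as in the sample config.

```bash
# Hypothetical helper (not part of adventure): print a config file with the
# senderPrivateKey value replaced by ***. Assumes "senderPrivateKey": "<value>"
# appears on one line, as in the sample config.
mask_sender_key() {
  sed -E 's/("senderPrivateKey"[[:space:]]*:[[:space:]]*")[^"]*"/\1***"/' "$1"
}
```

Usage: `mask_sender_key testdata/config.json`. The file itself is not modified; only the printed copy is masked.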
diff --git a/.claude/skills/devnet/SKILL.md b/.claude/skills/devnet/SKILL.md
@@ -0,0 +1,124 @@
+---
+name: devnet
+description: Bring up the X Layer devnet (OP Stack local test environment) end to end — verify repo paths in example.env, build images via devnet/init.sh, start services with ./0-all.sh, then confirm the sequencer is producing blocks on port 8123. Use when the user asks to start / run / deploy / bring up the devnet, or to restart / verify the local OP Stack environment.
+---
+
+# Devnet — X Layer local OP Stack bring-up
+
+This skill brings up the local OP Stack devnet in `<repo-root>/devnet`, confirms repo paths are wired up in `example.env`, builds images, starts services, and verifies the L2 sequencer is producing blocks.
+
+See `devnet/README.md` for full background. The flow here is the "first-time setup then run" path, not `make run`.
+
+## 0. Working directory
+
+All commands run from `<repo-root>/devnet`. Resolve the repo root with `git rev-parse --show-toplevel` and `cd "$(git rev-parse --show-toplevel)/devnet"` before each step (or run once at the start). Never edit `.env` directly; it is generated from `example.env` (`./clean.sh` syncs it).
+
+## 1. Check repo paths in `example.env`
+
+Before building, verify that the *local repo directories* required for local image builds are configured. If `SKIP_*_BUILD=true` for a component, its path is not needed; if `SKIP_*_BUILD=false` the corresponding `*_LOCAL_DIRECTORY` must be an absolute path to a cloned repo.
+
+Components and their variables in `example.env`:
+
+| Component | Skip flag | Local dir variable | Upstream repo |
+|-----------|-----------|---------------------|----------------|
+| OP Stack | `SKIP_OP_STACK_BUILD` | `OP_STACK_LOCAL_DIRECTORY` | https://github.com/okx/optimism |
+| OP Contracts | `SKIP_OP_CONTRACTS_BUILD` | (uses `OP_STACK_LOCAL_DIRECTORY`) | same as above |
+| OP Geth | `SKIP_OP_GETH_BUILD` | `OP_GETH_LOCAL_DIRECTORY` | https://github.com/okx/op-geth |
+| OP Reth | `SKIP_OP_RETH_BUILD` | `OP_RETH_LOCAL_DIRECTORY` | https://github.com/okx/reth |
+| OP Succinct | `SKIP_OP_SUCCINCT_BUILD` | `OP_SUCCINCT_LOCAL_DIRECTORY` | (optional, only if `PROOF_ENGINE=op-succinct`) |
+| Kailua | `SKIP_KAILUA_BUILD` | `KAILUA_LOCAL_DIRECTORY` | (optional, only if `PROOF_ENGINE=kailua`) |
+
+Recommended check (run from `<repo-root>/devnet`):
+
+```bash
+grep -E '^(SKIP_[A-Z_]+_BUILD|[A-Z_]+_LOCAL_DIRECTORY)=' example.env
+```
+
+For every `SKIP_X_BUILD=false`, confirm the matching `X_LOCAL_DIRECTORY` is a non-empty absolute path that exists on disk. If any required path is missing or empty, **stop and prompt the user** with a clear list of which variables need to be filled in, e.g.:
+
+> `OP_RETH_LOCAL_DIRECTORY` is empty but `SKIP_OP_RETH_BUILD=false`. Please set it to the absolute path of your local `okx/reth` clone in `example.env`, then re-run this skill.
+
+Do not attempt to guess paths or clone repositories automatically.
+
+Also confirm the sequencer / RPC mode combination matches the user's intent (e.g. `SEQ_TYPE=reth` / `RPC_TYPE=reth` vs `geth`). If unclear, ask.
+
+## 2. Build images — `./init.sh`
+
+Once `example.env` is correct:
+
+```bash
+cd "$(git rev-parse --show-toplevel)/devnet"
+./clean.sh        # sync example.env -> .env and stop any stale containers
+./init.sh         # build Docker images for all components with SKIP_*_BUILD=false
+```
+
+Notes:
+- `init.sh` can take a long time (tens of minutes) depending on which components are being built locally. Run in the foreground and stream output so the user can see progress.
+- If `init.sh` fails, surface the failing step (usually a Docker build) and stop — do not proceed to start services.
+- Re-run `init.sh` only when code changes require rebuilt images.
+
+## 3. Start services — `./0-all.sh`
+
+```bash
+cd "$(git rev-parse --show-toplevel)/devnet"
+./0-all.sh
+```
+
+This chains `1-start-l1.sh` → `2-deploy-op-contracts.sh` → `3-op-init.sh` → `4-op-start-service.sh` (and `5-*` / `6-*` if the corresponding proof engines are enabled). Expect it to take several minutes; watch for errors in contract deployment and op-geth/op-reth init.
+
+## 4. Verify block production on port 8123
+
+After `0-all.sh` exits successfully, confirm the L2 sequencer RPC is up and blocks are advancing. Port `8123` is the `op-geth-seq` / sequencer RPC (see the Service Ports table in `devnet/README.md`).
+
+### 4.1 Port is listening
+
+```bash
+# macOS
+lsof -iTCP:8123 -sTCP:LISTEN -n -P
+# or
+nc -z localhost 8123 && echo "8123 open" || echo "8123 closed"
+```
+
+If the port is not listening, inspect container state:
+
+```bash
+cd "$(git rev-parse --show-toplevel)/devnet"
+docker compose ps
+docker compose logs --tail=200 op-reth-seq   # use op-geth-seq when SEQ_TYPE=geth
+```
+
+Report the failing container and stop — do not claim success.
+
+### 4.2 Block number is increasing
+
+Query `eth_blockNumber` several times with a short gap and confirm the value strictly increases. Use `curl` (no extra tooling required):
+
+```bash
+for i in 1 2 3 4 5; do
+  curl -s -X POST -H 'Content-Type: application/json' \
+    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
+    http://localhost:8123 \
+    | sed -E 's/.*"result":"(0x[0-9a-fA-F]+)".*/\1/'
+  sleep 2
+done
+```
+
+Pass criteria:
+- Every response is a valid `0x...` hex block number (not an error object).
+- The decimal value never decreases and increases at least once across the samples (~10 seconds).
+
+If the value does not change, the sequencer is not producing blocks. Check the sequencer logs (`docker compose logs op-reth-seq` or `docker compose logs op-geth-seq`, matching `SEQ_TYPE`) and `docker compose logs op-node` (or `op-seq`), report the symptom, and stop. Do not report the devnet as healthy.
+
+### 4.3 Reporting
+
+When all three checks pass (port listening, valid block numbers, strictly increasing), report concisely:
+
+> Devnet is up. Port 8123 listening. Block number advanced from `<n0>` to `<nN>` over `<seconds>`s.
+
+## 5. Things to avoid
+
+- Do not edit `.env` directly — always edit `example.env`, then `./clean.sh`.
+- Do not run `./init.sh` repeatedly "just in case" — it is slow and destructive to image caches.
+- Do not skip the block-number verification and assume success from port 8123 being open; a stuck sequencer will still hold the port.
+- Do not invent repo paths or auto-clone missing repositories — prompt the user instead.
+- Do not print or commit any private keys or funded test-account secrets from `devnet/` configs.
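Note on the devnet skill's section 4.2: the pass criteria could be checked mechanically rather than by eye. The sketch below is one possible reading, not the skill's own logic: `blocks_advanced` is a hypothetical helper that takes the sampled `eth_blockNumber` results as arguments and treats "advancing" as "no sample lower than the previous one, and the last sample higher than the first". That interpretation is an assumption.

```bash
# Hypothetical helper: succeed only if every argument is a 0x-prefixed hex
# number, no sample is lower than the previous one, and the last sample is
# higher than the first.
blocks_advanced() {
  local first="" prev="" cur s
  [ "$#" -ge 2 ] || return 1          # need at least two samples to compare
  for s in "$@"; do
    case "$s" in
      0x) return 1 ;;                 # prefix with no digits
      0x*) case "${s#0x}" in *[!0-9a-fA-F]*) return 1 ;; esac ;;
      *) return 1 ;;                  # error object or other non-hex output
    esac
    cur=$(($s))                       # shell arithmetic accepts 0x constants
    if [ -z "$first" ]; then first=$cur; fi
    if [ -n "$prev" ] && [ "$cur" -lt "$prev" ]; then return 1; fi
    prev=$cur
  done
  [ "$prev" -gt "$first" ]
}
```

Usage: collect the five values printed by the `curl` loop and run `blocks_advanced 0x1a 0x1b 0x1b 0x1c 0x1d && echo "advancing"`. A repeated value between two samples is tolerated, since two requests can land inside the same block interval.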
diff --git a/.claude/skills/sa-benchmark/.gitignore b/.claude/skills/sa-benchmark/.gitignore
@@ -0,0 +1 @@
+config.env
diff --git a/.claude/skills/sa-benchmark/SKILL.md b/.claude/skills/sa-benchmark/SKILL.md
@@ -0,0 +1,126 @@
+---
+name: sa-benchmark
+description: Run the SA-Benchmark (Smart Account / OKX Pay ERC-4337) stress test using the "Old Method" — `./1-setup.sh` (TypeScript deploy + clone/build polycli) followed by `./2-bench.sh` (polycli loadtest erc4337). Use when the user asks to run SA-Benchmark, benchmark OKX Pay / Smart Account / ERC-4337, or run polycli erc4337 loadtest. The SA-Benchmark project lives outside this repo; its absolute path is stored in a per-skill config file and can be changed at any time.
+---
+
+# SA-Benchmark — Smart Account / OKX Pay ERC-4337 loadtest
+
+This skill runs the **Old Method** from the SA-Benchmark project README: TypeScript contract deploy + initial UserOperation, followed by a Go `polycli loadtest erc4337` run. The SA-Benchmark repo is external to this toolkit, so its absolute path must be configured before the skill can run.
+
+See `<SA_BENCHMARK_DIR>/README.md` in the target repo for full background. The relevant scripts are `1-setup.sh` and `2-bench.sh` in that directory's root.
+
+## 0. Locate / configure the SA-Benchmark directory
+
+The SA-Benchmark directory is **not** inside this repo. The skill stores its absolute path in:
+
+```
+<repo-root>/.claude/skills/sa-benchmark/config.env
+```
+
+Contents (single line):
+
+```
+SA_BENCHMARK_DIR=/absolute/path/to/SA-Benchmark
+```
+
+### First-time initialization
+
+1. Resolve the config path: `CONFIG_FILE="$(git rev-parse --show-toplevel)/.claude/skills/sa-benchmark/config.env"`.
+2. If `CONFIG_FILE` does **not** exist, or exists but `SA_BENCHMARK_DIR` is empty / missing:
+   - **Stop and prompt the user**. Ask for the absolute path to their local SA-Benchmark clone (e.g. `/Users/oker/go/bin/code/ethereum/SA-Benchmark`). Do not guess, do not `find`, do not auto-clone.
+   - Once the user replies, validate that the path is absolute, exists, is a directory, and contains `1-setup.sh`, `2-bench.sh`, and `example.env`. If any check fails, report the exact problem and re-prompt.
+   - Write the config file with `SA_BENCHMARK_DIR=<path>` and confirm.
+3. If `CONFIG_FILE` exists and `SA_BENCHMARK_DIR` points at a valid directory, proceed.
+
+### Changing the directory later
+
+The user can change the path at any time by saying things like "change SA-Benchmark dir to X", "point SA-Benchmark at /new/path", or "reconfigure SA-Benchmark". When that happens:
+1. Validate the new path (same checks as first-time init).
+2. Overwrite `config.env` with the new `SA_BENCHMARK_DIR=...`.
+3. Confirm the updated value back to the user. Do **not** delete or clean up the previous directory.
+
+Never edit the SA-Benchmark project's `.env` or scripts to change its location — the path is a Claude-side config only.
+
+## 1. Prepare `.env` in the SA-Benchmark directory
+
+All remaining steps run with `cd "$SA_BENCHMARK_DIR"`.
+
+1. If `.env` does not exist, copy it from the template: `cp example.env .env`. Do not overwrite an existing `.env`.
+2. Verify the required fields for the Old Method are populated in `.env`:
+   - `PRIVATE_KEY` — funded L2 account (must not be empty / default)
+   - `LOCAL_RPC_URL` — L2 RPC, typically `http://127.0.0.1:8123` when running against the local devnet
+   - `CARD_ENABLE` — must be `false` (or unset). The Old Method is the ERC4337 + polycli path; `CARD_ENABLE=true` switches `1-setup.sh` / `2-bench.sh` to the Card Claim path, which is **not** what this skill runs.
+   - `POLYCLI_REPO`, `POLYCLI_BRANCH` — used by `1-setup.sh` to clone and build polycli
+   - `GAS_PRICE`, `CALLDATA_FILE`, `CALLDATA_SIZE`, `TOTAL_UOP`, `BATCH_SIZE`, `CONCURRENCY`, `RATE_LIMIT`, `CALLDATA_TYPE`, `VERIFIER_TYPE` — required by `1-setup.sh`'s pre-flight check; empty values will make the script exit.
+3. If any required field is missing, **stop and list them** for the user. Only print each variable's *name*, never its value — secrets like `PRIVATE_KEY` must not be echoed.
+
+Quick check (safe — prints only variable *names* whose values are empty):
+
+```bash
+cd "$SA_BENCHMARK_DIR"
+[ -f .env ] || cp example.env .env
+set -a; . ./.env; set +a
+for v in PRIVATE_KEY LOCAL_RPC_URL POLYCLI_REPO POLYCLI_BRANCH GAS_PRICE \
+         CALLDATA_FILE CALLDATA_SIZE TOTAL_UOP BATCH_SIZE CONCURRENCY \
+         RATE_LIMIT CALLDATA_TYPE VERIFIER_TYPE; do
+  [ -z "${!v}" ] && echo "MISSING: $v"
+done
+[ "${CARD_ENABLE:-false}" = "true" ] && echo "WARNING: CARD_ENABLE=true switches to the Card path, not the Old Method"
+```
+
+If the user wants the benchmark to target this toolkit's local devnet, `LOCAL_RPC_URL` should be `http://127.0.0.1:8123` and the devnet must already be up (see the `devnet` skill). Do not start the devnet from inside this skill — if the user wants both, suggest running the `devnet` skill first.
+
+## 2. Setup — `./1-setup.sh`
+
+```bash
+cd "$SA_BENCHMARK_DIR"
+./1-setup.sh
+```
+
+What it does (Old Method branch, i.e. `CARD_ENABLE=false`):
+- Clones or updates `POLYCLI_REPO` at `POLYCLI_BRANCH`, then runs `make install` so the `polycli` binary lands in `~/go/bin`.
+- `yarn` to install Node dependencies.
+- `yarn run deploy` to deploy / attach to the Smart Account + OKX Pay contracts.
+- `yarn run senduop:local` to send the initial UserOperation that counterfactually deploys sender accounts and funds bundlers.
+
+Notes:
+- The setup can take several minutes (Go build + Node install + on-chain deploys). Stream output; don't background it.
+- If `1-setup.sh` fails, surface the failing step and **stop** — do not proceed to `2-bench.sh`. Re-running setup after the user fixes the underlying issue is safe.
+- The script populates contract addresses (`ENTRYPOINT`, `ACCOUNT_FACTORY`, `PAY`, `TEST_ERC20`, `WEBAUTHN_VALIDATOR`, …) in `.env`; `2-bench.sh` reads them. If any of those are empty after setup, setup did not finish — investigate before benching.
+
+## 3. Benchmark — `./2-bench.sh`
+
+```bash
+cd "$SA_BENCHMARK_DIR"
+./2-bench.sh
+```
+
+Behavior (Old Method branch):
+- Prints the test parameters (TOTAL_UOP, BATCH_SIZE, CONCURRENCY, RATE_LIMIT, CALLDATA_TYPE).
+- Writes raw output to `result_<YYYYMMDD_HHMM>.out` in the SA-Benchmark directory.
+- Runs `polycli loadtest erc4337` with the contract addresses set up in step 2.
+- At the end, strips ANSI escapes, greps the last `tps=...` line, and prints `Final TPS: <value>`.
+
+Heads-up to the user (optional, only mention if the benchmark is expected to run long):
+- Perf profile capture: `curl 'http://localhost:6060/debug/pprof/profile?seconds=120' > prof_<timestamp>.bin` (the URL is quoted so shells such as zsh do not treat `?` as a glob)
+- Sequencer log monitor: `docker logs op-reth-seq --tail 10 -f 2>&1 | grep TotalDuration-batch`
+
+## 4. Reporting
+
+After `2-bench.sh` exits:
+1. Identify the newest `result_*.out` in `$SA_BENCHMARK_DIR` (the one the script just wrote).
+2. Extract the final TPS. Prefer the line `TPS: <value>` appended at the end of the file; fall back to the last `tps=<value>` match.
+3. Report concisely to the user, e.g.:
+   > SA-Benchmark finished. Final TPS: `<value>`. Full log: `<SA_BENCHMARK_DIR>/result_<timestamp>.out`.
+
+If `Final TPS` is missing or empty, the loadtest did not complete cleanly — do not claim success. Summarize the tail of the result file (last ~20 lines) and any error lines you can see.
+
+## 5. Things to avoid
+
+- **Never** print, log, or commit `PRIVATE_KEY`, derived wallets, bundler private keys, or any value from `.env`. Reference fields by name only.
+- Never clone the SA-Benchmark repo automatically. If the configured path is missing, prompt the user.
+- Never run `./loadtest.sh` or `yarn deploy` directly from this skill — that is the "New Method" and is explicitly out of scope here.
+- Never set `CARD_ENABLE=true` or run the card path; that is a different benchmark and not what this skill targets.
+- Never run `2-bench.sh` before `1-setup.sh` succeeds; the contract addresses it reads from `.env` won't be populated.
+- Do not delete or truncate previous `result_*.out` / `result_*.out.bak` files unless the user asks.
+- Do not edit the SA-Benchmark project's files to hard-code paths or secrets — the directory path lives only in this skill's `config.env`.
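Note on the sa-benchmark skill's section 4: the fallback of taking the last `tps=<value>` match from a result file could be sketched as below. `last_tps` is a hypothetical helper, and the sketch assumes the value after `tps=` is a plain decimal number; the actual polycli log format is not confirmed by this diff.

```bash
# Hypothetical helper: strip ANSI colour escapes from a result file and print
# the value of the last tps=<number> occurrence (prints nothing if none).
# Assumes the value is a plain decimal number.
last_tps() {
  local esc
  esc=$(printf '\033')                # literal ESC character, portable across sed variants
  sed -E "s/${esc}\[[0-9;]*[A-Za-z]//g" "$1" \
    | grep -oE 'tps=[0-9.]+' \
    | tail -n 1 \
    | cut -d= -f2
}
```

Usage: `last_tps "$SA_BENCHMARK_DIR/result_<timestamp>.out"`. Empty output means no `tps=` line was found, which section 4 treats as an incomplete run.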