Yiwei Mao edited this page Jul 14, 2025 · 10 revisions

Refer to the Docker setup guide for instructions on installing Docker on your machine.

Evaluation Quick Start

Create a new empty folder and add two files to it:

./config.json5
./docker-compose.yml
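
As a minimal sketch, the folder and both files could be created from a shell as follows. The folder name `web-bench-eval` is only an illustrative assumption; any empty folder works. Creating config.json5 before starting the container matters because Docker typically creates a directory at a bind-mount source path that does not exist yet.

```shell
# Hypothetical folder name; pick any empty folder you like.
mkdir -p web-bench-eval

# Create both files up front so the bind mount points at a real file.
touch web-bench-eval/config.json5 web-bench-eval/docker-compose.yml
```

Change into this folder before running any of the later commands, since the relative paths in docker-compose.yml resolve from there.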

For config.json5, copy the JSON below and edit it according to Config Parameters:

{
  "models": ["claude-3-5-sonnet-20241022", "openai/gpt-4o"],
}

For docker-compose.yml, copy the YAML below and set the environment variables:

services:
  web-bench:
    image: maoyiweiebay777/web-bench:latest
    volumes:
      # touch ./config.json5
      - ./config.json5:/app/apps/eval/src/config.json5
      - ./report:/app/apps/eval/report
    environment:
      # Add environment variables according to apps/src/model.json
      - OPENROUTER_API_KEY=your_api_key

Run docker-compose:

docker compose up

The evaluation report will be generated under ./report/.
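
To confirm that a run produced output, a small check like the one below may help. It is only a sketch: `report_status` is a hypothetical helper, and since the names of the generated report files are not documented here, it only tests whether the folder contains anything.

```shell
# Hypothetical helper: prints "found" if the folder exists and is
# non-empty, otherwise "empty". Defaults to ./report.
report_status() {
  dir="${1:-./report}"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "found"
  else
    echo "empty"
  fi
}

# Check the default report folder mounted in docker-compose.yml.
report_status ./report
```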

Build

docker build -f ./start.dockerfile -t web-bench .

Usage

docker run -v "$(pwd)/apps/eval/src/config.json5:/app/apps/eval/src/config.json5" -t web-bench

Note

The current mode only supports evaluation, not development.
