Merged
76 commits
7110164
init files
bradleyshep Oct 24, 2025
ef61f28
Update llm-benchmark-details.json
bradleyshep Oct 24, 2025
a200254
llm benchmarks (moved from private)
bradleyshep Oct 24, 2025
58a5a08
remove dotenvy
bradleyshep Oct 24, 2025
af45a20
ignore registry
bradleyshep Oct 24, 2025
961dd1c
summary updates; command
bradleyshep Oct 25, 2025
8e6624f
Merge branch 'LLM-benchmarks' into bradley/llm-benchmark
bradleyshep Oct 25, 2025
b59650f
develop updates
bradleyshep Oct 25, 2025
38596ba
DEVELOP + registry ignored
bradleyshep Oct 25, 2025
7d69779
change generated registry to use relative paths + include in git
bradleyshep Oct 25, 2025
3aa051b
attempt fix to pass
bradleyshep Oct 25, 2025
e443251
DEVELOP updates; clippy fixes?
bradleyshep Oct 25, 2025
8161e45
clippy fixes
bradleyshep Nov 3, 2025
26de9c4
Update ci.yml
bradleyshep Nov 3, 2025
79d4abe
Potential fix for code scanning alert no. 106: Workflow does not cont…
bradleyshep Nov 3, 2025
0f606b0
Merge branch 'master' into bradley/llm-benchmark
bradleyshep Nov 3, 2025
edeefb1
bump to 1.6, fixes
bradleyshep Nov 3, 2025
3a0c2de
partial category scores
bradleyshep Nov 3, 2025
4466fa0
Update DEVELOP.md
bradleyshep Nov 4, 2025
eb46333
Remove diff; add ci-quickfix
bradleyshep Nov 7, 2025
1534e75
Merge branch 'master' into bradley/llm-benchmark
bradleyshep Nov 7, 2025
fd1933e
Merge branch 'master' into bradley/llm-benchmark
bradleyshep Dec 4, 2025
bc5e5bc
Fixes whitespace
cloutiertyler Dec 30, 2025
7d811b1
Merges in master
cloutiertyler Dec 30, 2025
28592d9
Switched to use clap for arg parsing
cloutiertyler Dec 31, 2025
538f77f
Refactored to use SpacetimeDBGuard
cloutiertyler Dec 31, 2025
798c134
Removed unused import
cloutiertyler Dec 31, 2025
9c2736a
Merge branch 'master' into bradley/llm-benchmark
cloutiertyler Jan 2, 2026
a38d746
Moved the llm benchmark into tools instead of crates
cloutiertyler Jan 2, 2026
f49495b
prelim llm benchmark update workflow
cloutiertyler Jan 3, 2026
b395378
Updated the llm benchmark workflow
cloutiertyler Jan 3, 2026
b9153a9
Added OpenAI API key
cloutiertyler Jan 3, 2026
8db7782
Update to I can run it from this branch
cloutiertyler Jan 3, 2026
8493a52
Potential fix for code scanning alert no. 129: Untrusted Checkout TOCTOU
cloutiertyler Jan 3, 2026
15ee6c3
run the workflow plz
cloutiertyler Jan 3, 2026
2b6804d
updated CI name
cloutiertyler Jan 3, 2026
d4618a1
fixed skip check
cloutiertyler Jan 3, 2026
68a1d7d
Manually putting in the PR number
cloutiertyler Jan 3, 2026
50e54e5
Hopefully?
cloutiertyler Jan 3, 2026
e33da58
removed push
cloutiertyler Jan 3, 2026
c47d6c1
Fix thing
cloutiertyler Jan 3, 2026
0bf66e2
Fix
cloutiertyler Jan 3, 2026
03f398e
Fix thing
cloutiertyler Jan 3, 2026
280ac58
Fix thing
cloutiertyler Jan 3, 2026
0ba377d
Fix thing
cloutiertyler Jan 3, 2026
82ea3a9
Install spacetime
cloutiertyler Jan 3, 2026
6cb3c62
Install spacetime
cloutiertyler Jan 3, 2026
71b7ca4
Added important comments
cloutiertyler Jan 3, 2026
133851e
Refactor llm-benchmark to pass host_url through the app
cloutiertyler Jan 4, 2026
cbe5ee0
Update LLM benchmark results
Jan 4, 2026
271ea9e
Cargo fmt and ci fix
cloutiertyler Jan 4, 2026
d421301
Update LLM benchmark results
Jan 4, 2026
6f64817
Fixed version
cloutiertyler Jan 4, 2026
34b0d43
Update LLM benchmark results
Jan 4, 2026
9dffd30
Add PR comment with benchmark results table
cloutiertyler Jan 4, 2026
a79f785
Restructure LLM benchmark result files for deterministic output and c…
cloutiertyler Jan 4, 2026
fa0b0d7
Cargo fmt
cloutiertyler Jan 4, 2026
819e8ca
Cargo clippy
cloutiertyler Jan 4, 2026
b60c28a
Try to fix failure
cloutiertyler Jan 4, 2026
8d916b8
Made long running jobs dependent on short running basic checks
cloutiertyler Jan 4, 2026
25e79c5
Forgot to save file
cloutiertyler Jan 4, 2026
861c804
Consolidated internal tests into the CI workflow
cloutiertyler Jan 4, 2026
a0817d5
slight name change
cloutiertyler Jan 4, 2026
0d13679
Small fix. Lints now needed to run c sharp test suite
cloutiertyler Jan 4, 2026
bab6938
Fix C# benchmark SIGSEGV crashes in CI
cloutiertyler Jan 5, 2026
e278696
cargo fmt
cloutiertyler Jan 5, 2026
1095c95
Try to fix C# problems
cloutiertyler Jan 5, 2026
320c46c
Add MSBuild env vars to fix "Pipe is broken" errors in CI
cloutiertyler Jan 5, 2026
916fafb
Removed errant file
cloutiertyler Jan 5, 2026
1faf580
Hopefully fix thing
cloutiertyler Jan 5, 2026
f570c3b
Added nix flake check
cloutiertyler Jan 6, 2026
b36fc11
Update LLM benchmark results
Jan 6, 2026
04eb91a
Fixed workflow
cloutiertyler Jan 6, 2026
c1eb855
Removed nix flake check, see #3955
jdetter Jan 6, 2026
cb83f9f
Update workflow to checkout from master branch
cloutiertyler Jan 6, 2026
7c4c5df
Merge branch 'master' into bradley/llm-benchmark
cloutiertyler Jan 6, 2026
1 change: 1 addition & 0 deletions .cargo/config.toml
@@ -3,6 +3,7 @@ rustflags = ["--cfg", "tokio_unstable"]

[alias]
bump-versions = "run -p upgrade-version --"
llm = "run --package xtask-llm-benchmark --bin llm_benchmark --"
ci = "run -p ci --"

[target.x86_64-pc-windows-msvc]
154 changes: 149 additions & 5 deletions .github/workflows/ci.yml
@@ -7,9 +7,9 @@
workflow_dispatch:
inputs:
pr_number:
- description: 'Pull Request Number'
+ description: "Pull Request Number"
required: false
- default: ''
+ default: ""

name: CI

@@ -19,164 +19,166 @@

jobs:
docker_smoketests:
needs: [lints, llm_ci_check]
name: Smoketests
strategy:
matrix:
runner: [spacetimedb-new-runner, windows-latest]
include:
- runner: spacetimedb-new-runner
smoketest_args: --docker
container:
image: localhost:5000/spacetimedb-ci:latest
options: --privileged
- runner: windows-latest
smoketest_args: --no-build-cli
container: null
runs-on: ${{ matrix.runner }}
container: ${{ matrix.container }}
timeout-minutes: 120
env:
CARGO_TARGET_DIR: ${{ github.workspace }}/target
steps:
- name: Find Git ref
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
shell: bash
run: |
PR_NUMBER="${{ github.event.inputs.pr_number || null }}"
if test -n "${PR_NUMBER}"; then
GIT_REF="$( gh pr view --repo clockworklabs/SpacetimeDB $PR_NUMBER --json headRefName --jq .headRefName )"
else
GIT_REF="${{ github.ref }}"
fi
echo "GIT_REF=${GIT_REF}" >>"$GITHUB_ENV"
- name: Checkout sources
uses: actions/checkout@v4
with:
ref: ${{ env.GIT_REF }}
- uses: dsherret/rust-toolchain-file@v1
- name: Cache Rust dependencies
uses: Swatinem/rust-cache@v2
with:
workspaces: ${{ github.workspace }}
shared-key: spacetimedb
cache-on-failure: false
cache-all-crates: true
cache-workspace-crates: true
prefix-key: v1

- uses: actions/setup-dotnet@v4
with:
global-json-file: global.json

# nodejs and pnpm are required for the typescript quickstart smoketest
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: 18

- uses: pnpm/action-setup@v4
with:
run_install: true

- name: Install psql (Windows)
if: runner.os == 'Windows'
run: choco install psql -y --no-progress
shell: powershell
- name: Build crates
run: cargo build -p spacetimedb-cli -p spacetimedb-standalone -p spacetimedb-update
- name: Start Docker daemon
if: runner.os == 'Linux'
run: /usr/local/bin/start-docker.sh

- name: Build and start database (Linux)
if: runner.os == 'Linux'
run: |
# Our .dockerignore omits `target`, which our CI Dockerfile needs.
rm .dockerignore
docker compose -f .github/docker-compose.yml up -d
- name: Build and start database (Windows)
if: runner.os == 'Windows'
run: |
# Fail properly if any individual command fails
$ErrorActionPreference = 'Stop'
$PSNativeCommandUseErrorActionPreference = $true

Start-Process target/debug/spacetimedb-cli.exe -ArgumentList 'start --pg-port 5432'
cd modules
# the sdk-manifests on windows-latest are messed up, so we need to update them
dotnet workload config --update-mode manifests
dotnet workload update
- uses: actions/setup-python@v5
- with: { python-version: '3.12' }
+ with: { python-version: "3.12" }
if: runner.os == 'Windows'
- name: Install python deps
run: python -m pip install -r smoketests/requirements.txt
- name: Run smoketests
# Note: clear_database and replication only work in private
run: cargo ci smoketests -- ${{ matrix.smoketest_args }} -x clear_database replication teams
- name: Stop containers (Linux)
if: always() && runner.os == 'Linux'
run: docker compose -f .github/docker-compose.yml down

test:

[Check warning, Code scanning / CodeQL, severity Medium] Workflow does not contain permissions: Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
needs: [lints, llm_ci_check]
name: Test Suite
runs-on: spacetimedb-new-runner
container:
image: localhost:5000/spacetimedb-ci:latest
options: >-
--privileged
env:
CARGO_TARGET_DIR: ${{ github.workspace }}/target
steps:
- name: Find Git ref
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
PR_NUMBER="${{ github.event.inputs.pr_number || null }}"
if test -n "${PR_NUMBER}"; then
GIT_REF="$( gh pr view --repo clockworklabs/SpacetimeDB $PR_NUMBER --json headRefName --jq .headRefName )"
else
GIT_REF="${{ github.ref }}"
fi
echo "GIT_REF=${GIT_REF}" >>"$GITHUB_ENV"

- name: Checkout sources
uses: actions/checkout@v4
with:
ref: ${{ env.GIT_REF }}

- uses: dsherret/rust-toolchain-file@v1
- name: Cache Rust dependencies
uses: Swatinem/rust-cache@v2
with:
workspaces: ${{ github.workspace }}
shared-key: spacetimedb
# Let the smoketests job save the cache since it builds the most things
save-if: false
prefix-key: v1

- uses: actions/setup-dotnet@v3
with:
global-json-file: global.json

- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: 18

- uses: pnpm/action-setup@v4
with:
run_install: true

- name: Build typescript module sdk
working-directory: crates/bindings-typescript
run: pnpm build

- name: Run tests
run: cargo ci test

lints:

[Check warning, Code scanning / CodeQL, severity Medium] Workflow does not contain permissions: Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
name: Lints
runs-on: spacetimedb-new-runner
container:
@@ -500,7 +502,22 @@
run: |
cargo ci cli-docs

llm_ci_check:
name: Verify LLM benchmark is up to date
permissions:
contents: read
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2

- name: Run hash check (both langs)
run: cargo llm ci-check

unity-testsuite:
needs: [lints, llm_ci_check]
# Skip if this is an external contribution.
# The license secrets will be empty, so the step would fail anyway.
if: ${{ github.event_name != 'pull_request' || !github.event.pull_request.head.repo.fork }}
@@ -585,7 +602,7 @@
enable_pr_comment: ${{ github.event_name == 'pull_request' }}
target_path: sdks/csharp
env:
- GITHUB_TOKEN: '${{ secrets.GITHUB_TOKEN }}'
+ GITHUB_TOKEN: "${{ secrets.GITHUB_TOKEN }}"

- name: Start SpacetimeDB
run: |
@@ -624,116 +641,243 @@
githubToken: ${{ secrets.GITHUB_TOKEN }}
testMode: playmode
useHostNetwork: true
- artifactsPath: ''
+ artifactsPath: ""
env:
UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
UNITY_SERIAL: ${{ secrets.UNITY_SERIAL }}

csharp-testsuite:
needs: [lints, llm_ci_check]
runs-on: spacetimedb-new-runner
container:
image: localhost:5000/spacetimedb-ci:latest
options: >-
--privileged
--cgroupns=host
timeout-minutes: 30
env:
CARGO_TARGET_DIR: ${{ github.workspace }}/target
steps:
- name: Checkout repository
id: checkout-stdb
uses: actions/checkout@v4

# Run cheap .NET tests first. If those fail, no need to run expensive Unity tests.

- name: Setup dotnet
uses: actions/setup-dotnet@v3
with:
global-json-file: global.json

- name: Override NuGet packages
run: |
dotnet pack crates/bindings-csharp/BSATN.Runtime
dotnet pack crates/bindings-csharp/Runtime

# Write out the nuget config file to `nuget.config`. This causes the spacetimedb-csharp-sdk repository
# to be aware of the local versions of the `bindings-csharp` packages in SpacetimeDB, and use them if
# available. Otherwise, `spacetimedb-csharp-sdk` will use the NuGet versions of the packages.
# This means that (if version numbers match) we will test the local versions of the C# packages, even
# if they're not pushed to NuGet.
# See https://learn.microsoft.com/en-us/nuget/reference/nuget-config-file for more info on the config file.
cd sdks/csharp
./tools~/write-nuget-config.sh ../..

- name: Run .NET tests
working-directory: sdks/csharp
run: dotnet test -warnaserror

- name: Verify C# formatting
working-directory: sdks/csharp
run: dotnet format --no-restore --verify-no-changes SpacetimeDB.ClientSDK.sln

- name: Install Rust toolchain
uses: dsherret/rust-toolchain-file@v1

- name: Cache Rust dependencies
uses: Swatinem/rust-cache@v2
with:
workspaces: ${{ github.workspace }}
shared-key: spacetimedb
# Let the main CI job save the cache since it builds the most things
save-if: false
prefix-key: v1

- name: Install SpacetimeDB CLI from the local checkout
run: |
cargo install --force --path crates/cli --locked --message-format=short
cargo install --force --path crates/standalone --locked --message-format=short
# Add a handy alias using the old binary name, so that we don't have to rewrite all scripts (incl. in submodules).
ln -sf $CARGO_HOME/bin/spacetimedb-cli $CARGO_HOME/bin/spacetime

# This step shouldn't be needed, but somehow we end up with caches that are missing librusty_v8.a.
# ChatGPT suspects that this could be due to different build invocations using the same target dir,
# and this makes sense to me because we only see it in this job where we mix `cargo build -p` with
# `cargo build --manifest-path` (which apparently build different dependency trees).
# However, we've been unable to fix it so... /shrug
- name: Check v8 outputs
run: |
find "${CARGO_TARGET_DIR}"/ -type f | grep '[/_]v8' || true
if ! [ -f "${CARGO_TARGET_DIR}"/debug/gn_out/obj/librusty_v8.a ]; then
echo "Could not find v8 output file librusty_v8.a; rebuilding manually."
cargo clean -p v8 || true
cargo build -p v8
fi

- name: Check quickstart-chat bindings are up to date
working-directory: sdks/csharp
run: |
bash tools~/gen-quickstart.sh
"${GITHUB_WORKSPACE}"/tools/check-diff.sh examples~/quickstart-chat || {
echo 'Error: quickstart-chat bindings have changed. Please run `sdks/csharp/tools~/gen-quickstart.sh`.'
exit 1
}

- name: Check client-api bindings are up to date
working-directory: sdks/csharp
run: |
bash tools~/gen-client-api.sh
"${GITHUB_WORKSPACE}"/tools/check-diff.sh src/SpacetimeDB/ClientApi || {
echo 'Error: Client API bindings are dirty. Please run `sdks/csharp/tools~/gen-client-api.sh`.'
exit 1
}

- name: Start SpacetimeDB
run: |
spacetime start &
disown

- name: Run regression tests
run: |
bash sdks/csharp/tools~/run-regression-tests.sh
tools/check-diff.sh sdks/csharp/examples~/regression-tests || {
echo 'Error: Bindings are dirty. Please run `sdks/csharp/tools~/gen-regression-tests.sh`.'
exit 1
}

internal-tests:
name: Internal Tests
needs: [lints, llm_ci_check]
# Skip if not a PR or a push to master
# Skip if this is an external contribution. GitHub secrets will be empty, so the step would fail anyway.
if: ${{ (github.event_name == 'pull_request' || (github.event_name == 'push' && github.ref == 'refs/heads/master'))
&& (github.event_name != 'pull_request' || !github.event.pull_request.head.repo.fork) }}
permissions:
contents: read
runs-on: ubuntu-latest
env:
TARGET_OWNER: clockworklabs
TARGET_REPO: SpacetimeDBPrivate
steps:
- id: dispatch
name: Trigger tests
uses: actions/github-script@v7
with:
github-token: ${{ secrets.SPACETIMEDB_PRIVATE_TOKEN }}
script: |
const workflowId = 'ci.yml';
const targetRef = 'master';
const targetOwner = process.env.TARGET_OWNER;
const targetRepo = process.env.TARGET_REPO;
// Use the ref for pull requests because the head sha is brittle (github does some extra dance where it merges in master).
const publicRef = (context.eventName === 'pull_request') ? context.payload.pull_request.head.ref : context.sha;
const preDispatch = new Date().toISOString();

// Dispatch the workflow in the target repository
await github.rest.actions.createWorkflowDispatch({
owner: targetOwner,
repo: targetRepo,
workflow_id: workflowId,
ref: targetRef,
inputs: { public_ref: publicRef }
});

const sleep = (ms) => new Promise(r => setTimeout(r, ms));

// Find the dispatched run by name
let runId = null;
for (let attempt = 0; attempt < 20 && !runId; attempt++) { // 20 attempts x 5 s sleep: up to ~100 seconds (plus API latency) to locate the run
await sleep(5000);
const runsResp = await github.rest.actions.listWorkflowRuns({
owner: targetOwner,
repo: targetRepo,
workflow_id: workflowId,
event: 'workflow_dispatch',
branch: targetRef,
per_page: 50,
});

const expectedName = `CI [public_ref=${publicRef}]`;
const candidates = runsResp.data.workflow_runs
.filter(r => r.name === expectedName && new Date(r.created_at) >= new Date(preDispatch))
.sort((a, b) => new Date(b.created_at) - new Date(a.created_at));

if (candidates.length > 0) {
runId = candidates[0].id;
break;
}
}

if (!runId) {
core.setFailed('Failed to locate dispatched run in the private repository.');
return;
}

const runUrl = `https://github.com/${targetOwner}/${targetRepo}/actions/runs/${runId}`;
core.info(`View run: ${runUrl}`);
core.setOutput('run_id', String(runId));
core.setOutput('run_url', runUrl);

- name: Wait for Internal Tests to complete
uses: actions/github-script@v7
with:
github-token: ${{ secrets.SPACETIMEDB_PRIVATE_TOKEN }}
script: |
const targetOwner = process.env.TARGET_OWNER;
const targetRepo = process.env.TARGET_REPO;
const runId = Number(`${{ steps.dispatch.outputs.run_id }}`);
const runUrl = `${{ steps.dispatch.outputs.run_url }}`;
const sleep = (ms) => new Promise(r => setTimeout(r, ms));

core.info(`Waiting for workflow result... ${runUrl}`);

let conclusion = null;
for (let attempt = 0; attempt < 240; attempt++) { // up to ~2 hours
const runResp = await github.rest.actions.getWorkflowRun({
owner: targetOwner,
repo: targetRepo,
run_id: runId
});
const { status, conclusion: c } = runResp.data;
if (status === 'completed') {
conclusion = c || 'success';
break;
}
await sleep(30000);
}

if (!conclusion) {
core.setFailed('Timed out waiting for private workflow to complete.');
return;
}

if (conclusion !== 'success') {
core.setFailed(`Private workflow failed with conclusion: ${conclusion}`);
}

- name: Cancel invoked run if workflow cancelled
if: ${{ cancelled() }}
uses: actions/github-script@v7
with:
github-token: ${{ secrets.SPACETIMEDB_PRIVATE_TOKEN }}
script: |
const targetOwner = process.env.TARGET_OWNER;
const targetRepo = process.env.TARGET_REPO;
const runId = Number(`${{ steps.dispatch.outputs.run_id }}`);
if (!runId) return;
await github.rest.actions.cancelWorkflowRun({
owner: targetOwner,
repo: targetRepo,
run_id: runId,
});
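The dispatch-and-poll steps in the `internal-tests` job can be read as two small pieces: choosing which run in the private repository corresponds to this dispatch, and bounding how long each loop sleeps. The sketch below restates that logic as plain functions so it can be checked in isolation. The function names and the sample run list are hypothetical and exist only for illustration; the filter, the sort order, and the attempt counts and sleep intervals are taken from the workflow script above.

```javascript
// Hypothetical helper names; the logic mirrors the filter/sort in the "Trigger tests" step.

// Pick the newest run whose name matches and that was created at or after the dispatch time.
function pickDispatchedRun(runs, expectedName, preDispatch) {
  const since = new Date(preDispatch);
  const candidates = runs
    .filter((r) => r.name === expectedName && new Date(r.created_at) >= since)
    .sort((a, b) => new Date(b.created_at) - new Date(a.created_at));
  return candidates.length > 0 ? candidates[0].id : null;
}

// Upper bound on time spent sleeping in a fixed-interval polling loop, in seconds.
// API round-trip time is not included.
function sleepBudgetSeconds(attempts, intervalMs) {
  return (attempts * intervalMs) / 1000;
}

// Made-up sample data, not real workflow runs.
const runs = [
  { id: 1, name: "CI [public_ref=my-branch]", created_at: "2026-01-04T10:00:00Z" },
  { id: 2, name: "CI [public_ref=my-branch]", created_at: "2026-01-04T10:05:30Z" },
  { id: 3, name: "CI [public_ref=other]", created_at: "2026-01-04T10:06:00Z" },
];

// Run 1 predates the dispatch and run 3 has a different name, so run 2 is selected.
console.log(pickDispatchedRun(runs, "CI [public_ref=my-branch]", "2026-01-04T10:05:00Z")); // 2

console.log(sleepBudgetSeconds(20, 5000));   // 100  (locate loop: 20 attempts, 5000 ms sleep)
console.log(sleepBudgetSeconds(240, 30000)); // 7200 (wait loop: 240 attempts, 30000 ms sleep)
```

Matching on the run name plus a creation timestamp is a workaround for the dispatch call not handing back a run identifier in this script; it presumably relies on the private workflow naming its runs `CI [public_ref=...]`, which is not shown in this diff.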
148 changes: 0 additions & 148 deletions .github/workflows/internal-tests.yml

This file was deleted.
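With the standalone workflow file removed, the gating rule for internal tests lives in the `if:` expression of the `internal-tests` job in `ci.yml`. As a reading aid, that expression can be restated as a boolean function. This is only a sketch: the function and its parameter names are hypothetical, and `isFork` stands in for `github.event.pull_request.head.repo.fork`.

```javascript
// Hypothetical restatement of the `if:` expression on the internal-tests job.
function shouldRunInternalTests({ eventName, ref, isFork }) {
  const isPr = eventName === "pull_request";
  const isMasterPush = eventName === "push" && ref === "refs/heads/master";
  // (PR or push to master) AND (not a PR, or a PR that does not come from a fork)
  return (isPr || isMasterPush) && (!isPr || !isFork);
}

// Illustrative inputs only.
const cases = [
  { eventName: "pull_request", ref: "refs/pull/1/merge", isFork: false },
  { eventName: "pull_request", ref: "refs/pull/1/merge", isFork: true },
  { eventName: "push", ref: "refs/heads/master", isFork: false },
  { eventName: "push", ref: "refs/heads/feature", isFork: false },
  { eventName: "workflow_dispatch", ref: "refs/heads/master", isFork: false },
];

for (const c of cases) {
  console.log(`${c.eventName} ${c.ref} fork=${c.isFork} -> ${shouldRunInternalTests(c)}`);
}
```

Read this way, the job runs for same-repository pull requests and for pushes to `master`, and is skipped for fork pull requests, pushes to other branches, and manual `workflow_dispatch` runs.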
