Tekton pipelines for running LLM inference benchmarks on OpenShift with multiple deployment modes and benchmark tools.
llm-d-bench provides automated end-to-end benchmarking pipelines for LLM inference workloads. It supports:
- Multiple deployment platforms: llm-d, RHOAI (KServe), RHAIIS (Pods)
- Benchmark tools: GuideLLM (load testing), MLPerf (standardized)
- Advanced features: PD Disaggregation, Precise Prefix Caching, Inference Scheduling
- MLflow integration: Automated experiment tracking and metrics storage
- OpenShift 4.14+
- Tekton Pipelines v0.50+
- `oc` CLI
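A quick sanity check that the prerequisites are met can be sketched as follows; the `version_ge` helper is a hypothetical convenience for this README, not part of the repo:

```shell
# Hypothetical helper: compare dotted version strings with sort -V.
# Succeeds (exit 0) when $1 >= $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example: verify a detected cluster version against the 4.14 minimum.
# In practice you would feed in the version reported by `oc version` here.
cluster_version="4.16.3"
if version_ge "$cluster_version" "4.14"; then
  echo "OpenShift version OK"
fi
```

The same helper works for the Tekton Pipelines v0.50+ requirement.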
```shell
# Install Tekton Pipelines operator
oc apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

# Create namespace
oc create namespace llm-d-bench

# Install pipelines and tasks
./scripts/install.sh -n llm-d-bench --with-pvcs

# Optional: Install Kueue for GPU quota management
# ./scripts/install.sh -n llm-d-bench --with-infra --with-pvcs
```

See Storage Configuration for PVC setup details.
```shell
# HuggingFace token (required)
oc create secret generic huggingface-token \
  --from-literal=HF_TOKEN=hf_xxxxxxxxxxxxx \
  -n llm-d-bench

# MLflow credentials (optional)
oc create secret generic mlflow-ui-auth \
  --from-literal=username=admin \
  --from-literal=password=your-password \
  --from-literal=tracking-uri=https://mlflow-server.example.com \
  -n llm-d-bench

oc create secret generic mlflow-s3-secret \
  --from-literal=access-key=your-access-key \
  --from-literal=secret-key=your-secret-key \
  --from-literal=bucket-name=mlflow-artifacts \
  --from-literal=region=us-east-1 \
  -n llm-d-bench
```

Or use the templates from `config/cluster/secrets/`.
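For GitOps-style setups, the HuggingFace token can equally be applied as a declarative manifest; this sketch simply mirrors the `oc create secret` invocation above (the token value is a placeholder):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: huggingface-token
  namespace: llm-d-bench
type: Opaque
stringData:
  HF_TOKEN: hf_xxxxxxxxxxxxx  # placeholder; substitute your real token
```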
```shell
# RHOAI example (KServe)
oc create -f pipelineruns/rhoai/qwen-qwen3-06b-example.yaml -n llm-d-bench

# llm-d example (Helmfile deployment)
oc create -f pipelineruns/llm-d/redhatai-llama-3.3-70b-instruct-fp8-dynamic-1k-1k.yaml -n llm-d-bench

# Watch logs
tkn pipelinerun logs -f -n llm-d-bench
```

More examples: `pipelineruns/llm-d/`, `pipelineruns/rhoai/`, `pipelineruns/rhaiis/`
| Mode | Description | Use Case |
|---|---|---|
| llm-d | Helmfile-based deployment with EPP/GAIE | Advanced scheduling, PD disaggregation, prefix caching |
| RHOAI | KServe LLMInferenceService | Production RHOAI environments (3.0+) |
| RHAIIS | Direct Pod deployment | Simple testing, development |
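The example files under `pipelineruns/` are ordinary Tekton `PipelineRun` resources. As a rough orientation (the parameter shown here is hypothetical; check the bundled examples for the actual parameter names), a run referencing the `rhoai-end-to-end-benchmark` pipeline looks like:

```yaml
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: qwen3-06b-benchmark-
  namespace: llm-d-bench
spec:
  pipelineRef:
    name: rhoai-end-to-end-benchmark
  params:
    # Hypothetical parameter for illustration; see pipelineruns/rhoai/
    # for the real parameter names and values used by this repository.
    - name: model-name
      value: Qwen/Qwen3-0.6B
```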
See docs/ADVANCED.md for detailed configuration.
Pre-built container images are available from the GitHub Container Registry:

- **GuideLLM** (`ghcr.io/openshift-psap/llm-d-bench/guidellm:latest`)
  - Load testing with configurable concurrency levels
  - Detailed latency and throughput metrics
- **MLPerf** (`ghcr.io/openshift-psap/llm-d-bench/mlperf:latest`)
  - Standardized benchmark scenarios (Offline, Server)
  - Requires dataset upload to PVC

Images are automatically built via GitHub Actions when changes are merged to `main`. See docs/ADVANCED.md#building-images-locally for local development.
- MLflow (`MLFLOW_ENABLED=true`): Centralized tracking with S3 artifact storage → Setup Guide
- PVC (`MLFLOW_ENABLED=false`): Local storage at `/benchmark-results/` (JSON, HTML, logs)
- Tekton Results (cluster-wide): Long-term PipelineRun/TaskRun storage with a queryable API → Setup Guide
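When results are written to the PVC rather than MLflow, they can be post-processed with standard shell tools. A minimal sketch, where the file path and field names are assumptions for illustration rather than the documented GuideLLM output schema:

```shell
# Hypothetical result file; the schema below is an assumption, not the
# actual GuideLLM output format.
cat > /tmp/result.json <<'EOF'
{"requests_per_second": 12.5, "p99_latency_ms": 840}
EOF

# Pull a single numeric field without jq (crude but dependency-free).
rps=$(sed -n 's/.*"requests_per_second": *\([0-9.]*\).*/\1/p' /tmp/result.json)
echo "throughput: ${rps} req/s"
```

For anything beyond one-off checks, a proper JSON tool such as `jq` is the better choice.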
- docs/ADVANCED.md - Detailed configuration, PD disaggregation, custom tasks, troubleshooting
- docs/STORAGE.md - PVC configuration and access modes
- docs/MLFLOW.md - MLflow integration and experiment tracking
- docs/KUEUE.md - GPU quota management with Kueue
- docs/TEKTON.md - Tekton Dashboard and Tekton Results installation and S3 log storage
- docs/EXPERIMENTS.md - CI/CD integration and GitHub Runners
| Pipeline | Tasks |
|---|---|
| `llm-d-end-to-end-benchmark` | download → deploy-llm-d → wait → benchmark → cleanup |
| `rhoai-end-to-end-benchmark` | download → deploy-rhoai → wait → benchmark → cleanup |
| `rhaiis-end-to-end-benchmark` | download → deploy-rhaiis → wait → benchmark → cleanup |
| Pipeline | Tasks |
|---|---|
| `guidellm-run-benchmark-pipeline` | wait-for-endpoint → run-benchmark |
| `mlperf-run-benchmark-pipeline` | wait-for-endpoint → run-benchmark |
```shell
# View pipeline runs
oc get pipelinerun -n llm-d-bench

# View logs
tkn pipelinerun logs <pipelinerun-name> -f -n llm-d-bench

# View specific task logs
tkn pipelinerun logs <pipelinerun-name> -t run-benchmark -n llm-d-bench

# Check pod status
oc get pods -n llm-d-bench

# Describe failed pipeline
oc describe pipelinerun <pipelinerun-name> -n llm-d-bench
```

macOS:

```shell
brew install tektoncd-cli
```

Linux:

```shell
curl -LO https://github.com/tektoncd/cli/releases/download/v0.38.0/tkn_0.38.0_Linux_x86_64.tar.gz
tar xvzf tkn_0.38.0_Linux_x86_64.tar.gz -C /usr/local/bin/ tkn
```

See docs/ADVANCED.md for information on:
- Creating custom tasks and pipelines
- Adding new benchmark tools
- Repository structure
Apache 2.0