An MCP server that exposes NVIDIA GPU metrics as tools. Any MCP-compatible AI agent (Claude, Goose, Cursor, etc.) can query real-time GPU utilization, memory, temperature, power, and PCIe and NVLink throughput, with no Prometheus or dcgm-exporter required.
Built on the official Go MCP SDK and NVIDIA go-nvml.
| Tool | Description |
|---|---|
| `list_gpus` | List all GPUs with utilization and memory info |
| `get_gpu_metrics` | Detailed metrics for a GPU by index or UUID |
| `get_gpu_processes` | PID-level GPU process attribution |
| `gpu_summary` | Aggregate stats across all devices |
All tools support MIG (Multi-Instance GPU). MIG instances appear as separate devices and report their parent GPU's shared metrics (temperature, power, PCIe).
Each tool returns structured JSON. The examples below show the shape of the data an agent receives from a node with two NVIDIA A100 GPUs.
list_gpus:
{
  "count": 2,
  "devices": [
    {
      "index": 0,
      "uuid": "GPU-aaaa-1111",
      "name": "NVIDIA A100-SXM4-80GB",
      "gpu_utilization_percent": 85,
      "memory_used_mib": 57344,
      "memory_total_mib": 81920
    },
    {
      "index": 1,
      "uuid": "GPU-bbbb-2222",
      "name": "NVIDIA A100-SXM4-80GB",
      "gpu_utilization_percent": 20,
      "memory_used_mib": 12288,
      "memory_total_mib": 81920
    }
  ]
}

get_gpu_metrics (with {"index": 0} or {"uuid": "GPU-aaaa-1111"}):
{
  "index": 0,
  "uuid": "GPU-aaaa-1111",
  "name": "NVIDIA A100-SXM4-80GB",
  "gpu_utilization_percent": 85,
  "memory_utilization_percent": 70,
  "memory_used_mib": 57344,
  "memory_total_mib": 81920,
  "temperature_celsius": 72,
  "power_draw_watts": 300,
  "power_limit_watts": 400,
  "pcie_tx_kbps": 0,
  "pcie_rx_kbps": 0,
  "nvlink_tx_mbps": 0,
  "nvlink_rx_mbps": 0
}

gpu_summary:
{
  "device_count": 2,
  "avg_gpu_utilization": 52.5,
  "avg_memory_utilization": 42.5,
  "total_memory_used_mib": 69632,
  "total_memory_total_mib": 163840,
  "max_temperature_celsius": 72,
  "total_power_draw_watts": 375
}

MIG instances add is_mig, parent_gpu, and mig_profile fields to the get_gpu_metrics and list_gpus payloads.
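As a rough illustration of how an agent-side program might consume these payloads, the sketch below decodes the list_gpus example shown above and recomputes two of the gpu_summary aggregates from it. The Go type and function names (`Device`, `ListGPUsResult`, `parseListGPUs`, `aggregate`) are hypothetical and not part of the server; the field types are inferred from the example values only, and the MIG fields are omitted because their types are not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Device mirrors one entry of the "devices" array in the list_gpus example.
// Types are assumptions inferred from the sample values.
type Device struct {
	Index                 int     `json:"index"`
	UUID                  string  `json:"uuid"`
	Name                  string  `json:"name"`
	GPUUtilizationPercent float64 `json:"gpu_utilization_percent"`
	MemoryUsedMiB         int     `json:"memory_used_mib"`
	MemoryTotalMiB        int     `json:"memory_total_mib"`
}

// ListGPUsResult mirrors the top-level list_gpus payload.
type ListGPUsResult struct {
	Count   int      `json:"count"`
	Devices []Device `json:"devices"`
}

// parseListGPUs decodes a list_gpus JSON payload.
func parseListGPUs(data []byte) (ListGPUsResult, error) {
	var r ListGPUsResult
	err := json.Unmarshal(data, &r)
	return r, err
}

// aggregate returns the mean GPU utilization and the memory totals
// across the given devices.
func aggregate(devs []Device) (avgUtil float64, usedMiB, totalMiB int) {
	if len(devs) == 0 {
		return 0, 0, 0
	}
	var sum float64
	for _, d := range devs {
		sum += d.GPUUtilizationPercent
		usedMiB += d.MemoryUsedMiB
		totalMiB += d.MemoryTotalMiB
	}
	return sum / float64(len(devs)), usedMiB, totalMiB
}

// sample is the list_gpus example from this README, compacted.
const sample = `{"count":2,"devices":[
 {"index":0,"uuid":"GPU-aaaa-1111","name":"NVIDIA A100-SXM4-80GB",
  "gpu_utilization_percent":85,"memory_used_mib":57344,"memory_total_mib":81920},
 {"index":1,"uuid":"GPU-bbbb-2222","name":"NVIDIA A100-SXM4-80GB",
  "gpu_utilization_percent":20,"memory_used_mib":12288,"memory_total_mib":81920}]}`

func main() {
	r, err := parseListGPUs([]byte(sample))
	if err != nil {
		log.Fatal(err)
	}
	avg, used, total := aggregate(r.Devices)
	// Prints: avg util 52.5%, memory 69632/163840 MiB
	fmt.Printf("avg util %.1f%%, memory %d/%d MiB\n", avg, used, total)
}
```

The recomputed values match avg_gpu_utilization, total_memory_used_mib, and total_memory_total_mib in the gpu_summary example.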
# build (requires CGO + NVML headers on Linux)
make build
# run; the server communicates over stdio
./gpu-mcp-server

Add to claude_desktop_config.json:
{
  "mcpServers": {
    "gpu": {
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

For Goose, add an extension entry to its config file:
extensions:
  gpu-metrics:
    type: stdio
    cmd: /path/to/gpu-mcp-server

Add to .cursor/mcp.json for a project, or ~/.cursor/mcp.json for all projects:
{
  "mcpServers": {
    "gpu": {
      "type": "stdio",
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Add to ~/.codeium/windsurf/mcp_config.json:
{
  "mcpServers": {
    "gpu": {
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Requires Go 1.23+, CGO, and NVIDIA drivers on the target machine.
make build   # compile binary
make test    # run tests (no GPU needed; uses mock)
make lint    # golangci-lint
make docker  # container image

Tests use a mock collector, so they run anywhere; no GPU hardware is required.
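The mock-collector approach can be sketched as follows. This is a minimal illustration of the general pattern (code depends on an interface, tests supply canned data), not the project's actual code: the `GPU` type, the `Collector` interface, `mockCollector`, and `busiest` are all hypothetical names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// GPU is a hypothetical, trimmed-down device record.
type GPU struct {
	Index              int
	UUID               string
	UtilizationPercent int
}

// Collector is a hypothetical abstraction over the metrics source.
// A real implementation would query NVML; tests can substitute a mock.
type Collector interface {
	ListGPUs() ([]GPU, error)
}

// mockCollector returns canned data, so no GPU hardware is needed.
type mockCollector struct {
	gpus []GPU
	err  error
}

func (m mockCollector) ListGPUs() ([]GPU, error) { return m.gpus, m.err }

// busiest returns the device with the highest utilization. It depends only
// on the Collector interface, so it can be exercised with the mock.
func busiest(c Collector) (GPU, error) {
	gpus, err := c.ListGPUs()
	if err != nil {
		return GPU{}, err
	}
	if len(gpus) == 0 {
		return GPU{}, errors.New("no GPUs found")
	}
	best := gpus[0]
	for _, g := range gpus[1:] {
		if g.UtilizationPercent > best.UtilizationPercent {
			best = g
		}
	}
	return best, nil
}

func main() {
	c := mockCollector{gpus: []GPU{
		{Index: 0, UUID: "GPU-aaaa-1111", UtilizationPercent: 85},
		{Index: 1, UUID: "GPU-bbbb-2222", UtilizationPercent: 20},
	}}
	g, err := busiest(c)
	if err != nil {
		log.Fatal(err)
	}
	// Prints: busiest: GPU 0 (85%)
	fmt.Printf("busiest: GPU %d (%d%%)\n", g.Index, g.UtilizationPercent)
}
```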
Prebuilt multi-arch images (linux/amd64, linux/arm64) are published to GHCR on every release.
docker pull ghcr.io/pmady/gpu-mcp-server:latest
docker run --rm -i --gpus all ghcr.io/pmady/gpu-mcp-server:latest

The host needs the NVIDIA Container Toolkit installed for --gpus all to work. The server speaks MCP over stdio, so the -i flag is required; do not drop it.
An MCP client config that launches the container:
{
  "mcpServers": {
    "gpu": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "--gpus", "all", "ghcr.io/pmady/gpu-mcp-server:latest"]
    }
  }
}

Pin a specific version via tag instead of :latest, e.g. ghcr.io/pmady/gpu-mcp-server:v0.1.0.
Agent (Claude/Goose) ─── MCP (stdio) ──→ gpu-mcp-server ──→ NVML ──→ GPU
                                               │
                                         Tools:
                                         • list_gpus
                                         • get_gpu_metrics
                                         • gpu_summary
The server runs as a local process alongside the agent. It calls NVML directly through cgo — no sidecar, no network hops, no metric pipeline to configure.
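To make the stdio exchange concrete, here is a minimal sketch of the message an agent would write to the server's stdin to invoke a tool. It assumes the standard MCP stdio transport (newline-delimited JSON-RPC 2.0) and the standard `tools/call` method; the Go type and function names (`request`, `callParams`, `toolCall`) are hypothetical. A real session also performs the MCP `initialize` handshake first, which MCP client libraries handle and which is omitted here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// request is a JSON-RPC 2.0 request carrying an MCP tools/call invocation.
type request struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// callParams names the tool and carries its arguments.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// toolCall encodes one tools/call request as a single newline-terminated
// JSON line, the framing used by the MCP stdio transport.
func toolCall(id int, tool string, args map[string]any) ([]byte, error) {
	msg, err := json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: tool, Arguments: args},
	})
	if err != nil {
		return nil, err
	}
	return append(msg, '\n'), nil
}

func main() {
	line, err := toolCall(1, "get_gpu_metrics", map[string]any{"index": 0})
	if err != nil {
		log.Fatal(err)
	}
	// This line would be written to the gpu-mcp-server process's stdin.
	fmt.Print(string(line))
}
```

Because the whole conversation is lines on stdin and stdout, the same message works whether the server is started as a local binary or through docker run -i.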
- License: Apache 2.0
- Language: Go
- AAIF project alignment: MCP
- Related: keda-gpu-scaler (GPU autoscaling for Kubernetes)
- Whitepaper: GPU-Aware Autoscaling in Cloud Native AI Infrastructure — CNCF TAG Infrastructure initiative (TOC #2188)
See ROADMAP.md for the 12-month public roadmap.
See CONTRIBUTING.md for how to get involved.
Thanks to all our contributors! Add yourself via PR.
This project follows Linux Foundation Minimum Viable Governance.
- Full documentation - hosted on Read the Docs
- ROADMAP.md - public roadmap
- GOVERNANCE.md - decision-making process
- DEPENDENCIES.md - external dependencies and licenses
- SECURITY.md - vulnerability reporting
- AGENTS.md - instructions for AI agents working on this repo
- CODE_OF_CONDUCT.md - community standards