gpu-mcp-server

An MCP server that exposes NVIDIA GPU metrics as tools. Any MCP-compatible AI agent (Claude, Goose, Cursor, etc.) can query real-time GPU utilization, memory, temperature, power, and PCIe and NVLink throughput, with no Prometheus or dcgm-exporter required.

Built on the official Go MCP SDK and NVIDIA go-nvml.

Tools

| Tool | Description |
|------|-------------|
| `list_gpus` | List all GPUs with utilization and memory info |
| `get_gpu_metrics` | Detailed metrics for a GPU by index or UUID |
| `get_gpu_processes` | PID-level GPU process attribution |
| `gpu_summary` | Aggregate stats across all devices |

All tools support MIG (Multi-Instance GPU). MIG instances appear as separate devices and report the metrics they share with their parent GPU (temperature, power, PCIe).

Sample output

Each tool returns structured JSON. The examples below show the shape of the data an agent receives from a node with two NVIDIA A100 GPUs.

list_gpus:

{
  "count": 2,
  "devices": [
    {
      "index": 0,
      "uuid": "GPU-aaaa-1111",
      "name": "NVIDIA A100-SXM4-80GB",
      "gpu_utilization_percent": 85,
      "memory_used_mib": 57344,
      "memory_total_mib": 81920
    },
    {
      "index": 1,
      "uuid": "GPU-bbbb-2222",
      "name": "NVIDIA A100-SXM4-80GB",
      "gpu_utilization_percent": 20,
      "memory_used_mib": 12288,
      "memory_total_mib": 81920
    }
  ]
}
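
As an illustration of consuming this payload, the sketch below decodes the list_gpus sample into Go structs and picks the device with the most free memory. The struct names and the `MostFreeMemory` helper are hypothetical and not part of the server; only the JSON field names come from the sample above. Percent fields are decoded as float64 on the assumption that real payloads may not always be whole numbers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Device mirrors one entry of the "devices" array in the list_gpus sample.
type Device struct {
	Index          int     `json:"index"`
	UUID           string  `json:"uuid"`
	Name           string  `json:"name"`
	GPUUtilization float64 `json:"gpu_utilization_percent"`
	MemoryUsedMiB  int64   `json:"memory_used_mib"`
	MemoryTotalMiB int64   `json:"memory_total_mib"`
}

// ListGPUsResult mirrors the top-level list_gpus payload.
type ListGPUsResult struct {
	Count   int      `json:"count"`
	Devices []Device `json:"devices"`
}

// parseListGPUs decodes a list_gpus JSON payload.
func parseListGPUs(data string) (ListGPUsResult, error) {
	var res ListGPUsResult
	err := json.Unmarshal([]byte(data), &res)
	return res, err
}

// MostFreeMemory returns the device with the most free memory, or false
// if the list is empty. Hypothetical helper for illustration only.
func MostFreeMemory(devs []Device) (Device, bool) {
	if len(devs) == 0 {
		return Device{}, false
	}
	best := devs[0]
	for _, d := range devs[1:] {
		if d.MemoryTotalMiB-d.MemoryUsedMiB > best.MemoryTotalMiB-best.MemoryUsedMiB {
			best = d
		}
	}
	return best, true
}

// sample is the list_gpus payload shown above.
const sample = `{
  "count": 2,
  "devices": [
    {"index": 0, "uuid": "GPU-aaaa-1111", "name": "NVIDIA A100-SXM4-80GB",
     "gpu_utilization_percent": 85, "memory_used_mib": 57344, "memory_total_mib": 81920},
    {"index": 1, "uuid": "GPU-bbbb-2222", "name": "NVIDIA A100-SXM4-80GB",
     "gpu_utilization_percent": 20, "memory_used_mib": 12288, "memory_total_mib": 81920}
  ]
}`

func main() {
	res, err := parseListGPUs(sample)
	if err != nil {
		panic(err)
	}
	if d, ok := MostFreeMemory(res.Devices); ok {
		// prints "GPU 1 (GPU-bbbb-2222) has 69632 MiB free"
		fmt.Printf("GPU %d (%s) has %d MiB free\n", d.Index, d.UUID, d.MemoryTotalMiB-d.MemoryUsedMiB)
	}
}
```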

get_gpu_metrics (with {"index": 0} or {"uuid": "GPU-aaaa-1111"}):

{
  "index": 0,
  "uuid": "GPU-aaaa-1111",
  "name": "NVIDIA A100-SXM4-80GB",
  "gpu_utilization_percent": 85,
  "memory_utilization_percent": 70,
  "memory_used_mib": 57344,
  "memory_total_mib": 81920,
  "temperature_celsius": 72,
  "power_draw_watts": 300,
  "power_limit_watts": 400,
  "pcie_tx_kbps": 0,
  "pcie_rx_kbps": 0,
  "nvlink_tx_mbps": 0,
  "nvlink_rx_mbps": 0
}
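
Agents normally issue this call through their MCP client, but it can help to see what goes over the wire. The sketch below builds the JSON-RPC `tools/call` request for get_gpu_metrics, selecting a GPU either by index or by UUID as described above. The Go type and function names are hypothetical; the `tools/call` method and the `name`/`arguments` parameters follow the MCP specification, and the argument keys come from the example arguments above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request in the shape MCP uses for tool calls.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// callParams names the tool and carries its arguments.
type callParams struct {
	Name      string `json:"name"`
	Arguments any    `json:"arguments"`
}

// metricsArgs selects a GPU by index or UUID. Index is a pointer so that
// index 0 is still sent, while an unset index is omitted.
type metricsArgs struct {
	Index *int   `json:"index,omitempty"`
	UUID  string `json:"uuid,omitempty"`
}

// buildMetricsCall returns the serialized tools/call request for get_gpu_metrics.
func buildMetricsCall(id int, args metricsArgs) (string, error) {
	b, err := json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: "get_gpu_metrics", Arguments: args},
	})
	return string(b), err
}

func main() {
	idx := 0
	byIndex, err := buildMetricsCall(1, metricsArgs{Index: &idx})
	if err != nil {
		panic(err)
	}
	byUUID, err := buildMetricsCall(2, metricsArgs{UUID: "GPU-aaaa-1111"})
	if err != nil {
		panic(err)
	}
	fmt.Println(byIndex)
	fmt.Println(byUUID)
}
```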

gpu_summary:

{
  "device_count": 2,
  "avg_gpu_utilization": 52.5,
  "avg_memory_utilization": 42.5,
  "total_memory_used_mib": 69632,
  "total_memory_total_mib": 163840,
  "max_temperature_celsius": 72,
  "total_power_draw_watts": 375
}
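
The sample aggregates are consistent with the per-device values in the list_gpus example. The sketch below reproduces that arithmetic; it is not the server's implementation. It assumes that memory utilization is used/total memory per device, which matches the sample numbers but may differ from how the server derives the field. The `gpu`, `summary`, and `summarize` names are hypothetical.

```go
package main

import "fmt"

// gpu holds the per-device inputs taken from the list_gpus sample.
type gpu struct {
	UtilPercent    float64
	MemoryUsedMiB  int64
	MemoryTotalMiB int64
}

// summary holds the aggregates that can be derived from those inputs.
type summary struct {
	DeviceCount      int
	AvgGPUUtil       float64
	AvgMemoryUtil    float64
	TotalMemoryUsed  int64
	TotalMemoryTotal int64
}

// summarize averages utilization and sums memory across devices.
func summarize(gpus []gpu) summary {
	s := summary{DeviceCount: len(gpus)}
	if len(gpus) == 0 {
		return s
	}
	var utilSum, memSum float64
	for _, g := range gpus {
		utilSum += g.UtilPercent
		if g.MemoryTotalMiB > 0 {
			// Assumption: memory utilization is used/total, as a percentage.
			memSum += 100 * float64(g.MemoryUsedMiB) / float64(g.MemoryTotalMiB)
		}
		s.TotalMemoryUsed += g.MemoryUsedMiB
		s.TotalMemoryTotal += g.MemoryTotalMiB
	}
	n := float64(len(gpus))
	s.AvgGPUUtil = utilSum / n
	s.AvgMemoryUtil = memSum / n
	return s
}

func main() {
	s := summarize([]gpu{
		{UtilPercent: 85, MemoryUsedMiB: 57344, MemoryTotalMiB: 81920},
		{UtilPercent: 20, MemoryUsedMiB: 12288, MemoryTotalMiB: 81920},
	})
	// prints "avg util 52.5%, avg mem 42.5%, 69632/163840 MiB"
	fmt.Printf("avg util %.1f%%, avg mem %.1f%%, %d/%d MiB\n",
		s.AvgGPUUtil, s.AvgMemoryUtil, s.TotalMemoryUsed, s.TotalMemoryTotal)
}
```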

MIG instances add is_mig, parent_gpu, and mig_profile fields to the get_gpu_metrics and list_gpus payloads.

Quick start

# build (requires CGO + NVML headers on Linux)
make build

# run (the server communicates over stdio)
./gpu-mcp-server
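
For anyone wiring up a client by hand, the MCP stdio transport sends each JSON-RPC message as a single line of JSON terminated by a newline. The sketch below shows that framing with two hypothetical helpers, `writeMessage` and `readMessage`. A `bytes.Buffer` stands in for the server's pipes so the example runs without a GPU; with the real binary, the writer and reader would presumably be the stdin and stdout pipes of an `exec.Command`, and a session would begin with the MCP `initialize` handshake before any tool call.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// writeMessage encodes v as one line of JSON followed by '\n'.
// json.Marshal escapes newlines inside strings, so the line is never split.
func writeMessage(w io.Writer, v any) error {
	b, err := json.Marshal(v)
	if err != nil {
		return err
	}
	_, err = w.Write(append(b, '\n'))
	return err
}

// readMessage reads one newline-terminated JSON message into out.
func readMessage(r *bufio.Reader, out any) error {
	line, err := r.ReadBytes('\n')
	if err != nil {
		return err
	}
	return json.Unmarshal(line, out)
}

// roundTrip writes a request for the given method into an in-memory pipe
// and reads it back, returning the decoded method name.
func roundTrip(method string) (string, error) {
	var pipe bytes.Buffer // stand-in for the server's stdin/stdout
	req := map[string]any{"jsonrpc": "2.0", "id": 1, "method": method}
	if err := writeMessage(&pipe, req); err != nil {
		return "", err
	}
	var got struct {
		Method string `json:"method"`
	}
	if err := readMessage(bufio.NewReader(&pipe), &got); err != nil {
		return "", err
	}
	return got.Method, nil
}

func main() {
	method, err := roundTrip("tools/list")
	if err != nil {
		panic(err)
	}
	fmt.Println("round-tripped method:", method)
}
```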

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "gpu": {
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Goose

extensions:
  gpu-metrics:
    type: stdio
    cmd: /path/to/gpu-mcp-server

Cursor

Add to .cursor/mcp.json for a project, or ~/.cursor/mcp.json for all projects:

{
  "mcpServers": {
    "gpu": {
      "type": "stdio",
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "gpu": {
      "command": "/path/to/gpu-mcp-server"
    }
  }
}

Build

Requires Go 1.23+, CGO, and NVIDIA drivers on the target machine.

make build       # compile binary
make test        # run tests (no GPU needed; uses a mock collector)
make lint        # golangci-lint
make docker      # container image

Tests use a mock collector, so they run anywhere; no GPU hardware is required.
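
To illustrate why a mock collector makes the tests hardware-independent, here is a minimal sketch of the pattern. The `Collector` interface, the `Metrics` struct, and `mockCollector` are hypothetical names chosen for this example, and the temperatures are canned fixture values; the repository's actual types may look different. The idea is that tool logic depends only on an interface, so an NVML-backed implementation and an in-memory one are interchangeable.

```go
package main

import (
	"errors"
	"fmt"
)

// Metrics is a trimmed-down, hypothetical view of one GPU.
type Metrics struct {
	Index        int
	TemperatureC int
}

// Collector abstracts the metric source (hypothetical interface).
// A production implementation would call NVML; tests use the mock below.
type Collector interface {
	DeviceCount() (int, error)
	DeviceMetrics(index int) (Metrics, error)
}

// mockCollector serves canned metrics from memory.
type mockCollector struct {
	devices []Metrics
}

func (m mockCollector) DeviceCount() (int, error) {
	return len(m.devices), nil
}

func (m mockCollector) DeviceMetrics(index int) (Metrics, error) {
	if index < 0 || index >= len(m.devices) {
		return Metrics{}, errors.New("gpu index out of range")
	}
	return m.devices[index], nil
}

// maxTemperature works against any Collector, real or mock.
func maxTemperature(c Collector) (int, error) {
	n, err := c.DeviceCount()
	if err != nil {
		return 0, err
	}
	hottest := 0
	for i := 0; i < n; i++ {
		m, err := c.DeviceMetrics(i)
		if err != nil {
			return 0, err
		}
		if m.TemperatureC > hottest {
			hottest = m.TemperatureC
		}
	}
	return hottest, nil
}

func main() {
	// Fixture values for illustration only.
	c := mockCollector{devices: []Metrics{
		{Index: 0, TemperatureC: 72},
		{Index: 1, TemperatureC: 65},
	}}
	t, err := maxTemperature(c)
	if err != nil {
		panic(err)
	}
	fmt.Println("max temperature:", t) // prints "max temperature: 72"
}
```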

Docker

Prebuilt multi-arch images (linux/amd64, linux/arm64) are published to GHCR on every release.

docker pull ghcr.io/pmady/gpu-mcp-server:latest
docker run --rm -i --gpus all ghcr.io/pmady/gpu-mcp-server:latest

The host needs the NVIDIA Container Toolkit installed for --gpus all to work. The server speaks MCP over stdio, so the -i flag is required; do not drop it.

An MCP client can launch the container directly:

{
  "mcpServers": {
    "gpu": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "--gpus", "all", "ghcr.io/pmady/gpu-mcp-server:latest"]
    }
  }
}

Pin a specific version via tag instead of :latest, e.g. ghcr.io/pmady/gpu-mcp-server:v0.1.0.

Architecture

Agent (Claude/Goose) ─── MCP (stdio) ──→ gpu-mcp-server ──→ NVML ──→ GPU
                                              │
                                         Tools:
                                         • list_gpus
                                         • get_gpu_metrics
                                         • get_gpu_processes
                                         • gpu_summary

The server runs as a local process alongside the agent. It calls NVML directly through cgo, with no sidecar, no network hops, and no metric pipeline to configure.

Project info

License: Apache 2.0
Language: Go
AAIF project alignment: MCP
Related: keda-gpu-scaler (GPU autoscaling for Kubernetes)
Whitepaper: GPU-Aware Autoscaling in Cloud Native AI Infrastructure — CNCF TAG Infrastructure initiative (TOC #2188)

Roadmap

See ROADMAP.md for the 12-month public roadmap.

Contributing

See CONTRIBUTING.md for how to get involved.

Contributors

Thanks to all our contributors! Add yourself via PR.

Governance

This project follows Linux Foundation Minimum Viable Governance.

Documentation

Full documentation - hosted on Read the Docs
ROADMAP.md - public roadmap
GOVERNANCE.md - decision-making process
DEPENDENCIES.md - external dependencies and licenses
SECURITY.md - vulnerability reporting
AGENTS.md - instructions for AI agents working on this repo
CODE_OF_CONDUCT.md - community standards

