Add Ingero - eBPF-based GPU observability Helm chart #72

Open
dml37 wants to merge 1 commit into cdwv:master from ingero-io:add-ingero

dml37 commented Mar 10, 2026

What

Adds Ingero to the Application repositories section.

Ingero is an open-source, eBPF-based GPU causal observability agent. It ships with a Helm chart (deploy/helm/ingero/) that deploys the agent as a DaemonSet with full RBAC. The agent traces CUDA Runtime/Driver APIs and host kernel events (scheduler, memory, I/O) to build causal chains that explain GPU latency in production Kubernetes clusters.

Key features:

  • Zero-config, <2% overhead, production-safe
  • Kernel uprobes on libcudart.so and libcuda.so (no CUPTI dependency)
  • Host kernel tracepoints (sched_switch, block I/O, TCP retransmits, network socket I/O)
  • 4-layer causal chain engine with automated root cause analysis
  • MCP server for AI-assisted GPU debugging
  • Helm chart with DaemonSet, RBAC, and ServiceAccount configuration
