Practice Kubernetes troubleshooting with realistic error scenarios.
Each scenario is triggered with a single kubectl apply command. To clean up, run kubectl delete -f on the same URL.
Crashing Pod (CrashLoopBackOff)
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/crashpod/broken.yaml
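To inspect the crash yourself, the standard kubectl workflow applies (the pod name below is a placeholder for whatever name the manifest creates):
kubectl get pods
kubectl describe pod <crashing-pod-name>
kubectl logs <crashing-pod-name> --previous
The --previous flag shows the logs of the container's last failed run.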
To get notified automatically about issues like this, install Robusta.
OOMKilled Pod (Out of Memory Kill)
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/oomkill/oomkill_job.yaml
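To confirm the kill reason without Robusta (the pod name is a placeholder):
kubectl get pods
kubectl describe pod <oomkilled-pod-name>
Look for OOMKilled as the termination reason of the container.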
To get notified automatically about issues like this, install Robusta.
High CPU Throttling (CPUThrottlingHigh)
Apply the following YAML and wait 15 minutes. (CPU throttling is only an issue if it occurs for a meaningful period of time. Less than 15 minutes of throttling typically does not trigger an alert.)
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/cpu_throttling/throttling.yaml
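While you wait, you can watch the pod's CPU usage sit at its limit (requires metrics-server; the pod name is a placeholder):
kubectl top pod <throttled-pod-name>
The CPUThrottlingHigh alert itself is typically driven by the container_cpu_cfs_throttled_periods_total metric rather than plain CPU usage.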
To get notified automatically about issues like this, install Robusta.
Pending Pod (Unschedulable due to Node Selectors)
Apply the following YAML and wait 15 minutes. (By default, most systems only alert after pods are pending for 15 minutes. This prevents false alarms on autoscaled clusters, where it's OK for pods to be temporarily pending.)
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/pending_pods/pending_pod_node_selector.yaml
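To see why the pod is unschedulable right away (the pod name is a placeholder):
kubectl get pods
kubectl describe pod <pending-pod-name>
The Events section should contain a FailedScheduling message explaining that no node matches the pod's node selector.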
To get notified automatically about issues like this, install Robusta.
ImagePullBackOff
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/image_pull_backoff/no_such_image.yaml
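You can watch the pod cycle between ErrImagePull and ImagePullBackOff (the pod name is a placeholder):
kubectl get pods -w
kubectl describe pod <failing-pod-name>
The Events section shows the exact image name that could not be pulled.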
To get notified automatically about issues like this, install Robusta.
Liveness Probe Failure
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/liveness_probe_fail/failing_liveness_probe.yaml
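The symptom here is a climbing restart count (the pod name is a placeholder):
kubectl get pods
kubectl describe pod <failing-pod-name>
Look for "Liveness probe failed" events and an increasing RESTARTS column as the kubelet keeps restarting the container.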
To get notified automatically about issues like this, install Robusta.
Readiness Probe Failure
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/readiness_probe_fail/failing_readiness_probe.yaml
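Unlike a failing liveness probe, a failing readiness probe does not restart the container; the pod simply never becomes Ready (the pod name is a placeholder):
kubectl get pods
kubectl describe pod <failing-pod-name>
The READY column stays at 0/1, and the pod is never added to Service endpoints.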
Job Failure
The job will fail after 60 seconds, then attempt to run again. After two attempts, it will fail for good.
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/job_failure/job_crash.yaml
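To follow the retries (the job name is a placeholder for whatever name the manifest creates):
kubectl get jobs
kubectl describe job <failing-job-name>
kubectl logs job/<failing-job-name>
describe shows the backoff and failure events; logs prints output from one of the job's pods.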
To get notified automatically about issues like this, install Robusta.
Failed Helm Releases
Deliberately deploy a failing Helm release:
helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
helm install kubewatch robusta/kubewatch --set='rbac.create=true,updateStrategy.type=Error' --namespace demo-namespace --create-namespace
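To confirm the release really is in a failed state (a sketch using standard Helm commands):
helm status kubewatch --namespace demo-namespace
helm list --namespace demo-namespace --all
helm status should report the release status as failed.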
Upgrade the release so it succeeds:
helm upgrade kubewatch robusta/kubewatch --set='rbac.create=true' --namespace demo-namespace
Clean up by removing the release and deleting the namespace:
helm uninstall kubewatch --namespace demo-namespace
kubectl delete namespace demo-namespace
To get notified about failed Helm releases, install Robusta and set up Helm Releases Monitoring.
Correlate Changes and Errors
Deploy a healthy pod. Then break it.
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/crashpod/healthy.yaml
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/crashpod/broken.yaml
If someone else made this change, would you be able to immediately pinpoint the change that broke the application?
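Without a change-tracking tool, the usual manual approach is to dig through rollout history (the deployment name is a placeholder; this assumes the breaking change produced revision 2):
kubectl rollout history deployment/<deployment-name>
kubectl rollout history deployment/<deployment-name> --revision=2
The --revision flag prints the pod template of that revision, so you can compare it with the previous one by hand.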
To get notified automatically about changes like this, install Robusta.
Track Deployment Changes
Create an nginx deployment. Then simulate multiple unexpected changes to this deployment.
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/deployment_image_change/before_image_change.yaml
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/deployment_image_change/after_image_change.yaml
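A quick way to see the current image after each apply (the deployment name is a placeholder for whatever name the manifests create):
kubectl get deployment <deployment-name> -o jsonpath='{.spec.template.spec.containers[0].image}'
kubectl rollout history deployment/<deployment-name>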
To get notified about changes like this, install Robusta and set up Kubernetes change tracking.
Track Ingress Changes
Create an ingress. Then change its path and secretName to simulate an unexpected ingress modification.
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/ingress_port_path_change/before_port_path_change.yaml
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/ingress_port_path_change/after_port_path_change.yaml
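To inspect the ingress before and after the change (the ingress name is a placeholder):
kubectl get ingress
kubectl describe ingress <ingress-name>
describe shows the current paths and TLS secret, but not what they looked like before the change.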
To get notified about changes like this, install Robusta and set up Kubernetes change tracking.
Drift Detection and Namespace Diff
Deploy two variants of the same application in different namespaces:
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/namespace_drift/example.yaml
Can you quickly tell the difference between the compare1 and compare2 namespaces? What is the drift between them?
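One manual way to answer that is to export both namespaces and diff them (a sketch; compare1 and compare2 are the namespaces the manifest creates):
kubectl get deployments -n compare1 -o yaml > compare1.yaml
kubectl get deployments -n compare2 -o yaml > compare2.yaml
diff compare1.yaml compare2.yaml
The raw diff is noisy because of generated fields such as resourceVersion and uid, which is why a purpose-built namespace diff is easier to read.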
To do so with Robusta, install Robusta and enable the UI.
Inefficient GKE Nodes
On GKE, nodes can reserve more than 50% of their CPU for system overhead, so users pay for CPU that is unavailable to their applications.
Reproduction:
- Create a default GKE cluster with Autopilot disabled. Don't change any other settings.
- Deploy the following pod:
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/gke_node_allocatable/gke_issue.yaml
- Run
kubectl get pods -o wide gke-node-allocatable-issue
The pod will be Pending. A Pod requesting 1 CPU cannot run on an empty node with 2 CPUs!
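To see where the CPU went, compare the node's capacity with its allocatable resources (the node name is a placeholder; pick any node from kubectl get nodes):
kubectl describe node <node-name>
Capacity lists the full 2 CPUs, while Allocatable shows what is actually left for pods after system and kubelet reservations.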
To see problems like this with Robusta, install Robusta and enable the UI.