Reconciler panics should not crash the manager

Currently, an unhandled panic in a reconciler is not recovered, so it will likely crash the manager binary. This is a problem because a panic may be triggered by a single resource in an unexpected state, which means one bad resource can prevent all other resources from being processed. Since Kubernetes is likely to restart the manager pod after a crash, this can also cause the manager to DoS the Kubernetes API server as it continually restarts.

In my project, I wrote this utility function:

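```go
import (
	"fmt"
	"runtime/debug"

	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// MakeSafe wraps r so that a panic in Reconcile is returned as an error.
func MakeSafe(r reconcile.Reconciler) reconcile.Reconciler {
	return safeReconciler{impl: r}
}

type safeReconciler struct {
	impl reconcile.Reconciler
}

// Reconcile delegates to the wrapped reconciler. The named return values let
// the deferred function overwrite the result after recovering from a panic.
func (s safeReconciler) Reconcile(request reconcile.Request) (result reconcile.Result, err error) {
	defer func() {
		// rec is named distinctly so it does not shadow the receiver.
		if rec := recover(); rec != nil {
			result = reconcile.Result{}
			err = fmt.Errorf("panic: %v [recovered]\n\n%s", rec, debug.Stack())
		}
	}()
	return s.impl.Reconcile(request)
}
```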

Every time I pass a reconciler to `Complete`, I wrap it with `MakeSafe`. This ensures that any panic raised by the reconciler is converted to a normal error.
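To show the effect without needing a cluster, here is a minimal, dependency-free sketch of the same pattern. `Request`, `Result`, `Reconciler`, `ReconcilerFunc`, and `flaky` are hypothetical stand-ins I made up for illustration; they are not controller-runtime types, and only the recover-to-error wrapper mirrors the utility above.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// Request and Result are hypothetical stand-ins for reconcile.Request
// and reconcile.Result, so the sketch compiles without controller-runtime.
type Request struct{ Name string }
type Result struct{ Requeue bool }

// Reconciler is a stand-in for the reconcile.Reconciler interface.
type Reconciler interface {
	Reconcile(req Request) (Result, error)
}

// ReconcilerFunc adapts a plain function to the Reconciler interface.
type ReconcilerFunc func(Request) (Result, error)

func (f ReconcilerFunc) Reconcile(req Request) (Result, error) { return f(req) }

// MakeSafe wraps r so that a panic in Reconcile is returned as an error.
func MakeSafe(r Reconciler) Reconciler { return safeReconciler{impl: r} }

type safeReconciler struct{ impl Reconciler }

func (s safeReconciler) Reconcile(req Request) (result Result, err error) {
	defer func() {
		if rec := recover(); rec != nil {
			// Discard any partial result and report the panic as an error.
			result = Result{}
			err = fmt.Errorf("panic: %v [recovered]\n\n%s", rec, debug.Stack())
		}
	}()
	return s.impl.Reconcile(req)
}

// flaky simulates a reconciler that panics for one "bad" resource only.
func flaky(req Request) (Result, error) {
	if req.Name == "bad" {
		var m map[string]string
		m["key"] = "value" // panics: assignment to entry in nil map
	}
	return Result{Requeue: true}, nil
}
```

With this wrapper, `MakeSafe(ReconcilerFunc(flaky)).Reconcile(Request{Name: "bad"})` returns a non-nil error instead of taking down the process, while a request for any other name still returns its normal result. The bad resource fails on its own and the rest keep being processed.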
