Queue metrics broken when using the controller builder #436

Closed
@mrIncompetent

When starting a controller created with the controller builder, the following errors get logged:

E0518 20:13:50.847341   12082 client_go_adapter.go:318] descriptor Desc{fqName: "node-application_depth", help: "Current depth of workqueue: node-application", constLabels: {}, variableLabels: []} is invalid: "node-application_depth" is not a valid metric name
E0518 20:13:50.847688   12082 client_go_adapter.go:328] descriptor Desc{fqName: "node-application_adds", help: "Total number of adds handled by workqueue: node-application", constLabels: {}, variableLabels: []} is invalid: "node-application_adds" is not a valid metric name
E0518 20:13:50.848958   12082 client_go_adapter.go:339] descriptor Desc{fqName: "node-application_queue_latency", help: "How long an item stays in workqueuenode-application before being requested.", constLabels: {}, variableLabels: []} is invalid: "node-application_queue_latency" is not a valid metric name
E0518 20:13:50.849992   12082 client_go_adapter.go:350] descriptor Desc{fqName: "node-application_work_duration", help: "How long processing an item from workqueuenode-application takes.", constLabels: {}, variableLabels: []} is invalid: "node-application_work_duration" is not a valid metric name
E0518 20:13:50.851079   12082 client_go_adapter.go:363] descriptor Desc{fqName: "node-application_unfinished_work_seconds", help: "How many seconds of work node-application has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.", constLabels: {}, variableLabels: []} is invalid: "node-application_unfinished_work_seconds" is not a valid metric name
E0518 20:13:50.852133   12082 client_go_adapter.go:374] descriptor Desc{fqName: "node-application_longest_running_processor_microseconds", help: "How many microseconds has the longest running processor for node-application been running.", constLabels: {}, variableLabels: []} is invalid: "node-application_longest_running_processor_microseconds" is not a valid metric name
E0518 20:13:50.853298   12082 client_go_adapter.go:384] descriptor Desc{fqName: "node-application_retries", help: "Total number of retries handled by workqueue: node-application", constLabels: {}, variableLabels: []} is invalid: "node-application_retries" is not a valid metric name

controller-runtime version in use:

[[constraint]]
  name = "sigs.k8s.io/controller-runtime"
  version = "v0.2.0-beta.1"

Code:

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
	if err != nil {
		log.Fatal("Unable to start manager", zap.Error(err))
	}

	err = ctrl.NewControllerManagedBy(mgr).
		For(&corev1.Node{}).
		Owns(&corev1.ConfigMap{}).
		Complete(&controller.Reconciler{
			Client: mgr.GetClient(),
			Log:    log.Named("some_controller"),
		})
	if err != nil {
		log.Fatal("Unable to add some_controller", zap.Error(err))
	}

	log.Info("starting manager")
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		log.Fatal("problem running manager", zap.Error(err))
	}

The problematic code: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/builder/build.go#L234

func (blder *Builder) getControllerName() (string, error) {
	gvk, err := getGvk(blder.apiType, blder.mgr.GetScheme())
	if err != nil {
		return "", err
	}
	name := fmt.Sprintf("%s-application", strings.ToLower(gvk.Kind))
	return name, nil
}

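The generated controller name is used as the prefix for the workqueue metrics, so this leads to invalid metric names like node-application_depth. The hyphen is the problem: Prometheus metric names have to match [a-zA-Z_:][a-zA-Z0-9_:]*.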
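To make the failure easy to reproduce outside a cluster, here is a minimal sketch that checks names against the classic Prometheus metric name pattern. The helper isValidMetricName is a hypothetical name of mine, not something from controller-runtime or client_golang, and the check only approximates what the Prometheus client does when it rejects a descriptor.

```go
package main

import (
	"fmt"
	"regexp"
)

// metricNameRE is the classic pattern documented for Prometheus metric names.
// Hyphens are not part of the allowed character set.
var metricNameRE = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// isValidMetricName is a hypothetical helper (not a library function) that
// reports whether name matches the pattern above.
func isValidMetricName(name string) bool {
	return metricNameRE.MatchString(name)
}

func main() {
	// The first name is taken from the log output above; the second is the
	// same metric without the "-application" suffix.
	for _, name := range []string{"node-application_depth", "node_depth"} {
		fmt.Printf("%s valid=%t\n", name, isValidMetricName(name))
	}
}
```

The first name is reported as invalid and the second as valid, which matches the errors in the log.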

Potential solutions IMHO:

  • Remove the -application suffix
  • Add a Named(name string) function to the builder so controllers can have more meaningful names
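Roughly what I have in mind for both options, as a self-contained sketch. The builder type below is a hypothetical stand-in that only models the naming logic; it is not the real Builder from pkg/builder, and Named and controllerName are assumed names for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// builder is a hypothetical stand-in for the controller builder.
// Only the parts relevant to naming are modelled here.
type builder struct {
	kind string // Kind of the type passed to For(), e.g. "Node"
	name string // optional explicit name set through Named()
}

// Named sets an explicit controller name (second proposal).
// Callers are assumed to pass a name that is valid as a metric prefix.
func (b *builder) Named(name string) *builder {
	b.name = name
	return b
}

// controllerName returns the explicit name if one was set, otherwise the
// lowercased Kind without the "-application" suffix (first proposal).
func (b *builder) controllerName() string {
	if b.name != "" {
		return b.name
	}
	return strings.ToLower(b.kind)
}

func main() {
	def := &builder{kind: "Node"}
	fmt.Println(def.controllerName() + "_depth")

	named := (&builder{kind: "Node"}).Named("node_labeler")
	fmt.Println(named.controllerName() + "_depth")
}
```

With the default the depth metric would be called node_depth, and with an explicit name it would be node_labeler_depth. Neither contains a hyphen.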

WDYT?
