
Do not report problems until kube-apiserver is ready #295

Closed
@yguo0905

Description

In #288, we changed NPD to run custom plugins on startup. I hoped this would let NPD report an event as soon as a cluster is created, no matter how large invoke_interval is.

However, this does not always work because of how NPD interacts with kube-apiserver. This is what I observed during cluster creation:

  1. NPD started and invoked the custom plugin immediately, and then sent an event to kube-apiserver.
  2. The event could not be sent because kube-apiserver was not running yet. The event library retries sending the event.
    Unable to write event: 'Post https://x.x.x.x/api/v1/namespaces/default/events: dial tcp 34.68.6.201:443: connect: connection refused' (may retry after sleeping)
  3. kube-apiserver started.
  4. The event was re-sent to kube-apiserver, but this time it was rejected without further retry because of a permission error:
    events is forbidden: User "system:node-problem-detector" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)
  5. https://github.com/kubernetes/kubernetes/blob/c8b45cd25c18e65798dde49fc7011495ea6021d5/cluster/gce/gci/configure-helper.sh#L568 was called to set up the permission.

There is a small window between (3) and (5): if the event is rejected during that interval, it is never resent.
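The timeline above can be reproduced with a toy model. This is only a sketch of the behavior described in the logs, not the event library's actual code: `deliver`, `errConnRefused`, and `errForbidden` are hypothetical names, and the only assumption is that connection errors are retried while a permission error drops the event.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the two failures seen in the logs above.
var (
	errConnRefused = errors.New("connection refused") // "may retry after sleeping"
	errForbidden   = errors.New("events is forbidden") // "will not retry!"
)

// deliver replays one event against a sequence of apiserver responses,
// one response per attempt. It reports whether the event was delivered
// and how many attempts were made.
func deliver(responses []error) (delivered bool, attempts int) {
	for _, err := range responses {
		attempts++
		switch {
		case err == nil:
			return true, attempts
		case errors.Is(err, errForbidden):
			// Permission errors are treated as permanent: the event is dropped.
			return false, attempts
		}
		// Any other error is treated as transient and retried.
	}
	return false, attempts
}

func main() {
	// Retry lands between (3) and (5): apiserver is up, permissions are not.
	fmt.Println(deliver([]error{errConnRefused, errForbidden, nil}))
	// Retry lands after (5): the event goes through.
	fmt.Println(deliver([]error{errConnRefused, errConnRefused, nil}))
}
```

In the first timeline the third response would have succeeded, but it is never reached because the event is dropped on the permission error.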

Changing the event library to always retry on permission errors may or may not make sense. What we can do in NPD, however, is introduce a configurable initial_delay for custom plugins. In this case, I could set it to 1m while keeping invoke_interval at 6h, so that the plugin first runs 1m after NPD starts.

/cc @wangzhen127 @Random-Liu
