Kibana floods logs causing crash and bootlooping #47607

@filipmnowak

Description

Kibana version:

v7.1.1

Elasticsearch version:

v7.2.0

Server OS version:

n/a

Browser version:

n/a

Browser OS version:

n/a

Original install method (e.g. download page, yum, from source, etc.):

n/a

Describe the bug:

When Kibana can't update .kibana_task_manager, it floods the logs with the same message at a relatively high frequency.

In the case I was dealing with, Kibana was running in a container (ECE), and it logged the message I am sharing below around 1 million times:

{"type":"log","@timestamp":"2019-09-19T10:43:35Z","tags":["error","task_manager"],"pid":22,"message":"Failed to poll for work: [cluster_block_exception] index [.kibana_task_manager] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)]; :: {\"path\":\"/.kibana_task_manager/_update/oss_telemetry-vis_telemetry\",\"query\":{\"if_seq_no\":356,\"if_primary_term\":3,\"refresh\":\"true\"},\"body\":\"{\\\"doc\\\":{\\\"type\\\":\\\"task\\\",\\\"task\\\":{\\\"taskType\\\":\\\"vis_telemetry\\\",\\\"state\\\":\\\"{\\\\\\\"runs\\\\\\\":73,\\\\\\\"stats\\\\\\\":{\\\\\\\"table\\\\\\\":{\\\\\\\"total\\\\\\\":3,\\\\\\\"spaces_min\\\\\\\":3,\\\\\\\"spaces_max\\\\\\\":3,\\\\\\\"spaces_avg\\\\\\\":3},\\\\\\\"metrics\\\\\\\":{\\\\\\\"total\\\\\\\":5,\\\\\\\"spaces_min\\\\\\\":5,\\\\\\\"spaces_max\\\\\\\":5,\\\\\\\"spaces_avg\\\\\\\":5}}}\\\",\\\"params\\\":\\\"{}\\\",\\\"attempts\\\":0,\\\"scheduledAt\\\":\\\"2019-06-06T14:59:21.492Z\\\",\\\"runAt\\\":\\\"2019-09-19T10:48:35.735Z\\\",\\\"status\\\":\\\"running\\\"},\\\"kibana\\\":{\\\"uuid\\\":\\\"ee7c7262-7970-41f2-8f7c-095ee118118c\\\",\\\"version\\\":7010199,\\\"apiVersion\\\":1}}}\",\"statusCode\":403,\"response\":\"{\\\"error\\\":{\\\"root_cause\\\":[{\\\"type\\\":\\\"cluster_block_exception\\\",\\\"reason\\\":\\\"index [.kibana_task_manager] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];\\\"}],\\\"type\\\":\\\"cluster_block_exception\\\",\\\"reason\\\":\\\"index [.kibana_task_manager] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];\\\"},\\\"status\\\":403}\"}"}

The container's disk quota was exhausted and Kibana started endlessly boot-looping.
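For context (not part of the original report): the FORBIDDEN/12 block in the message above is the one Elasticsearch applies when an index hits the flood-stage disk watermark, and on this version it stays in place until it is removed explicitly. A minimal sketch of clearing it, assuming an unauthenticated Elasticsearch node on localhost:9200 and Node 18+ for the built-in fetch:

```ts
// Remove the read-only/allow-delete block from an index.
// Assumes an unauthenticated Elasticsearch node on localhost:9200
// and Node 18+ (built-in fetch); adjust URL and auth as needed.
const ES = "http://localhost:9200";

async function clearReadOnlyBlock(index: string): Promise<void> {
  const res = await fetch(`${ES}/${index}/_settings`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    // Setting the value to null deletes the setting, which lifts the block.
    body: JSON.stringify({ "index.blocks.read_only_allow_delete": null }),
  });
  if (!res.ok) throw new Error(`${res.status}: ${await res.text()}`);
}

clearReadOnlyBlock(".kibana_task_manager").catch(console.error);
```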

Steps to reproduce:

I gave it a quick shot on my own cluster, but I failed to trigger this effect, even after marking .kibana_task_manager read-only (one way to apply that block by hand is sketched below).
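For anyone else attempting a reproduction, this is the mirror image of the clearing call shown earlier: apply the same block Elasticsearch would set at the flood-stage watermark, then watch the Kibana log. Same assumptions as above (unauthenticated node on localhost:9200, Node 18+); purely illustrative.

```ts
// Apply the read-only/allow-delete block by hand to try to
// reproduce the log flood. Purely illustrative.
const ES = "http://localhost:9200";

async function setReadOnlyAllowDelete(index: string): Promise<void> {
  const res = await fetch(`${ES}/${index}/_settings`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ "index.blocks.read_only_allow_delete": true }),
  });
  if (!res.ok) throw new Error(`${res.status}: ${await res.text()}`);
}

setReadOnlyAllowDelete(".kibana_task_manager")
  .then(() => console.log("block applied; watch the Kibana log"))
  .catch(console.error);
```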

Expected behavior:

Throttling, plus a summary line like "last message repeated 24213 times". I guess such a mechanism would be useful in general (a sketch follows below).
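A minimal sketch of such a deduplicating wrapper; the names here are hypothetical and this is not Kibana's actual logging API:

```ts
// Suppress consecutive duplicate log messages and emit a summary
// line when a different message finally arrives.
type LogFn = (msg: string) => void;

function dedupingLogger(log: LogFn): LogFn {
  let lastMsg: string | undefined;
  let skipped = 0;

  return (msg: string) => {
    if (msg === lastMsg) {
      skipped++; // identical to the previous message: swallow it
      return;
    }
    if (skipped > 0) {
      log(`last message repeated ${skipped} more times`);
    }
    lastMsg = msg;
    skipped = 0;
    log(msg);
  };
}

// Usage: wrap the underlying logger once, then log through the wrapper.
const log = dedupingLogger(console.log);
log("Failed to poll for work: ...");
log("Failed to poll for work: ..."); // suppressed
log("Failed to poll for work: ..."); // suppressed
log("recovered");                    // prints the summary, then this line
```

One caveat of this shape: the summary line is only flushed when a different message arrives, so a real implementation would likely also flush on a timer.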

Screenshots (if relevant):

n/a

Errors in browser console (if relevant):

n/a

Provide logs and/or server output (if relevant):

Contact me in private.

Any additional context:

Metadata

Labels

Team:ResponseOps (Platform ResponseOps team, formerly the Cases and Alerting teams); enhancement (new value added to drive a business result)
