Loki docker driver: sporadic high CPU usage #3319

Closed
@glebsa8

Description

Sometimes I observe high CPU usage by /bin/docker-driver that does not correlate with the volume of logs flowing through it. I track CPU usage with htop. By "high CPU usage" I mean 20-40% instead of the usual 1-9%. The only thing that reliably helps is restarting the Docker engine.
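To record the CPU usage over time instead of watching htop, something like the following could be left running on an affected instance. This is only a rough sketch, not part of the driver or of Loki: it assumes a Linux host where the plugin process is visible in /proc with a command line ending in `docker-driver`, and the helper names (`cpu_ticks`, `cpu_percent`, `find_pids`, `watch`) are made up for illustration. Percentages are relative to a single core.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is parenthesised and may contain spaces or ')',
    # so cut at the last ')' instead of splitting the whole line.
    rest = stat_line[stat_line.rfind(")") + 2:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int,
                interval_s: float, ticks_per_s: int) -> float:
    """CPU usage over the interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def find_pids(name: str = "docker-driver") -> list:
    """Return PIDs whose argv[0] ends with `name` (assumed process name)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                argv0 = f.read().split(b"\0")[0].decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if argv0.endswith(name):
            pids.append(int(entry))
    return pids


def watch(name: str = "docker-driver", interval_s: float = 10.0,
          samples: int = 6) -> None:
    """Print one CPU reading per matching process for each interval."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    previous = {}
    for i in range(samples + 1):
        if i:
            time.sleep(interval_s)
        current = {}
        for pid in find_pids(name):
            try:
                with open(f"/proc/{pid}/stat") as f:
                    current[pid] = cpu_ticks(f.read())
            except OSError:
                continue
        for pid, ticks in current.items():
            if pid in previous:
                pct = cpu_percent(previous[pid], ticks, interval_s, ticks_per_s)
                print(f"{time.strftime('%H:%M:%S')} pid={pid} cpu={pct:.1f}%")
        previous = current
```

Calling `watch(samples=360)` would print a reading every 10 seconds for an hour. Redirecting the output to a file gives timestamps that can be compared against log volume.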

To Reproduce

There are no stable steps to reproduce. A possible trigger is a temporary increase (followed by a decrease) in log volume.

Environment:

We use Amazon Linux 1 and 2 (both affected). The instance type is AWS t3a.small. The Docker driver runs on ECS instances, with settings defined in /etc/docker/daemon.json:

{
  "log-opts": {
    "loki-url": "http://loki:3100/loki/api/v1/push",
    "no-file": "true",
    "labels": "com.amazonaws.ecs.cluster,com.amazonaws.ecs.container-name,com.amazonaws.ecs.task-definition-family,com.amazonaws.ecs.task-definition-version",
    "loki-relabel-config": "- {action: labelmap, regex: com_amazonaws_(.+)}\n- {action: labelkeep, regex: (host|image_id|image_name|ecs_cluster|ecs_container_name|ecs_task_definition_family|ecs_task_definition_version)}\n",
    "max-buffer-size": "100m",
    "mode": "non-blocking",
    "loki-external-labels": "image_id={{.ImageID}},image_name={{.ImageName}}"
  },
  "log-driver": "loki"
}
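To make the intent of `loki-relabel-config` explicit, here is a small sketch that mimics the two rules in plain Python. It is not the driver's code. It assumes Prometheus-style relabeling semantics (fully anchored regexes, `labelmap` copying matching labels to the captured name, `labelkeep` dropping every label whose name does not match) and assumes the ECS Docker labels reach the relabel stage with dots and dashes already turned into underscores, as the `com_amazonaws_(.+)` regex above implies. The input labels and their values are invented for illustration.

```python
import re


def labelmap(labels: dict, regex: str, replacement: str = r"\1") -> dict:
    """Copy every label whose name fully matches `regex` to the expanded name."""
    pattern = re.compile(regex)
    out = dict(labels)
    for name, value in labels.items():
        match = pattern.fullmatch(name)
        if match:
            out[match.expand(replacement)] = value
    return out


def labelkeep(labels: dict, regex: str) -> dict:
    """Keep only labels whose name fully matches `regex`."""
    pattern = re.compile(regex)
    return {n: v for n, v in labels.items() if pattern.fullmatch(n)}


def apply_rules(labels: dict) -> dict:
    """Apply the two rules from the daemon.json above, in order."""
    labels = labelmap(labels, r"com_amazonaws_(.+)")
    return labelkeep(
        labels,
        r"(host|image_id|image_name|ecs_cluster|ecs_container_name"
        r"|ecs_task_definition_family|ecs_task_definition_version)",
    )


# Illustrative input only; names and values are assumptions.
example = {
    "host": "ip-10-0-0-1",
    "container_name": "web",  # hypothetical label, not on the keep list
    "com_amazonaws_ecs_cluster": "prod",
    "com_amazonaws_ecs_container_name": "web",
    "image_name": "nginx",
}
print(sorted(apply_rules(example)))
```

Under these assumptions each stream ends up with only the short `ecs_*` names plus `host`, `image_id` and `image_name`. The original `com_amazonaws_*` labels and anything else not on the keep list are dropped.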

Additional info

I understand that this issue is hard to solve without concrete steps to reproduce. If you can, please describe what I should do to help track down the root cause the next time I notice it (I expect I will: I have seen it more than twice in the past three months).
