High CPU when using Gunicorn with multiprocess #568

Open
@onzo-mateuszzakarczemny

Description

I run a Flask app on Gunicorn 20.0.4 and Kubernetes. My Python is v3.7.5. I have a problem with CPU usage increasing over time.

(screenshot: CPU usage increasing over time)

After investigating stack traces with py-spy, I noticed that the issue is caused by the Prometheus metrics library: it doesn't clean up the metric files left behind by old workers, so over time the merge process takes more and more CPU.
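For context, the documented cleanup hook looks like this (a sketch of the standard gunicorn.conf.py setup; as far as I can tell, mark_process_dead only removes a dead worker's gauge_live*.db files, so counter and histogram files still accumulate):

```python
# gunicorn.conf.py -- the documented prometheus_client multiprocess hook.
# mark_process_dead(pid) removes only the gauge_live*.db files of the
# exited worker; counter_*.db and histogram_*.db are deliberately kept
# so their totals survive, which is why they pile up here.
from prometheus_client import multiprocess

def child_exit(server, worker):
    multiprocess.mark_process_dead(worker.pid)
```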

After deleting the following files, CPU usage dropped significantly:

```
ll -lh
total 7.0M
drwxr-xr-x 2 root root 4.0K Jul 24 06:13 ./
drwxrwxrwt 1 root root 4.0K Jul 19 14:14 ../
-rw-r--r-- 1 root root 1.0M Jul 24 08:14 counter_106.db
-rw-r--r-- 1 root root 1.0M Jul 23 18:41 counter_112.db
-rw-r--r-- 1 root root 1.0M Jul 24 04:07 counter_118.db
-rw-r--r-- 1 root root 1.0M Jul 24 04:54 counter_136.db
-rw-r--r-- 1 root root 1.0M Jul 24 08:40 counter_142.db
-rw-r--r-- 1 root root 1.0M Jul 20 16:44 counter_16.db
-rw-r--r-- 1 root root 1.0M Jul 20 11:24 counter_17.db
-rw-r--r-- 1 root root 1.0M Jul 21 01:40 counter_18.db
-rw-r--r-- 1 root root 1.0M Jul 21 20:14 counter_40.db
-rw-r--r-- 1 root root 1.0M Jul 21 17:17 counter_52.db
-rw-r--r-- 1 root root 1.0M Jul 21 21:29 counter_58.db
-rw-r--r-- 1 root root 1.0M Jul 23 07:19 counter_70.db
-rw-r--r-- 1 root root 1.0M Jul 22 19:49 counter_82.db
-rw-r--r-- 1 root root 1.0M Jul 22 18:59 counter_88.db
-rw-r--r-- 1 root root 1.0M Jul 24 08:43 histogram_106.db
-rw-r--r-- 1 root root 1.0M Jul 24 04:15 histogram_112.db
-rw-r--r-- 1 root root 1.0M Jul 24 05:02 histogram_118.db
-rw-r--r-- 1 root root 1.0M Jul 24 08:43 histogram_136.db
-rw-r--r-- 1 root root 1.0M Jul 24 08:43 histogram_142.db
-rw-r--r-- 1 root root 1.0M Jul 20 16:46 histogram_16.db
-rw-r--r-- 1 root root 1.0M Jul 20 11:45 histogram_17.db
-rw-r--r-- 1 root root 1.0M Jul 21 01:51 histogram_18.db
-rw-r--r-- 1 root root 1.0M Jul 21 22:41 histogram_40.db
-rw-r--r-- 1 root root 1.0M Jul 21 17:45 histogram_52.db
-rw-r--r-- 1 root root 1.0M Jul 22 01:44 histogram_58.db
-rw-r--r-- 1 root root 1.0M Jul 23 07:37 histogram_70.db
-rw-r--r-- 1 root root 1.0M Jul 23 01:01 histogram_82.db
-rw-r--r-- 1 root root 1.0M Jul 22 23:40 histogram_88.db
```

(screenshot: CPU usage dropping after the stale files were removed)
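Right now we do this by hand. A minimal sketch of what an automated workaround could look like, assuming the `<type>_<pid>.db` naming shown above (caveats: PIDs can be recycled, so a liveness check can be fooled, and deleting a dead worker's counter files discards its accumulated counts, which Prometheus will see as a counter reset):

```python
# Workaround sketch, not part of prometheus_client: drop .db files whose
# worker PID is no longer alive. Assumes the <type>_<pid>.db layout above.
import glob
import os
import re

def pid_alive(pid: int) -> bool:
    try:
        os.kill(pid, 0)  # signal 0 probes existence without killing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True

def remove_stale_db_files(multiproc_dir: str) -> None:
    for path in glob.glob(os.path.join(multiproc_dir, "*.db")):
        m = re.search(r"_(\d+)\.db$", os.path.basename(path))
        if m and not pid_alive(int(m.group(1))):
            os.remove(path)

remove_stale_db_files(os.environ["prometheus_multiproc_dir"])
```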

The issue is related to #275.

Can we avoid that somehow without periodically restarting the k8s pod? Maybe multiprocess mode should use the PID plus a UUID generated per worker, rather than just the PID, for file names? Then the master could remove/merge files from dead workers.
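Something along these lines (purely hypothetical; as far as I can tell the identifier is currently just os.getpid() inside prometheus_client, so worker_file_id below illustrates the proposal, it is not an existing hook):

```python
# Hypothetical proposal: name files by PID plus a per-worker UUID so the
# master can distinguish a dead worker's files from a new worker that
# happens to reuse the same PID. Not an existing prometheus_client API.
import os
import uuid

def worker_file_id() -> str:
    # would yield e.g. "counter_142_9f1b2c3d.db" instead of "counter_142.db"
    return f"{os.getpid()}_{uuid.uuid4().hex[:8]}"
```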
