Description
Related history issues: #441, #430
Can we aggregate all db files that are older than some age and belong to non-current pids into a single total db file, to keep the number of per-pid files under control?
I have implemented this idea in Go; here are some details:
Project deploy info:
- gunicorn + Django
- 128 workers
- gunicorn `max_requests: 10000` (so a new pid file is created almost every minute)
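For context, the deployment above corresponds to a gunicorn config roughly like this (values taken from the description; the file name and comments are illustrative, not from the original):

```python
# gunicorn.conf.py -- sketch of the deployment described above
workers = 128          # 128 worker processes
max_requests = 10000   # each worker is recycled after 10k requests,
                       # so under this load a fresh worker pid (and a
                       # fresh set of db files) appears about every minute
```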
- I can't stop the number of pid files from growing; it reaches about 6,000 in four days.
- I tried deleting expired pid files periodically in code, but that makes counters drop in Grafana.
- The time to serve a metrics request keeps growing the longer the program runs.
Improve scrape efficiency:
I rewrote the metric-aggregation logic in Go (metric generation still uses Python). After the rewrite, each scrape takes less than one second.
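The core of that aggregation step is just summing samples that share the same (metric, labels) key across all per-pid files. A minimal sketch of that logic, using a simplified in-memory stand-in for the contents of the db files (the key layout here is hypothetical, not the library's real on-disk format):

```python
from collections import defaultdict


def merge_samples(per_pid_samples):
    """Sum samples sharing a (metric, labels) key across pid files.

    per_pid_samples: iterable of dicts, each mapping a
    (metric_name, labels) key to a float value -- a simplified
    stand-in for one per-pid db file.
    """
    totals = defaultdict(float)
    for samples in per_pid_samples:
        for key, value in samples.items():
            totals[key] += value
    return dict(totals)
```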
Solving the growing pid files:
Aggregate all db files that are older than some age and belong to non-current pids into one total db file, then delete those files. When calculating metrics: history total db + current pid dbs = the complete current result. (I run this every hour.)
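The hourly compaction step can be sketched as follows. This is a sketch under stated assumptions, not the library's real implementation: it assumes a hypothetical JSON file format (one `{sample_key: value}` dict per file), a hypothetical `counter_<pid>.json` naming scheme, and that the caller supplies the set of live worker pids:

```python
import glob
import json
import os


def compact(db_dir, live_pids, total_path):
    """Fold samples from dead-pid db files into one total file,
    then delete those per-pid files.
    """
    # Load the existing history total, if any.
    total = {}
    if os.path.exists(total_path):
        with open(total_path) as f:
            total = json.load(f)

    for path in glob.glob(os.path.join(db_dir, "counter_*.json")):
        pid = int(os.path.basename(path).split("_")[1].split(".")[0])
        if pid in live_pids:
            continue  # never touch a file a live worker may still write
        with open(path) as f:
            for key, value in json.load(f).items():
                total[key] = total.get(key, 0.0) + value
        os.remove(path)  # safe: its samples now live in the total file

    with open(total_path, "w") as f:
        json.dump(total, f)
```

At scrape time the reader then merges `total_path` with the remaining live-pid files, so deleting the dead-pid files no longer makes counters drop in Grafana.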
Now the number of pid files stays under 200 in my project. If this change could be made upstream, it would be a big improvement. Prometheus itself also compacts historical data in a similar way.