add experimental mmap api #5679

Closed
wants to merge 3 commits into from

splitice
Contributor

What this PR does / why we need it:

Uses mmap for compressed chunk data to reduce pressure on Linux page reclaim: Go does not return pages to system memory until the system is under high memory pressure. This can cause extremely high CPU load peaks, because all the reclaim work is deferred until the system is under pressure.

10001          9 31.1 29.8 23733032 2429272 ?    Ssl  Mar19 301:49 /usr/bin/loki -config.file=/etc/loki/config/config.yaml -target=ingester -config.expand-env=true

2 GB RSS, 23 GB virtual, on a 24 GB RAM system under high reclaim pressure (process stalled).
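
For readers unfamiliar with the approach, here is a minimal sketch of keeping a compressed block in an mmap'd region instead of on the Go heap. It is not the code from this PR: the `mmapBuffer` type and its methods are hypothetical names, and the use of an anonymous private mapping through `syscall.Mmap` is an assumption made for illustration. It builds on Unix-like systems only.

```go
package main

import (
	"fmt"
	"syscall"
)

// mmapBuffer is a hypothetical holder for bytes stored in an anonymous
// private mapping, outside the Go heap and invisible to the garbage collector.
type mmapBuffer struct {
	data []byte
}

// newMmapBuffer maps len(src) bytes of anonymous memory and copies src into it.
func newMmapBuffer(src []byte) (*mmapBuffer, error) {
	if len(src) == 0 {
		return &mmapBuffer{}, nil // mmap rejects zero-length mappings
	}
	mem, err := syscall.Mmap(-1, 0, len(src),
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, fmt.Errorf("mmap %d bytes: %w", len(src), err)
	}
	copy(mem, src)
	return &mmapBuffer{data: mem}, nil
}

// Bytes returns the mapped bytes. The slice is invalid after Close.
func (b *mmapBuffer) Bytes() []byte { return b.data }

// Close unmaps the region, handing the pages back to the kernel immediately.
func (b *mmapBuffer) Close() error {
	if b.data == nil {
		return nil
	}
	err := syscall.Munmap(b.data)
	b.data = nil
	return err
}

func main() {
	buf, err := newMmapBuffer([]byte("compressed block bytes"))
	if err != nil {
		panic(err)
	}
	defer buf.Close()
	fmt.Println("bytes held outside the Go heap:", len(buf.Bytes()))
}
```

The trade-off in a design like this is that the mapping has to be released explicitly with `Close`; the garbage collector will not do it, so a missed `Close` leaks memory.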

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Checklist

  • Documentation added
  • Tests updated
  • Add an entry in the CHANGELOG.md about the changes.

@splitice splitice force-pushed the feature/mmap-chunks branch from 598142c to b16b34b Compare March 20, 2022 01:52
@splitice splitice force-pushed the feature/mmap-chunks branch from 3cb0a84 to 50149ee Compare March 20, 2022 02:45
@splitice
Contributor Author

Results from one of our clusters -

With mmap (after ~10 hours):

loki-distributed-ingester-0                        556m         740Mi
loki-distributed-ingester-1                        692m         898Mi
loki-distributed-ingester-2                        273m         725Mi

Without mmap (after ~6 hours):

loki-distributed-ingester-0                        320m         2340Mi
loki-distributed-ingester-1                        237m         2468Mi
loki-distributed-ingester-2                        159m         2120Mi

That's more like it. I don't know exactly what Go was doing with its GC and memory management, but it's clearly incredibly wasteful. mmap for the compressed blocks is incredible by comparison.
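
One way to look into what the runtime is holding on to is to compare `HeapIdle` and `HeapReleased` from `runtime.MemStats`; the Go documentation describes their difference as an estimate of memory that could be returned to the OS but is being retained. The sketch below is an illustration only and was not part of the measurements above; `retainedIdleBytes` and `churn` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// retainedIdleBytes estimates heap memory that the Go runtime holds in idle
// spans and has not yet returned to the operating system.
func retainedIdleBytes() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	if m.HeapIdle < m.HeapReleased {
		return 0
	}
	return m.HeapIdle - m.HeapReleased
}

// churn allocates n bytes, touches every byte so the pages are backed by
// physical memory, and then lets the buffer become garbage.
func churn(n int) {
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = 1
	}
}

func main() {
	churn(64 << 20)
	runtime.GC() // collect the dropped buffer so its spans become idle

	before := retainedIdleBytes()
	debug.FreeOSMemory() // force a GC and ask the runtime to release idle memory
	after := retainedIdleBytes()

	fmt.Printf("retained idle heap before release: %d KiB\n", before/1024)
	fmt.Printf("retained idle heap after release:  %d KiB\n", after/1024)
}
```

The exact figures depend on the Go version and platform, so treat the output as a diagnostic hint rather than a benchmark.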

In its current form this PR would probably need some changes (perhaps making it configurable?). If anyone from the Loki team wants to explore merging this, I'm happy to clean it up and make it configurable.

@splitice
Contributor Author

After just over 24 hours:

name                                               cpu          mem
loki-distributed-ingester-0                        601m         615Mi
loki-distributed-ingester-1                        619m         725Mi
loki-distributed-ingester-2                        361m         581Mi

Bloody brilliant

@splitice splitice mentioned this pull request Mar 21, 2022
@stale

stale bot commented Apr 25, 2022

Hi! This issue has been automatically marked as stale because it has not had any
activity in the past 30 days.

We use a stalebot among other tools to help manage the state of issues in this project.
A stalebot can be very useful in closing issues in a number of cases; the most common
is closing issues or PRs where the original reporter has not responded.

Stalebots are also emotionless and cruel and can close issues which are still very relevant.

If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.

We regularly sort for closed issues which have a stale label sorted by thumbs up.

We may also:

  • Mark issues as revivable if we think it's a valid issue but isn't something we are likely
    to prioritize in the future (the issue will still remain closed).
  • Add a keepalive label to silence the stalebot if the issue is very common/popular/important.

We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task,
our sincere apologies if you find yourself at the mercy of the stalebot.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Apr 25, 2022
@stale stale bot closed this May 2, 2022
Labels
size/M stale A stale issue or PR that will automatically be closed.