Kubo OOM and locks our servers when pinning a big amount of CID #10011
Comments
Cc @marten-seemann could you please take a look at those profiles?
Maybe related: quic-go/quic-go#3883.
Thank you for the feedback, we have relaunched both nodes without anything using QUIC for now.
@Mayeu how did you do that exactly?
@Jorropo I deactivated both
Hard to tell for now, RAM growth seems pretty similar to the previous configuration; previously Kubo was killed after 9-11h of uptime. The left part is our previous run, up until this morning when the server stopped answering. The right part is since we deactivated QUIC. Here is a profile taken right now. We have definitely seen a drop in pins/s since this morning, maybe some of our peers were only using QUIC.
A note on the profile in my last comment: there is apparently still memory allocated to QUIC, which may be related to #9895.
Yes, thanks, that is good. I wanted to be sure you didn't just remove the QUIC multiaddresses.
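A minimal sketch of the distinction being made in the exchange above, assuming the stock Kubo config keys (not necessarily the exact commands used): disabling the transports themselves, rather than only removing the QUIC listen addresses.

```sh
# Switch off the QUIC transport and WebTransport (which also runs over QUIC),
# then restart the daemon:
ipfs config --json Swarm.Transports.Network.QUIC false
ipfs config --json Swarm.Transports.Network.WebTransport false

# Merely deleting the /quic-v1 entries from Addresses.Swarm would stop QUIC
# listening, but the node could still dial peers over QUIC.
```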
Triage notes:
Checklist
Installation method
built from source
Version
Config
Description
Hello,
In the past month, we have been slowly pinning millions of CIDs with a 2-server IPFS Cluster. We are currently at around 9M pinned CIDs out of a total of 13.5M. Kubo has regularly been killed by the system for consuming all the memory, and from time to time it even completely locks up our servers and requires a hard reboot.
We were waiting for the 0.21.0 release before opening this ticket, since we thought it would reduce RAM consumption, but in the past 24h both of our servers have locked up again.
Both servers have the following spec:
We have tried a lot of different configurations, including disabling bandwidth metrics, disabling DHT server mode, and enabling or disabling the Accelerated DHT client, but whatever configuration we tried, Kubo always ends up consuming all available memory.
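For reference, a sketch of the config knobs these attempts correspond to, assuming the key names used by recent Kubo releases (the Accelerated DHT client option moved out of `Experimental` in 0.21.0); the exact combinations tried varied between runs:

```sh
ipfs config --json Swarm.DisableBandwidthMetrics true   # disable bandwidth metrics
ipfs config Routing.Type dhtclient                       # run the DHT as a client only (no server)
ipfs config --json Routing.AcceleratedDHTClient false    # toggle the Accelerated DHT client
```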
We are currently running Kubo with `GOGC=50` and `GOMEMLIMIT=80GiB`.
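Both are standard Go runtime environment variables; a minimal sketch of applying them, assuming the daemon is launched from a shell rather than a service manager:

```sh
# GOGC=50 makes the garbage collector run more aggressively;
# GOMEMLIMIT sets a soft memory limit for the Go runtime.
GOGC=50 GOMEMLIMIT=80GiB ipfs daemon
```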
Here are two `ipfs diag profile` dumps taken today:

In case it's relevant: for ipfs-cluster we followed the setup guide in the documentation, and we are keeping around 50k pins in the queue.
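The profiles above come from Kubo's built-in profiling command, which bundles CPU, heap, goroutine, and other runtime profiles from the running daemon into a single zip archive:

```sh
# Collects runtime profiles from the running daemon and writes a
# zip archive (ipfs-profile-*.zip) into the current directory.
ipfs diag profile
```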
ipfs-cluster configuration