Lighthouse dying because of short memory spikes #5263

Open
Aracki opened this issue Feb 19, 2024 · 6 comments
Labels
question Further information is requested

Comments

@Aracki

Aracki commented Feb 19, 2024

Description

A few times a day, our Lighthouse containers experience short but high memory spikes. Before each spike, the logs show a lot of:

ERRO Unable to validate attestation error: ObservedAttestationsError(SlotTooLow { slot: Slot(1035781), lowest_permissible_slot: Slot(1035797) }), peer_id: 16Uiu2HAkzJkSWstNAD1PR915MBhyByDbw8W1DJ5q7JJkMZa3saQd, type: "aggregated", slot: Slot(1035781), beacon_block_root: 0x74c201e1dfd6cc77dbc13ab8b93a791a0fc5f76aa0fcf42971cd0d4666aff3d1"

Besides this error, nothing useful can be seen (we haven't enabled DEBUG though).

Version

sigp/lighthouse:v4.6.0

Interestingly, this happens only on the Holesky network. We also run Lighthouse for Mainnet and Sepolia, but we don't see these issues there.

@GirnaarNodes

We are also facing the same issue.

@michaelsproul
Member

How much memory is Lighthouse using when it OOMs? If you run sudo dmesg -T | grep killed, it will show the resident set size (RSS).
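
For reference, a minimal shell sketch of pulling the RSS figure out of the kernel log. It assumes the kernel reports OOM kills in the common "Killed process <pid> (<name>) total-vm:...kB, anon-rss:...kB, ..." form, and it matches case-insensitively because recent kernels capitalise "Killed". The function name oom_anon_rss_kb is made up for illustration, and the exact message layout can vary between kernel versions.

```shell
# Hypothetical helper (not part of Lighthouse): read kernel log text on stdin
# and print the anon-rss value, in kB, of each OOM kill of the named process.
oom_anon_rss_kb() {
  # $1: process name as shown in parentheses in the kernel message
  grep -i 'killed process' \
    | grep -F "($1)" \
    | sed -n 's/.*anon-rss:\([0-9]*\)kB.*/\1/p'
}

# Usage (needs permission to read the kernel log):
#   sudo dmesg -T | oom_anon_rss_kb lighthouse
```

Note that anon-rss is only the anonymous part of the resident set; the same log line also reports file-rss and shmem-rss. Dividing the kB value by 1048576 gives GiB, which can then be compared with the container memory limit.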

On Holesky, Lighthouse routinely needs ~8GB of RAM. We are working on optimising this, but Holesky is inherently more resource hungry than other networks due to the higher validator count.

We've also fixed some issues for v5.0.0 which should improve memory usage, see:

And related issue:

@michaelsproul
Member

The SlotTooLow error isn't related. It was a separate, low-impact bug that is also fixed for v5.0.0:

michaelsproul added the question (Further information is requested) label on Feb 19, 2024
@Aracki
Author

Aracki commented Feb 20, 2024

Mean usage is around ~8Gi yes, but it seems it hits our memory limit of 12Gi very often.

@chong-he
Member

> Mean usage is around ~8Gi yes, but it seems it hits our memory limit of 12Gi very often.

12GB is tough. At least 16GB is required and 32GB is recommended, particularly on Holesky, where the number of validators is high.

@michaelsproul
Member

This PR will also help: #5270

Once that's merged you could try running unstable with --state-cache-size 2. We'll release v5.1.0 with that change quite soon.


4 participants