RPC chunks with streaming SSZ decoding, snappy frames, and stricter DOS limits where possible. #1606
Conversation
@arnetheduck @AgeManning @nisdas Please have a look at this proposal. The PR description may not be that clear; the updated networking doc explains it best.
specs/phase0/p2p-interface.md (outdated)
A reader:
- SHOULD NOT read more than `max_encoded_len(n)` bytes (`32 + n + n/6`) after reading the SSZ length prefix `n` from the header; [this is considered the worst-case compression result by Snappy](https://github.com/google/snappy/blob/537f4ad6240e586970fe554614542e9717df7902/snappy.cc#L98).
- SHOULD NOT accept an SSZ length prefix that is bigger than the expected maximum length for the SSZ type (derived from SSZ type information such as vector lengths and list limits).
- MUST consider any remaining bytes, after having read the `n` SSZ bytes, as invalid input. An EOF is expected.
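A hedged sketch of the reader-side checks above, in Python. The function names and the `max_ssz_len` parameter are illustrative, not part of the spec:

```python
def max_encoded_len(n: int) -> int:
    # Worst-case snappy compression output for n input bytes,
    # per the 32 + n + n/6 bound referenced above.
    return 32 + n + n // 6

def check_length_prefix(n: int, max_ssz_len: int) -> int:
    # Reject a length prefix larger than the type-derived maximum
    # (SHOULD NOT accept), and return the compressed-byte read budget
    # (SHOULD NOT read more than this after the prefix).
    # Hypothetical helper, for illustration only.
    if n > max_ssz_len:
        raise ValueError("SSZ length prefix exceeds expected maximum for type")
    return max_encoded_len(n)
```

A reader would then count compressed bytes against the returned budget while decompressing, and error out on any surplus.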
This should be clarified: if we read more than `n` bytes, we should return an error on the request/response and ignore the data.
@protolambda can you clean this one up too?
Yes, looking into this now.
Snappy has two formats: "block" and "frames" (streaming). To support large requests and response chunks, snappy framing is used.

Since snappy frame contents [have a maximum size of `65536` bytes](https://github.com/google/snappy/blob/master/framing_format.txt#L104) and frame headers are just `identifier (1) + checksum (4)` bytes, the expected buffering of a single frame is acceptable.
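For illustration, a minimal sketch of parsing one framing-format chunk header. The 4-byte header layout (1 type byte + 3-byte little-endian body length) and the 65536-byte frame content cap come from the snappy framing format; the exact budget arithmetic in the guard is an assumption for this sketch:

```python
MAX_FRAME_CONTENT = 65536  # per the snappy framing format

def parse_frame_header(header: bytes):
    # A framing-format chunk header is 1 type byte plus a 3-byte
    # little-endian length of the chunk body (a data chunk body starts
    # with a 4-byte CRC, followed by the payload).
    chunk_type = header[0]
    body_len = int.from_bytes(header[1:4], "little")
    # DOS guard (illustrative): a data chunk body should never exceed the
    # 4-byte checksum plus the worst-case compression of one full frame.
    if body_len > 4 + 32 + MAX_FRAME_CONTENT + MAX_FRAME_CONTENT // 6:
        raise ValueError("oversized snappy frame")
    return chunk_type, body_len
```

This is why buffering stays bounded: each frame can be read, checked, and decompressed independently before moving to the next.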
So what I understand is that we now have two length prefixes encoded into our data? One is for the snappy frames, in the frame header, and the other is for our SSZ objects in their decompressed form?
Possibly; one snappy frame doesn't necessarily match one eth2 RPC chunk. With the status quo a chunk does match one snappy block, but then it also can't stream within a chunk.
I think it's reasonable to use the framed format of snappy over the wire. It may not be super important for us now, given current objects, but it could be more useful for larger objects in the future. In principle, we could have a …
I am pro frames.
Clarified the invalid input handling, and cleaned it up to summarize the different cases better. Can I get a final review?
Looks good to me.
This is a proposal to focus on length-encoding SSZ contents, enable streaming of chunk contents, and put stricter DOS limits in place.

`ssz` without compression is still exactly the same; testnets can live on. So far I am aware that Prysm has non-streaming Snappy compression in place as an option, and that Lighthouse has a PR open that made a start on snappy support but is blocked by questions/ongoing network changes by Adrian. In other clients, such as Lodestar, Artemis and Nimbus, I could find the Snappy dependency, but no implementation of `ssz_snappy` for RPC.

The motivation here is that it is relatively quick to calculate the encoded length of an SSZ value, since fixed-length values are easily computed from the type, and list values often by multiplying a fixed-length element size by a list length.
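As a sketch of why this is quick, here is a hypothetical upper-bound computation over simplified SSZ type info (the function and its inputs are illustrative, not a spec API): fixed-size fields contribute their exact size, while each variable-length field contributes a 4-byte offset plus its maximum payload, e.g. a list limit times a fixed element size.

```python
def max_container_size(fixed_field_sizes, variable_field_maxes):
    # Fixed-size fields are serialized in place; variable-size fields are
    # replaced by a 4-byte offset, with the payload appended afterwards.
    return sum(fixed_field_sizes) + sum(4 + m for m in variable_field_maxes)

# e.g. a container with a uint64, a 32-byte root, and a List[uint64, 1024]:
bound = max_container_size([8, 32], [1024 * 8])
```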
Given an SSZ length, an SSZ decoder can read contents directly from the stream, avoiding the need for a temporary buffer. Additionally, snappy frames avoid the need for more than a single frame of buffered bytes to decompress data (worst case `2**16 = 65536` bytes), so the compressed bytes never need to be fully buffered.

The writer just calculates the SSZ length and can then stream-encode the contents, even when using compression, instead of having to fully buffer the compressed data to get the compressed length.
Additionally, this puts better DOS protection in place: we can calculate the maximum size of an SSZ object, and should use it to be stricter about inputs. And given the worst-case snappy-encoded length, we can account for compressed bytes even if we don't know the exact number of compressed bytes in advance. We don't need to know the exact number, as we can just verify the decompressed bytes instead, while checking the bytes read from the stream against the limits we should be checking regardless.
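One way to enforce such limits is a counting wrapper around the stream that errors out once more compressed bytes are consumed than the worst-case budget allows. A minimal sketch; the class and its interface are assumptions, not client code:

```python
import io

class BoundedReader:
    """Caps total bytes read from an underlying stream, so compressed input
    can be accounted for without knowing its exact size in advance."""
    def __init__(self, stream, limit: int):
        self._stream = stream
        self._remaining = limit

    def read(self, size: int) -> bytes:
        if size > self._remaining:
            raise ValueError("read exceeds DOS budget for this chunk")
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data
```

A decompressor fed through such a wrapper is cut off at the budget regardless of what the compressed input claims about itself.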