-
Notifications
You must be signed in to change notification settings - Fork 997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify gossip limits #4045
Clarify gossip limits #4045
Conversation
In the gossip specification, the `GOSSIP_MAX_SIZE` constant is specified for the uncompressed payload size in the gossipsub message. This PR clarifies how this limit applies to the various fields of the gossipsub message and provides additional limits derived from it that allow clients to more aggressively discard messages. In particular, clients are allowed to impose more strict limits on topics such as attestation and aggregates - an `Attestation` for example takes no more than `~228` bytes (to be verified!), far below the 10mb limit, though implicitly clients should already see these limits imposed as rejections by their SSZ decoder - this clarification mainly highlights the possibilty to perform this check earlier in the process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything here seems reasonable to me 👍
I understand from a purely spec perspective the However for implementers, it might be nice to specify With the current definition, which is defined only on the data field, implementers have to work out the total payload size in order to do this initial filtering check (which arguably is the most important). i.e they will have to take this max data size, add the expected size of the other protobuf field elements and also add the protobuf encoding overhead. I imagine this is highly prone to error and will leave clients implementing different total sizes. I suggest instead, we define the This way, all implementations will have the same total upper bound of message size being transmitted in gossipsub messages. It is easily checked, by the first few bytes of the message being sent (the protobuf varint). |
I kind of agree that it would be nice and simple to go with the total protobuf size, but it has a few problems:
I see two strategies we can take:
|
Yeah. Agreed. Perhaps to save everyone time, and to make sure all clients implement the same thing and we are not prone to mistakes due to wild calculations, we specify a few constants:
Then each implementor doesn't have to make up their own calculations + fudge factors and we don't get weird mismatch of message transmissions when we approach these limits? I don't mind adding some fudge factor/buffer (like the 1kb you suggested) to handle the max sizes, its easy and everyone can implement it. |
I like the idea of having all of these well defined so they're consistent. |
specs/phase0/p2p-interface.md
Outdated
Each gossipsub [message](https://github.com/libp2p/go-libp2p-pubsub/blob/master/pb/rpc.proto#L17-L24) has a maximum size of `GOSSIP_MAX_SIZE`. | ||
Clients MUST reject (fail validation) messages that are over this size limit. | ||
Likewise, clients MUST NOT emit or propagate messages larger than this limit. | ||
The uncompressed payload in the [`data`](https://github.com/libp2p/go-libp2p-pubsub/blob/c06df2f9a38e9382e644b241adf0e96e5ca00955/pb/rpc.proto#L19) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this same as the current scenario for many clients? uncompressed limit 10MB (lodestar for e.g.) and this means for 60m gas uncompressed may have >=10MB with zero call data tx
we need bump to 15MB no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need bump to 15MB no?
the current idea is to work on a better long-term solution instead and rely on the pectra repricing in the shorter term - the 15mb limit has several downsides.
022bb22 adds functions for computing these derived limits as well as the suggested 1kb fudge factor, in anticipation of coming up with a better way of limiting the maximimum block size Something to consider for the future is that the topic takes space from the fudge factor - we must be careful not to create massive topis or we'll exceed this fudge factor. We don't have a common fixed upper limit on topic lengths - instead, clients are expected to reject unknown topics which effectively limits their upper-bound size.
RPC already has strict fixed upper limits on everything - we can be a lot more strict there because we use chunked snappy encoding which limits size of any one "RPC frame" to 64kb + some small constant and there is no "additonal" framing to consider (ie no mixing with control messages and whatnot). We also already had in place the max compressed and max uncompressed sizes based on the same "worst case snappy" encoding - I've updated the spec to use the same functions and constants for both, for clarity. |
Hey @arnetheduck 👋 I think you just need to pull in changes from dev. There should be a button like this near the bottom of the page: I would do it myself, but this PR has the "maintainers allowed to change things" option unchecked. |
would if I could ;) Invited you for write access instead. |
It should be good now 👍 I initially thought it was a caching issue but it turns out that new functions ( |
def max_message_size() -> uint64: | ||
# Allow 1024 bytes for framing and encoding overhead but at least 1MiB in case MAX_PAYLOAD_SIZE is small. | ||
return max(max_compressed_len(MAX_PAYLOAD_SIZE) + 1024, 1024 * 1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this constant is used to limit gossip gRPC message size only, may be it makes sense to give it a more explicit name?
Also maybe we could reserve more space (e.g. 1Mb instead of 1Kb) to accommodate all possible Gossip control message which could potentially be piggybacked to the publish message. That would still maintain a reasonable limit but prevent gossip implementation from corner message size cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also maybe we could reserve more space
I don't see why, really - just send the control message in a separate frame which is trivial - basically, if you want to piggyback control messages, you can add real messages and control messages in a loop until you hit the limit and break off there - this gracefully deals with any kind of packing. You need a bounded strategy like this anyway, and the minimum max size ensures that we don't send too many small messages either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. It would help to avoid edge cases with a single message of max size, but we would hit the same problem when there are e.g. 2 message (or more) which have the cumulative size around max_message_size()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better for me
The Bellatrix spec at |
|
@jtraglia thanks for the help - I think we're gtg here! |
specs/phase0/p2p-interface.md
Outdated
The limit is set on the uncompressed payload size in particular to protect against decompression bombs - although | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an incomplete sentence. Was there something else you wanted to say here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've decided to remove this sentence since the next section dives into this. Also, I don't believe decompression bombs are practical with snappy, as there is a known, relatively low maximum compression ratio.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh, not sure what I wanted to say, but removing -although would be enough I think.
from what I remember decompression bombs are still possible - "max compression" is based on an honest compressor (and used mainly to allocate a compression buffer, in "normal" snappy usage) - at least, that's how I remember the reasoning at the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Added back the remark without "- although" 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arnetheduck, please take a look at my recent changes. If they're fine with you, this PR is ready to be merged.
In the gossip specification, the
GOSSIP_MAX_SIZE
constant is specified for the uncompressed payload size in the gossipsub message.MAX_CHUNK_SIZE
serves the same purpose for req/resp and this PR unifies the two constants underMAX_PAYLOAD_SIZE
to simplify maintenance.This PR clarifies how this limit applies to the various fields of the gossipsub message and provides additional limits derived from it that allow clients to more aggressively discard messages.
In particular, clients are allowed to impose more strict limits on topics such as attestation and aggregates - an
Attestation
for example takes no more than~485
bytes (to be verified!), far below the 10mb limit, though implicitly clients should already see these limits imposed as rejections by their SSZ decoder - this clarification mainly highlights the possibilty to perform this check earlier in the process.