
fix: Lift ~44KB limit for reliable payloads in the adapter #1596


Merged: 17 commits merged into develop from fix/lift-44kb-reliable-limit on Jan 22, 2022

Conversation

simon-lemay-unity (Contributor)

MTT-2176

Currently there is a limit to the size of a payload that can be sent with reliable delivery through the UTP adapter. That limit is roughly 44KB (32 times the MTU of 1400 bytes). This PR removes that limitation.

The change introduced by this PR leverages (or abuses, depending on inclination) the fact that the reliable pipeline of UTP is also sequenced. That is, the reliable pipeline can be made to implement a stream. With proper framing (which BatchedSendQueue already provided), we can just send the stream of payloads to UTP without caring about payload boundaries. This only requires some changes to the send and receive queues to handle partial messages.

In other words, we are fragmenting the payloads in the adapter rather than in UTP. This can only be done easily because reliable delivery also guarantees ordering, which makes reassembly trivial. We couldn't do that easily on unreliable pipelines (which still rely on UTP to perform fragmentation).
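To make the framing idea concrete, here is a minimal, self-contained sketch of the approach (plain C# for illustration only; this is not the adapter's actual BatchedSendQueue/BatchedReceiveQueue code, and all names are made up): each payload is length-prefixed into a single byte stream, the stream is cut into MTU-sized packets without regard for payload boundaries, and the receiver reassembles payloads from the ordered stream, holding on to any partial payload until more data arrives.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class FramingSketch
{
    // Sender side: write [length][bytes] for each payload into one stream,
    // then cut the stream into MTU-sized chunks without caring about payload
    // boundaries. Each chunk would become one reliable packet.
    public static IEnumerable<byte[]> Frame(IEnumerable<byte[]> payloads, int mtu)
    {
        var stream = new MemoryStream();
        var writer = new BinaryWriter(stream);
        foreach (var payload in payloads)
        {
            writer.Write(payload.Length); // 4-byte length prefix (the framing)
            writer.Write(payload);
        }

        var bytes = stream.ToArray();
        for (int offset = 0; offset < bytes.Length; offset += mtu)
        {
            int size = Math.Min(mtu, bytes.Length - offset);
            var chunk = new byte[size];
            Array.Copy(bytes, offset, chunk, 0, size);
            yield return chunk;
        }
    }

    // Receiver side: because reliable delivery is also sequenced, chunks arrive
    // in order; append them to a buffer and pop complete payloads, leaving any
    // partial payload in the buffer until the next chunk arrives.
    public static IEnumerable<byte[]> Reassemble(IEnumerable<byte[]> chunks)
    {
        var buffer = new List<byte>();
        foreach (var chunk in chunks)
        {
            buffer.AddRange(chunk);
            while (buffer.Count >= sizeof(int))
            {
                int length = BitConverter.ToInt32(buffer.ToArray(), 0);
                if (buffer.Count < sizeof(int) + length)
                    break; // partial payload; wait for more data

                yield return buffer.GetRange(sizeof(int), length).ToArray();
                buffer.RemoveRange(0, sizeof(int) + length);
            }
        }
    }
}
```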

Note that while it is now possible to send payloads larger than 44KB with reliable delivery, it is still not recommended to do so. The limit of 32 in-flight reliable packets still applies. Sending more than that means that we'll have to wait for some acknowledgements before sending the rest of the payload. (It's the same thing as sending more than the window size on a TCP connection.)
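As a back-of-the-envelope illustration of that note (ignoring per-packet headers and other protocol overhead, so the numbers are only approximate), the 32-packet window and the ~1400-byte MTU bound how much data can be in flight at once; anything beyond that has to wait for acknowledgements:

```csharp
static class ReliableWindowMath
{
    const int ReliableWindowSize = 32; // maximum in-flight reliable packets
    const int Mtu = 1400;              // approximate bytes carried per packet

    // Roughly how many full send windows (and thus round-trips) a payload needs.
    public static int ApproximateRoundTrips(int payloadSize)
    {
        int bytesPerWindow = ReliableWindowSize * Mtu; // ~44KB in flight at once
        return (payloadSize + bytesPerWindow - 1) / bytesPerWindow;
    }
}

// For example, a 200KB reliable payload needs about
// ReliableWindowMath.ApproximateRoundTrips(200 * 1024) == 5 send windows.
```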

Changelog

com.unity.netcode.adapter.utp

  • Fixed: Lifted the limit of ~44KB for reliable payloads. Previously, attempting to send a payload larger than that with reliable delivery would silently fail. Note that it is still not recommended to send such large reliable payloads, since their delivery could take a few network round-trips.

Testing and Documentation

  • Includes unit tests.
  • Includes integration tests.
  • No documentation changes or additions were necessary.

@@ -12,6 +12,8 @@ All notable changes to this package will be documented in this file. The format

### Fixed

- Lifted the limit of ~44KB for reliable payloads. Attempting to send a payload larger than that with reliable delivery would silently fail. Note that it is still not recommended to send such large reliable payloads, since their delivery could take a few network round-trips. (#1596)

jeffreyrainy (Contributor) commented Jan 20, 2022

As written, it is not obvious whether

> Attempting to send a payload larger than that with reliable delivery would silently fail.

describes the state before or after the fix (at least to a non-native speaker, especially because of "would").

simon-lemay-unity (Contributor, Author)

Clarified the entry. Please let me know if it's still not obvious.

m_Client1.Send(m_Client1.ServerClientId, payload, delivery);

yield return WaitForNetworkEvent(NetworkEvent.Data, m_ServerEvents);

Assert.AreEqual(payloadSize, m_ServerEvents[1].Data.Count);
Contributor

This checks that a payload of the correct size is delivered. However, what if there were a bug where "1111....12222....23333....3" was delivered as "1111....11111....11111....1" (repeated fragment blocks)?
The test would miss it. I'm not saying there's an issue, but I feel the test would be more solid if it filled the payload with data and checked the content on reception.

simon-lemay-unity (Contributor, Author)

Good point. I modified the test to check the payload's contents.
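For illustration, such a content check could look roughly like the following (a sketch only: the names m_Client1, m_ServerEvents, payloadSize, and delivery are taken from the snippet above, and the actual test added in the PR may differ):

```csharp
// Fill the payload with a deterministic pattern instead of arbitrary bytes.
var data = new byte[payloadSize];
for (int i = 0; i < payloadSize; i++)
    data[i] = (byte)i; // wraps every 256 bytes, so repeated or shifted fragments won't match

m_Client1.Send(m_Client1.ServerClientId, new ArraySegment<byte>(data), delivery);

yield return WaitForNetworkEvent(NetworkEvent.Data, m_ServerEvents);

var received = m_ServerEvents[1].Data;
Assert.AreEqual(payloadSize, received.Count);

// Check the contents, not just the size; a repeated fragment block would fail here.
for (int i = 0; i < payloadSize; i++)
    Assert.AreEqual((byte)i, received.Array[received.Offset + i]);
```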

ShadauxCat (Collaborator)

I feel like UTP has made a number of design decisions to optimize bandwidth and performance, and we're just undoing those optimizations one by one and ending up using more bandwidth and cpu since these things need to be tracked by UTP within its limitations and also separately by us outside of those limitations... Maybe that's fine, prioritizing usability over performance for our customers, but it makes me wonder if we could make this (and the adapter-side queueing when UTP had 32 messages in flight) configurable, so that users who don't need these huge message sizes don't have to incur the performance and bandwidth costs that come with them.

var data = new NativeArray<byte>(dataLength, Allocator.Temp);

var writer = new DataStreamWriter(data);
writer.WriteInt(42);
Collaborator

What is the meaning of this magic number? A comment would be useful... a named const would be better.

Collaborator

oh, nevermind, just realized this is a test... no big deal then.

simon-lemay-unity (Contributor, Author)

> I feel like UTP has made a number of design decisions to optimize bandwidth and performance, and we're just undoing those optimizations one by one and ending up using more bandwidth and cpu since these things need to be tracked by UTP within its limitations and also separately by us outside of those limitations... Maybe that's fine, prioritizing usability over performance for our customers, but it makes me wonder if we could make this (and the adapter-side queueing when UTP had 32 messages in flight) configurable, so that users who don't need these huge message sizes don't have to incur the performance and bandwidth costs that come with them.

I agree. There definitely is a mismatch between how UTP is intended to be used, and how it's used by the adapter. Mostly this is because the traffic "patterns" generated by the SDK don't match what UTP was written to handle well.

For example, sending very large reliable packets or sending a lot of small ones are two scenarios that don't mesh well with UTP's design, but this is exactly what we're asking of the reliability pipeline in the adapter. It was not designed to handle all of a game's synchronization needs (DOTS NetCode only uses it for RPCs, if I recall, and those don't need to be fragmented either). Honestly, the SDK would currently almost be better served by a TCP connection.

I understand a lot of the traffic "patterns" that cause issues with UTP are concessions made to meet a release date. I also understand that this should get better with time (e.g. snapshots will be sent unreliably). But it does mean that most of our work on the adapter lately has been spent getting UTP to handle the traffic from the SDK nicely, at the cost of not leveraging what UTP is good at. We do try to make things configurable (e.g. both the send queue size and the maximum payload size are configurable), but often users don't have any choice but to raise these limits. For example, the maximum payload size is currently pretty much directly tied to the number of objects being synchronized, so if they want a game with a bigger scope, they're pretty much required to increase the value.

ShadauxCat (Collaborator)

Right, but if I'm making a small four-player co-op game with only a dozen or two objects spawned, I'm not gonna run into that issue... but as currently written, I still have to accept all the costs of the code that works around those limitations. It would be really nice, thinking as a user, to be able to say "I'm building my game in a way that is compatible with UTP's design, please turn off this overhead so I have less bandwidth usage and more frame time for my game logic"

simon-lemay-unity (Contributor, Author) commented Jan 20, 2022

> Right, but if I'm making a small four-player co-op game with only a dozen or two objects spawned, I'm not gonna run into that issue... but as currently written, I still have to accept all the costs of the code that works around those limitations. It would be really nice, thinking as a user, to be able to say "I'm building my game in a way that is compatible with UTP's design, please turn off this overhead so I have less bandwidth usage and more frame time for my game logic"

Alright, I understand your point. Thanks for the clarification.

I'd argue that the adapter already handles such cases relatively well.

In terms of bandwidth, the adapter doesn't add much overhead on top of what the SDK sends; what we receive from the SDK is pretty much what makes it to the wire. (In fact, this PR probably helps in that regard, as it makes better use of the space available in each packet.) The costs are more on the latency side, but if a game generates little enough traffic that it never has more than 32 reliable packets in flight, then the send queue shouldn't introduce any latency.

Well, that's not technically true. The send queue does introduce an extra memory copy, since payloads need to be written there before they're written to UTP's internal send queue (which is what the BeginSend and EndSend calls do). But the alternative (writing directly to UTP's internal queue in Send) would prevent jobifying the send process, where I believe there are more gains to be made than what the memory copy costs.
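For context, here is a heavily simplified sketch of the two copies being discussed (illustrative only; the real adapter batches and frames messages in BatchedSendQueue and flushes from a job, and the names below are made up): Send() only appends to an adapter-side queue, and a later flush moves those bytes into UTP's internal send buffer through BeginSend/EndSend.

```csharp
using System.Collections.Generic;
using Unity.Networking.Transport;

class SendPathSketch
{
    // Illustrative stand-in for the adapter's send queue (copy #1 lands here).
    readonly Queue<byte[]> m_SendQueue = new Queue<byte[]>();

    public void Send(byte[] payload)
    {
        // Copy #1: into the adapter-side queue, so flushing can later be done
        // from a job rather than on the calling thread.
        m_SendQueue.Enqueue((byte[])payload.Clone());
    }

    public void FlushSendQueue(NetworkDriver driver, NetworkConnection connection, NetworkPipeline pipeline)
    {
        while (m_SendQueue.Count > 0)
        {
            // Copy #2: into UTP's internal send buffer. (Real code would also
            // respect the writer's capacity and frame the messages.)
            if (driver.BeginSend(pipeline, connection, out var writer) != 0)
                return; // no room right now; try again on the next update

            foreach (var b in m_SendQueue.Dequeue())
                writer.WriteByte(b);

            driver.EndSend(writer);
        }
    }
}
```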

0xFA11 enabled auto-merge (squash) on January 22, 2022, 17:39
0xFA11 merged commit c7912cd into develop on January 22, 2022
0xFA11 deleted the fix/lift-44kb-reliable-limit branch on January 22, 2022, 17:58