Correct multi-FEC block size calculations#1466
cgutman wants to merge 1 commit into LizardByte:nightly from
Conversation
We were too conservative in determining our max data size before needing to split, which resulted in many frames being split into multiple FEC blocks unnecessarily. We also used a hardcoded split into 3 blocks instead of calculating how many blocks are actually required.
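For illustration, here is a minimal sketch of the kind of calculation described above, assuming the usual Reed-Solomon limit of 255 total (data + parity) shards per block and the protocol's limit of 4 FEC blocks per frame; the names and constants are hypothetical, not Sunshine's actual identifiers:

```cpp
#include <cstdio>

// Illustrative constants; the real limits live in Sunshine's video code.
constexpr int RS_TOTAL_SHARDS_MAX = 255; // Reed-Solomon limit on data + parity shards per block
constexpr int MAX_FEC_BLOCKS      = 4;   // protocol limit on FEC blocks per frame

// Largest data shard count d such that d + ceil(d * fec_pct / 100) <= 255.
int max_data_shards_per_block(int fec_pct) {
  return RS_TOTAL_SHARDS_MAX * 100 / (100 + fec_pct); // 212 at 20% FEC
}

// Number of FEC blocks a frame actually needs, instead of a hardcoded split into 3.
// A result above MAX_FEC_BLOCKS means this FEC percentage can't cover the frame.
int fec_blocks_needed(int data_shards, int fec_pct) {
  int per_block = max_data_shards_per_block(fec_pct);
  return (data_shards + per_block - 1) / per_block;
}

int main() {
  // Example from the PR description: 91 data shards at 20% FEC fit in a single
  // block (91 <= 212), rather than being split into 3 blocks.
  std::printf("blocks needed: %d\n", fec_blocks_needed(91, 20));
  return 0;
}
```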
Heh, so now that the FEC block splitting logic is correct, it's exposing scalability issues elsewhere. Rather than sending 3 very loosely packed FEC blocks as we did before, we're now sending nearly full blocks of ~255 packets each, and we send each FEC block as a single batch. So I'll have to finally write that batch throttling logic that I was putting off. For now, I can have it mimic the batching of the current code and we can optimize it using the FEC feedback messages from Moonlight later on.
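Purely as an illustration of what such throttling could look like (not a commitment to a specific design), here is a minimal sketch that breaks a nearly full FEC block into smaller send batches with a short gap between them; `Packet`, `send_batch()`, and the pacing parameters are hypothetical placeholders rather than Sunshine's actual code:

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

struct Packet { /* shard payload + headers */ };

// Placeholder standing in for the real batched send (WSASendMsg()/sendmsg()).
void send_batch(const Packet *packets, std::size_t count) {
  (void) packets;
  (void) count;
}

// Instead of pushing a nearly full ~255-packet FEC block onto the wire as one
// burst, send it in smaller sub-batches with a short gap between them, roughly
// mimicking the burst sizes the old 3-way split produced.
void send_block_throttled(const std::vector<Packet> &block,
                          std::size_t batch_size,
                          std::chrono::microseconds gap) {
  for (std::size_t off = 0; off < block.size(); off += batch_size) {
    std::size_t count = std::min(batch_size, block.size() - off);
    send_batch(block.data() + off, count);
    if (off + count < block.size()) {
      std::this_thread::sleep_for(gap); // crude pacing; could later be driven by Moonlight's FEC feedback
    }
  }
}
```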
It looks like this PR has been idle for 90 days. If it's still something you're working on or would like to pursue, please leave a comment or update your branch. Otherwise, we'll be closing this PR in 10 days to reduce our backlog. Thanks!
This PR was closed because it has been stalled for 10 days with no activity. |
Description
The old multi-FEC block calculations were way too conservative in determining our max data size before needing to split, which resulted in many frames being split into multiple FEC blocks unnecessarily. For example, a frame with 91 data shards would be split into 3 FEC blocks even though we could easily handle up to 212 data shards per block at the default FEC percentage of 20.
This issue is compounded by the fact that we always used a fixed split of 3 FEC blocks instead of calculating how many FEC blocks are actually required. Each FEC block requires separate calls to `reed_solomon_new()`, `reed_solomon_encode()`, and `WSASendMsg()`/`sendmsg()`, which is costly in CPU time. Additionally, since the protocol supports a maximum of 4 FEC blocks per frame, our hardcoded split prevented us from sending frames as large as the protocol allows.

By fixing these calculations, this PR reduces CPU usage on the host, improves network utilization, and also enables up to 848 packets per frame at 20% FEC and 4096 packets per frame with 0% FEC (dynamically selected if the data shard count is too high to send in 4 FEC blocks), up from 636 and 3172 respectively.
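For readers who want to trace the figures above, here is a rough sketch of the arithmetic, assuming the usual Reed-Solomon limit of 255 total (data + parity) shards per block; the constants and names are illustrative, and the 4096-packet ceiling at 0% FEC is simply the limit quoted in this description:

```cpp
#include <cstdio>

constexpr int RS_TOTAL_SHARDS_MAX = 255;  // Reed-Solomon data + parity shards per block
constexpr int MAX_FEC_BLOCKS      = 4;    // protocol limit on FEC blocks per frame
constexpr int MAX_PACKETS_NO_FEC  = 4096; // per-frame packet limit with FEC disabled (per this PR)

int main() {
  int fec_pct = 20;

  // Largest data shard count per block such that data + ceil(data * fec%) <= 255.
  int per_block = RS_TOTAL_SHARDS_MAX * 100 / (100 + fec_pct); // 212
  int parity    = (per_block * fec_pct + 99) / 100;            // 43

  std::printf("%d data + %d parity = %d shards per block\n", per_block, parity, per_block + parity);
  std::printf("max packets per frame at %d%% FEC: %d\n", fec_pct, per_block * MAX_FEC_BLOCKS); // 848

  // If a frame carries more data shards than 4 blocks can hold at the configured
  // FEC percentage, fall back to 0% FEC, raising the per-frame ceiling to 4096 packets.
  int data_shards = 1000;
  bool disable_fec = data_shards > per_block * MAX_FEC_BLOCKS;
  std::printf("fall back to 0%% FEC: %s\n", disable_fec ? "yes" : "no");
  return 0;
}
```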
Screenshot
Issues Fixed or Closed
Type of Change
Checklist
Branch Updates
LizardByte requires that branches be up to date before merging. This means that after any PR is merged, this branch must be updated before it can be merged. You must also enable "Allow edits from maintainers."