@WaterWhisperer
Contributor

Related issue: #2843

Copilot AI review requested due to automatic review settings November 10, 2025 08:48
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements HTTP Datagram support for CONNECT-UDP sessions, removing the TODO placeholder. The implementation adds capsule encoding/decoding functionality and integrates it with the CONNECT-UDP frame handling.

  • Implements Capsule type for encoding/decoding HTTP datagrams
  • Updates ConnectUdpFrame to handle datagram frames with payload
  • Integrates datagram handling in CONNECT-UDP session processing
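For orientation, the capsule wire format that the overview refers to can be sketched as follows. This is a minimal, standalone illustration based on RFC 9297 (capsule type, capsule length, capsule value, with both integers encoded as QUIC variable-length integers); it is not the code from this PR. The function names are hypothetical, and the payload is treated as opaque bytes (for CONNECT-UDP it would begin with a context ID).

```rust
/// Capsule type of the DATAGRAM capsule (RFC 9297).
const CAPSULE_TYPE_DATAGRAM: u64 = 0x00;

/// Append `v` as a QUIC variable-length integer (RFC 9000, section 16).
fn encode_varint(v: u64, out: &mut Vec<u8>) {
    if v < (1 << 6) {
        out.push(v as u8);
    } else if v < (1 << 14) {
        out.extend_from_slice(&((v as u16) | 0x4000).to_be_bytes());
    } else if v < (1 << 30) {
        out.extend_from_slice(&((v as u32) | 0x8000_0000).to_be_bytes());
    } else {
        assert!(v < (1 << 62), "value does not fit in a varint");
        out.extend_from_slice(&(v | 0xC000_0000_0000_0000).to_be_bytes());
    }
}

/// Decode a varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if `buf` is too short.
fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let first = *buf.first()?;
    // The two most significant bits select a length of 1, 2, 4 or 8 bytes.
    let len = 1usize << (first >> 6);
    if buf.len() < len {
        return None;
    }
    let mut v = u64::from(first & 0x3f);
    for b in &buf[1..len] {
        v = (v << 8) | u64::from(*b);
    }
    Some((v, len))
}

/// Encode `payload` as a DATAGRAM capsule: type, length, value.
/// (Hypothetical helper name, not taken from the PR.)
fn encode_datagram_capsule(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 9);
    encode_varint(CAPSULE_TYPE_DATAGRAM, &mut out);
    encode_varint(payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    out
}

/// Try to decode one DATAGRAM capsule from the front of `buf`.
/// Returns the payload and the total number of bytes consumed.
/// Returns `None` if the capsule is incomplete or has a different type.
fn decode_datagram_capsule(buf: &[u8]) -> Option<(&[u8], usize)> {
    let (ty, n1) = decode_varint(buf)?;
    let (len, n2) = decode_varint(&buf[n1..])?;
    if ty != CAPSULE_TYPE_DATAGRAM {
        return None;
    }
    let start = n1 + n2;
    let end = start.checked_add(usize::try_from(len).ok()?)?;
    let payload = buf.get(start..end)?;
    Some((payload, end))
}
```

A real decoder would also need to skip unknown capsule types rather than reject them, and to distinguish "incomplete" from "other type"; the sketch folds both into `None` for brevity.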

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File | Description
neqo-http3/src/frames/mod.rs | Adds the new capsule module to the frames module
neqo-http3/src/frames/capsule.rs | New file implementing Capsule enum with datagram encoding/decoding and comprehensive tests
neqo-http3/src/frames/connect_udp_frame.rs | Updates Frame enum and decoder to handle datagram frames using capsules
neqo-http3/src/features/extended_connect/connect_udp_session.rs | Replaces TODO with actual datagram handling logic to emit datagram events

@larseggert larseggert linked an issue Nov 10, 2025 that may be closed by this pull request
@WaterWhisperer WaterWhisperer force-pushed the capsule branch 2 times, most recently from af6f3d7 to 681a9b3 Compare November 10, 2025 09:35
@codecov

codecov bot commented Nov 10, 2025

Codecov Report

❌ Patch coverage is 95.58824% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.45%. Comparing base (83e264c) to head (d3c0cab).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3122      +/-   ##
==========================================
- Coverage   93.47%   93.45%   -0.02%     
==========================================
  Files         125      125              
  Lines       36663    36813     +150     
  Branches    36663    36813     +150     
==========================================
+ Hits        34270    34405     +135     
- Misses       1543     1558      +15     
  Partials      850      850              
Components Coverage Δ
neqo-common 97.43% <ø> (ø)
neqo-crypto 83.19% <ø> (-0.49%) ⬇️
neqo-http3 93.30% <95.58%> (+0.03%) ⬆️
neqo-qpack 94.40% <ø> (ø)
neqo-transport 94.56% <ø> (+<0.01%) ⬆️
neqo-udp 78.94% <ø> (ø)
mtu 85.44% <ø> (ø)

@github-actions
Contributor

github-actions bot commented Nov 10, 2025

🐰 Bencher Report

Branch: capsule
Testbed: On-prem

Benchmark | Latency result, ms (Result Δ%) | Baseline, ms | Upper boundary, ms (Limit %)
google-neqo-cubic | 273.25 (-1.49%) | 277.39 | 286.78 (95.28%)
msquic-neqo-cubic | 185.95 (-12.30%) | 212.03 | 245.28 (75.81%)
neqo-google-cubic | 744.24 (-2.60%) | 764.10 | 789.95 (94.21%)
neqo-msquic-cubic | 160.15 (+1.33%) | 158.05 | 161.65 (99.07%)
neqo-neqo-cubic-nopacing | 95.95 (-0.39%) | 96.33 | 98.24 (97.68%)
neqo-neqo-cubic | 96.85 (-0.52%) | 97.36 | 99.74 (97.11%)
neqo-neqo-reno-nopacing | 95.99 (-0.51%) | 96.48 | 99.18 (96.78%)
neqo-neqo-reno | 96.25 (-1.21%) | 97.43 | 99.69 (96.55%)
neqo-quiche-cubic | 190.86 (-0.80%) | 192.40 | 195.22 (97.77%)
neqo-s2n-cubic | 219.42 (-0.57%) | 220.67 | 223.92 (97.99%)
quiche-neqo-cubic | 154.36 (+0.33%) | 153.85 | 157.08 (98.27%)
s2n-neqo-cubic | 174.06 (+0.11%) | 173.86 | 177.11 (98.28%)
🐰 View full continuous benchmarking report in Bencher
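The two percentage columns in the report (Result Δ% and Limit %) appear to follow from the raw numbers as shown below. The formulas are inferred from the figures in this report rather than taken from Bencher's documentation, so treat them as an assumption; the function names are made up for illustration.

```rust
/// Relative change of `result` against `baseline`, in percent
/// (assumed meaning of the "Result Δ%" column).
fn delta_percent(result: f64, baseline: f64) -> f64 {
    (result - baseline) / baseline * 100.0
}

/// Share of the upper boundary that `result` uses up, in percent
/// (assumed meaning of the "Limit %" column).
fn limit_percent(result: f64, upper_boundary: f64) -> f64 {
    result / upper_boundary * 100.0
}
```

For google-neqo-cubic, for example, a result of 273.25 ms against a baseline of 277.39 ms and an upper boundary of 286.78 ms reproduces the reported -1.49% and 95.28% after rounding.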

Member

@mxinden mxinden left a comment

Well done!

I believe we now also need to send the capsule-protocol HTTP header on the EXTENDED_CONNECT stream.

https://www.rfc-editor.org/rfc/rfc9297#section-3.4

As far as I understand, this allows a user to send a capsule to us.
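As a rough illustration of the header in question: RFC 9297 section 3.4 defines Capsule-Protocol as a Structured Fields boolean, so the value signalling capsule use is `?1`. The sketch below models headers as plain name/value string pairs, which is a simplification and not neqo's header type; both function names are hypothetical.

```rust
/// Header name and value from RFC 9297, section 3.4.
/// `?1` is the Structured Fields encoding of boolean true.
const CAPSULE_PROTOCOL: (&str, &str) = ("capsule-protocol", "?1");

/// Add the header unless it is already present.
/// (Headers as string pairs are an assumption made for this sketch.)
fn add_capsule_protocol(headers: &mut Vec<(String, String)>) {
    let present = headers
        .iter()
        .any(|(name, _)| name.eq_ignore_ascii_case(CAPSULE_PROTOCOL.0));
    if !present {
        headers.push((
            CAPSULE_PROTOCOL.0.to_owned(),
            CAPSULE_PROTOCOL.1.to_owned(),
        ));
    }
}

/// True when the peer signalled that it uses the capsule protocol.
/// Structured Fields parameters (e.g. `?1;foo`) are ignored by this sketch.
fn peer_uses_capsules(headers: &[(String, String)]) -> bool {
    headers.iter().any(|(name, value)| {
        name.eq_ignore_ascii_case(CAPSULE_PROTOCOL.0) && value.trim() == CAPSULE_PROTOCOL.1
    })
}
```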

In addition, could you add an end-to-end test showcasing how a neqo-client and a neqo-server (neither supporting QUIC datagrams) will exchange (proxy) UDP datagrams using connect-udp and HTTP DATAGRAM capsules? session_lifecycle in the file below should be a good example.

#[test]
fn session_lifecycle_client_closes() {
    session_lifecycle(true);
}

@github-actions
Contributor

github-actions bot commented Nov 10, 2025

🐰 Bencher Report

Branch: capsule
Testbed: On-prem

Benchmark | Latency result, ns (Result Δ%) | Baseline, ns | Upper boundary, ns (Limit %)
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 210,260,000.00 (+1.14%) | 207,894,821.43 | 217,129,540.03 (96.84%)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 203,810,000.00 (+0.82%) | 202,152,250.00 | 211,933,263.10 (96.17%)
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 38,746,000.00 (+12.79%) | 34,351,228.57 | 46,229,600.62 (83.81%)
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 280,360,000.00 (-2.90%) | 288,724,839.29 | 301,919,557.58 (92.86%)
1-streams/each-1000-bytes/simulated-time | 119,060,000.00 (+0.20%) | 118,827,928.57 | 120,495,688.27 (98.81%)
1-streams/each-1000-bytes/wallclock-time | 586,330.00 (-0.58%) | 589,731.62 | 610,112.88 (96.10%)
1000-streams/each-1-bytes/simulated-time | 2,333,500,000.00 (-74.66%) | 9,207,083,571.43 | 23,924,335,021.97 (9.75%)
1000-streams/each-1-bytes/wallclock-time | 12,495,000.00 (-6.18%) | 13,318,412.50 | 15,156,408.72 (82.44%)
1000-streams/each-1000-bytes/simulated-time | 16,143,000,000.00 (-9.18%) | 17,774,350,000.00 | 20,812,994,933.37 (77.56%)
1000-streams/each-1000-bytes/wallclock-time | 50,677,000.00 (+0.33%) | 50,511,680.36 | 55,617,933.93 (91.12%)
RxStreamOrderer::inbound_frame() | 108,660,000.00 (-0.96%) | 109,712,821.43 | 111,397,083.57 (97.54%)
coalesce_acked_from_zero 1+1 entries | 89.53 (+0.44%) | 89.13 | 90.45 (98.98%)
coalesce_acked_from_zero 10+1 entries | 105.47 (-0.53%) | 106.03 | 107.10 (98.48%)
coalesce_acked_from_zero 1000+1 entries | 90.34 (-0.66%) | 90.93 | 95.32 (94.77%)
coalesce_acked_from_zero 3+1 entries | 106.17 (-0.34%) | 106.53 | 107.52 (98.74%)
decode 1048576 bytes, mask 3f | 1,417,800.00 (-19.25%) | 1,755,856.96 | 2,544,637.71 (55.72%)
decode 1048576 bytes, mask 7f | 1,477,400.00 (-69.18%) | 4,794,032.86 | 6,760,085.24 (21.85%)
decode 1048576 bytes, mask ff | 1,162,700.00 (-59.77%) | 2,890,205.71 | 3,867,024.67 (30.07%)
decode 4096 bytes, mask 3f | 5,554.00 (-23.54%) | 7,264.10 | 11,085.54 (50.10%)
decode 4096 bytes, mask 7f | 5,798.10 (-69.02%) | 18,713.87 | 26,392.78 (21.97%)
decode 4096 bytes, mask ff | 4,514.60 (-58.56%) | 10,893.13 | 14,533.02 (31.06%)
sent::Packets::take_ranges | 4,579.30 (-2.16%) | 4,680.58 | 4,921.96 (93.04%)
transfer/pacing-false/same-seed/simulated-time/run | 23,941,000,000.00 (-5.08%) | 25,221,815,412.19 | 26,310,738,967.69 (90.99%)
transfer/pacing-false/same-seed/wallclock-time/run | 23,853,000.00 (-5.57%) | 25,260,467.74 | 27,112,489.43 (87.98%)
transfer/pacing-false/varying-seeds/simulated-time/run | 23,941,000,000.00 (-4.49%) | 25,065,437,275.99 | 25,907,874,995.80 (92.41%)
transfer/pacing-false/varying-seeds/wallclock-time/run | 23,786,000.00 (-6.32%) | 25,390,232.97 | 27,383,061.00 (86.86%)
transfer/pacing-true/same-seed/simulated-time/run | 23,676,000,000.00 (-6.28%) | 25,261,526,881.72 | 26,571,289,076.49 (89.10%)
transfer/pacing-true/same-seed/wallclock-time/run | 24,226,000.00 (-8.20%) | 26,388,578.85 | 28,923,310.30 (83.76%)
transfer/pacing-true/varying-seeds/simulated-time/run | 23,676,000,000.00 (-4.81%) | 24,871,154,121.86 | 25,767,106,382.16 (91.88%)
transfer/pacing-true/varying-seeds/wallclock-time/run | 23,970,000.00 (-7.40%) | 25,884,982.08 | 27,981,297.01 (85.66%)
🐰 View full continuous benchmarking report in Bencher

@mxinden
Member

mxinden commented Nov 13, 2025

@WaterWhisperer let me know if you need any help with the above. I saw you resolved all comments but didn't make any changes yet.

@WaterWhisperer
Contributor Author

WaterWhisperer commented Nov 13, 2025

@WaterWhisperer let me know if you need any help with the above. I saw you resolved all comments but didn't make any changes yet.

@mxinden Thanks for your attention! Yes, I've resolved all the comments. However, when I was adding "an end-to-end test showing how a neqo-client and a neqo-server (neither supporting QUIC datagrams) will exchange (proxy) UDP datagrams using connect-udp and HTTP DATAGRAM capsules", I found that the test wasn't working. I then realized that my implementation is incomplete.

Currently, my changes support:

  • Decoding HTTP DATAGRAM Capsules from the stream (receiving side works)
  • The capsule-protocol header in responses

However, the sending side still only uses QUIC datagrams. I haven't implemented sending via HTTP DATAGRAM Capsules on the stream yet. I'm working on this so that the PR fully meets the RFC 9297 requirements.

I'll update the PR once I have a working end-to-end test. Thanks again for your follow-up!

@mxinden
Member

mxinden commented Nov 13, 2025

Great work. I appreciate the help!

@WaterWhisperer
Contributor Author

I've run into a problem: I'm trying to send a Capsule using SendMessage::send_data(), but this method automatically wraps it in a DATA frame.
The resulting data on the wire is [DATA frame][Capsule], so the receiving end's FrameReader parses the DATA frame and treats the contents of the Capsule as the frame payload.
My solution was to bypass the HTTP/3 framing layer and read/write raw bytes directly on the stream, handling Capsule encoding/decoding manually. However, this left connect_udp_frame flagged as dead_code, which feels wrong to me, but I can't think of a better way.

@larseggert larseggert requested a review from mxinden November 26, 2025 13:26
@larseggert
Collaborator

@mxinden is busy at the moment, so please don't take his silence as us having lost interest here. Will take us a few more days to respond. Thanks again for your help!

@mxinden
Member

mxinden commented Nov 26, 2025

@WaterWhisperer sorry for the delay here. Good catch. I didn't think of this before.

Instead of using SendMessage::send_data, how about using <SendMessage as SendStream>::send_data_atomic?

fn send_data_atomic(&mut self, conn: &mut Connection, buf: &[u8], now: Instant) -> Res<()> {
    let data_frame = HFrame::Data {
        len: buf.len() as u64,
    };
    self.stream.encode_with(|e| data_frame.encode(e));
    self.stream.buffer(buf);
    _ = self.stream.send_buffer(conn, now)?;
    Ok(())
}

Our WebTransport implementation has the same use-case, namely to send its custom-framed control messages. Here is how WebTransport sends a closing frame:

if let Some(close_frame) = self.protocol.close_frame(error, message) {
    self.control_stream_send
        .send_data_atomic(conn, close_frame.as_ref(), now)?;
}

Does that help @WaterWhisperer?
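The layering discussed here can be sketched in isolation. The send_data_atomic snippet above writes a DATA frame header followed by the buffer, and as I read RFC 9297 section 3.1, the "data stream" that carries capsules in HTTP/3 is exactly the bytes carried inside DATA frames. Under that assumption, a receiver first strips the DATA framing and then parses capsules from the concatenated payloads. The helper names below are hypothetical and none of this is neqo's API.

```rust
/// HTTP/3 DATA frame type (RFC 9114).
const H3_FRAME_TYPE_DATA: u64 = 0x00;

/// Append `v` as a QUIC variable-length integer (RFC 9000, section 16).
fn put_varint(v: u64, out: &mut Vec<u8>) {
    if v < (1 << 6) {
        out.push(v as u8);
    } else if v < (1 << 14) {
        out.extend_from_slice(&((v as u16) | 0x4000).to_be_bytes());
    } else if v < (1 << 30) {
        out.extend_from_slice(&((v as u32) | 0x8000_0000).to_be_bytes());
    } else {
        assert!(v < (1 << 62), "value does not fit in a varint");
        out.extend_from_slice(&(v | 0xC000_0000_0000_0000).to_be_bytes());
    }
}

/// Decode a varint from the front of `buf`: (value, bytes consumed).
fn get_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let first = *buf.first()?;
    let len = 1usize << (first >> 6);
    if buf.len() < len {
        return None;
    }
    let mut v = u64::from(first & 0x3f);
    for b in &buf[1..len] {
        v = (v << 8) | u64::from(*b);
    }
    Some((v, len))
}

/// Wrap `body` in a single DATA frame: frame type, length, payload.
fn wrap_in_data_frame(body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(body.len() + 9);
    put_varint(H3_FRAME_TYPE_DATA, &mut out);
    put_varint(body.len() as u64, &mut out);
    out.extend_from_slice(body);
    out
}

/// Strip DATA frame headers from `stream` and concatenate the payloads.
/// The result is the byte sequence a capsule parser would consume.
/// Returns `None` on a truncated frame or a non-DATA frame.
fn data_stream(mut stream: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    while !stream.is_empty() {
        let (ty, n1) = get_varint(stream)?;
        let (len, n2) = get_varint(&stream[n1..])?;
        if ty != H3_FRAME_TYPE_DATA {
            return None;
        }
        let start = n1 + n2;
        let end = start.checked_add(usize::try_from(len).ok()?)?;
        out.extend_from_slice(stream.get(start..end)?);
        stream = &stream[end..];
    }
    Some(out)
}
```

Note that a capsule may be split across several DATA frames, which is why the sketch concatenates payloads before any capsule parsing takes place.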

@WaterWhisperer
Contributor Author

@mxinden is busy at the moment, so please don't take his silence as us having lost interest here. Will take us a few more days to respond. Thanks again for your help!

@larseggert Thank you so much for your reassurance! I completely understand that everyone is busy with work and life (Just like I am busy with final exams :) ), and I really appreciate the time you and the team dedicate to maintaining this project. I've learned a lot from contributing to neqo and interacting with you all. Please take your time, and no worries at all about the delay.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


@codspeed-hq

codspeed-hq bot commented Nov 27, 2025

CodSpeed Performance Report

Merging #3122 will degrade performance by 10%

Comparing WaterWhisperer:capsule (50cf12d) with main (617e959)

Summary

⚡ 1 improvement
❌ 1 regression
✅ 21 untouched

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Mode | Benchmark | BASE | HEAD | Change
Simulation | client | 691.2 ms | 768 ms | -10%
Simulation | wallclock-time | 34.3 ms | 32.9 ms | +4.12%

qtrace!("[{self}] sent datagram via QUIC datagram");
Ok(())
}
Err(e) => {
Member

I don't think it is safe to fall back to capsules on all errors. I think we should only send capsules if the remote did not negotiate the QUIC datagram extension or if the remote set the max datagram size to 0.

https://www.rfc-editor.org/rfc/rfc9221.html#section-3

Note also this paragraph from the connect-udp RFC (RFC 9298):

If a UDP proxy is using QUIC DATAGRAM frames and it receives a UDP payload from the target that will not fit inside a QUIC DATAGRAM frame, the UDP proxy SHOULD NOT send the UDP payload in a DATAGRAM capsule, as that defeats the end-to-end unreliability characteristic that methods such as Datagram Packetization Layer PMTU Discovery (DPLPMTUD) depend on [DPLPMTUD].

https://www.rfc-editor.org/rfc/rfc9298.html#section-6.1

You might have to expose the remote_datagram_size function to neqo-http3 in order to check here whether it is 0 or not.

pub const fn remote_datagram_size(&self) -> u64 {
    self.remote_datagram_size
}
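The policy described in this review comment could be sketched as a small decision function. This is an illustration only: `Path` and `choose_path` are hypothetical names, the sketch assumes that remote_datagram_size returns 0 when the peer did not negotiate QUIC datagrams, and `frame_len` stands for the size of the QUIC DATAGRAM frame that would be needed for the payload. A real implementation would also have to account for the current path MTU.

```rust
/// How to forward one UDP payload (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
enum Path {
    /// Send in a QUIC DATAGRAM frame (RFC 9221).
    QuicDatagram,
    /// Send in a DATAGRAM capsule on the request stream (RFC 9297).
    Capsule,
    /// Drop the payload rather than falling back to a reliable capsule
    /// (RFC 9298, section 6.1).
    Drop,
}

/// `remote_datagram_size`: the peer's maximum DATAGRAM frame size, with 0
/// assumed to mean "not negotiated or disabled".
/// `frame_len`: size of the DATAGRAM frame needed to carry the payload.
fn choose_path(remote_datagram_size: u64, frame_len: u64) -> Path {
    if remote_datagram_size == 0 {
        // QUIC datagrams are unavailable, so capsules are the only option.
        Path::Capsule
    } else if frame_len <= remote_datagram_size {
        Path::QuicDatagram
    } else {
        // QUIC datagrams are in use but this payload does not fit.
        // Sending it reliably would defeat DPLPMTUD, so it is dropped.
        Path::Drop
    }
}
```

The point of the third branch is that a send error on the QUIC datagram path is not by itself a reason to switch to capsules; only the absence of datagram support is.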

@WaterWhisperer WaterWhisperer force-pushed the capsule branch 2 times, most recently from 4e7e90a to 50cf12d Compare November 27, 2025 11:31
Copilot AI review requested due to automatic review settings November 27, 2025 11:31
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.


now: Instant,
) -> Res<bool> {
let capsule = Capsule::Datagram {
payload: Bytes::from(buf.to_vec()),

Copilot AI Nov 29, 2025

Unnecessary allocation: buf.to_vec() creates a copy of the buffer. Since Bytes can be constructed from a slice reference in many contexts, consider if this copy is necessary or if Bytes::copy_from_slice(buf) would be more idiomatic to make the copy explicit.

Suggested change
- payload: Bytes::from(buf.to_vec()),
+ payload: Bytes::copy_from_slice(buf),

Contributor Author

@WaterWhisperer WaterWhisperer Dec 14, 2025

The Bytes here is not the one from the bytes crate, so it doesn't have a copy_from_slice method. Do we need to add a copy_from_slice method to the Bytes type in the codebase?
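If such a method were added, it might look like the sketch below. The `Bytes` struct here is a hypothetical stand-in (a plain owned buffer) and not the actual type in the codebase, whose fields and constructors may differ. Either way, copying from a borrowed slice costs one allocation; the constructor only makes that copy explicit at the call site.

```rust
/// Hypothetical stand-in for the codebase's own `Bytes` type.
/// The real type may store its data differently.
#[derive(Debug, PartialEq, Eq)]
struct Bytes {
    data: Vec<u8>,
}

impl Bytes {
    /// Copy `src` into a new owned buffer (one allocation).
    fn copy_from_slice(src: &[u8]) -> Self {
        Self { data: src.to_vec() }
    }

    /// Borrow the stored bytes.
    fn as_slice(&self) -> &[u8] {
        &self.data
    }
}

impl From<Vec<u8>> for Bytes {
    /// Take ownership of an existing vector without copying.
    fn from(data: Vec<u8>) -> Self {
        Self { data }
    }
}
```

With this stand-in, `Bytes::copy_from_slice(buf)` and `Bytes::from(buf.to_vec())` produce equal values and allocate the same amount, so the change would be about readability rather than performance.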

if let Some(payload) = data {
qdebug!("Decoded Datagram Capsule len={}", payload.len());
return Ok(Some(Self::Datagram {
payload: Bytes::from(payload.to_vec()),

Copilot AI Nov 29, 2025

Unnecessary allocation: payload.to_vec() creates a copy when payload is already a slice. Since Bytes::from() can take a Vec<u8> directly, this creates a redundant allocation. Consider using Bytes::copy_from_slice(payload) to make the copy operation explicit and idiomatic.

Suggested change
- payload: Bytes::from(payload.to_vec()),
+ payload: Bytes::copy_from_slice(payload),

Contributor Author

ditto

@WaterWhisperer
Contributor Author

Sorry for the late follow-up; I was busy with final exams for a while.

Member

@mxinden mxinden left a comment

I will take a more in-depth look. Thanks for the continued work on this pull request.

@github-actions
Contributor

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to main at 83e264c.

neqo-pr as client
neqo-pr vs. aioquic: A 🚀L1
neqo-pr vs. go-x-net: A BP BA
neqo-pr vs. haproxy: A BP BA
neqo-pr vs. kwik: 🚀C1 BP BA
neqo-pr vs. linuxquic: A 🚀C1
neqo-pr vs. lsquic: L1 C1
neqo-pr vs. msquic: A L1 C1
neqo-pr vs. mvfst: A ⚠️L1 C1 ⚠️BA
neqo-pr vs. nginx: A ⚠️C1 BP BA
neqo-pr vs. ngtcp2: A 🚀L1 ⚠️C1 CM
neqo-pr vs. picoquic: A ⚠️C1
neqo-pr vs. quic-go: A 🚀C1
neqo-pr vs. quiche: A 🚀L1 ⚠️C1 BP BA
neqo-pr vs. quinn: A 🚀L1
neqo-pr vs. s2n-quic: A L1 C1 🚀BP BA CM
neqo-pr vs. tquic: S A L1 ⚠️C1 BP BA
neqo-pr vs. xquic: A L1

neqo-pr as server
aioquic vs. neqo-pr: CM
go-x-net vs. neqo-pr: CM
kwik vs. neqo-pr: BP BA CM
linuxquic vs. neqo-pr: 🚀BP ⚠️BA
lsquic vs. neqo-pr: ⚠️L1 C1
msquic vs. neqo-pr: 🚀C1 CM
mvfst vs. neqo-pr: Z A L1 C1 CM
openssl vs. neqo-pr: LR M A CM
quic-go vs. neqo-pr: CM
quiche vs. neqo-pr: CM
quinn vs. neqo-pr: V2 CM
s2n-quic vs. neqo-pr: CM
tquic vs. neqo-pr: CM
xquic vs. neqo-pr: M CM
All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-pr as client

neqo-pr as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-pr as client

neqo-pr as server

@github-actions
Contributor

Client/server transfer results

Performance differences relative to bece9bf.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ main | Δ main (%)
neqo-s2n-cubic | 219.4 ± 4.4 | 212.9 | 227.2 | 145.8 ± 7.3 | 💚 -2.0 | -0.9%

Table above only shows statistically significant changes. See all results below.

All results

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ main | Δ main (%)
google-google-nopacing | 445.4 ± 4.6 | 437.5 | 458.7 | 71.8 ± 7.0 | |
google-neqo-cubic | 273.2 ± 4.6 | 264.8 | 288.9 | 117.1 ± 7.0 | -0.6 | -0.2%
msquic-msquic-nopacing | 161.0 ± 34.0 | 135.0 | 404.8 | 198.7 ± 0.9 | |
msquic-neqo-cubic | 185.9 ± 20.9 | 146.8 | 286.4 | 172.1 ± 1.5 | -6.5 | -3.4%
neqo-google-cubic | 744.2 ± 4.7 | 736.0 | 755.4 | 43.0 ± 6.8 | 0.9 | 0.1%
neqo-msquic-cubic | 160.2 ± 4.4 | 151.4 | 168.2 | 199.8 ± 7.3 | 0.4 | 0.3%
neqo-neqo-cubic | 96.9 ± 5.3 | 87.8 | 115.9 | 330.4 ± 6.0 | 0.2 | 0.3%
neqo-neqo-cubic-nopacing | 96.0 ± 3.9 | 87.7 | 107.0 | 333.5 ± 8.2 | -0.6 | -0.6%
neqo-neqo-reno | 96.3 ± 4.8 | 87.6 | 115.1 | 332.5 ± 6.7 | -0.6 | -0.6%
neqo-neqo-reno-nopacing | 96.0 ± 4.2 | 86.7 | 104.7 | 333.4 ± 7.6 | 0.2 | 0.2%
neqo-quiche-cubic | 190.9 ± 3.7 | 186.1 | 204.2 | 167.7 ± 8.6 | 0.6 | 0.3%
neqo-s2n-cubic | 219.4 ± 4.4 | 212.9 | 227.2 | 145.8 ± 7.3 | 💚 -2.0 | -0.9%
quiche-neqo-cubic | 154.4 ± 7.2 | 143.1 | 195.0 | 207.3 ± 4.4 | 0.2 | 0.2%
quiche-quiche-nopacing | 144.3 ± 4.7 | 136.0 | 161.2 | 221.8 ± 6.8 | |
s2n-neqo-cubic | 174.1 ± 4.5 | 165.6 | 186.1 | 183.8 ± 7.1 | 0.4 | 0.2%
s2n-s2n-nopacing | 248.3 ± 25.8 | 231.0 | 370.1 | 128.9 ± 1.2 | |

Download data for profiler.firefox.com or download performance comparison data.

@github-actions
Contributor

Benchmark results

Significant performance differences relative to bece9bf.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved by -1.4772%.
       time:   [203.33 ms 203.81 ms 204.35 ms]
       thrpt:  [489.37 MiB/s 490.66 MiB/s 491.81 MiB/s]
change:
       time:   [-1.8059% -1.4772% -1.1479] (p = 0.00 < 0.05)
       thrpt:  [+1.1612% +1.4993% +1.8391]
       Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: 💚 Performance has improved by -2.5708%.
       time:   [278.58 ms 280.36 ms 282.20 ms]
       thrpt:  [35.436 Kelem/s 35.668 Kelem/s 35.896 Kelem/s]
change:
       time:   [-3.3846% -2.5708% -1.6624] (p = 0.00 < 0.05)
       thrpt:  [+1.6905% +2.6386% +3.5032]
       Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
All results
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved by -1.4772%.
       time:   [203.33 ms 203.81 ms 204.35 ms]
       thrpt:  [489.37 MiB/s 490.66 MiB/s 491.81 MiB/s]
change:
       time:   [-1.8059% -1.4772% -1.1479] (p = 0.00 < 0.05)
       thrpt:  [+1.1612% +1.4993% +1.8391]
       Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: 💚 Performance has improved by -2.5708%.
       time:   [278.58 ms 280.36 ms 282.20 ms]
       thrpt:  [35.436 Kelem/s 35.668 Kelem/s 35.896 Kelem/s]
change:
       time:   [-3.3846% -2.5708% -1.6624] (p = 0.00 < 0.05)
       thrpt:  [+1.6905% +2.6386% +3.5032]
       Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [38.574 ms 38.746 ms 38.939 ms]
       thrpt:  [25.681   B/s 25.809   B/s 25.924   B/s]
change:
       time:   [-0.6406% -0.0002% +0.6609] (p = 1.00 > 0.05)
       thrpt:  [-0.6565% +0.0002% +0.6448]
       No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) low mild
1 (1.00%) high mild
7 (7.00%) high severe
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: No change in performance detected.
       time:   [209.95 ms 210.26 ms 210.63 ms]
       thrpt:  [474.76 MiB/s 475.60 MiB/s 476.31 MiB/s]
change:
       time:   [-0.3872% -0.1838% +0.0471] (p = 0.09 > 0.05)
       thrpt:  [-0.0471% +0.1841% +0.3887]
       No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
decode 4096 bytes, mask ff: No change in performance detected.
       time:   [4.5063 µs 4.5146 µs 4.5242 µs]
       change: [-0.4197% -0.0619% +0.3612] (p = 0.77 > 0.05)
       No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [1.1611 ms 1.1627 ms 1.1644 ms]
       change: [-0.6179% +0.2720% +1.1887] (p = 0.55 > 0.05)
       No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) high mild
10 (10.00%) high severe
decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [5.7900 µs 5.7981 µs 5.8062 µs]
       change: [-0.2691% -0.0104% +0.2541] (p = 0.93 > 0.05)
       No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
decode 1048576 bytes, mask 7f: Change within noise threshold.
       time:   [1.4750 ms 1.4774 ms 1.4799 ms]
       change: [-0.8964% -0.6496% -0.3984] (p = 0.00 < 0.05)
       Change within noise threshold.
decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [5.5459 µs 5.5540 µs 5.5622 µs]
       change: [-0.3176% +0.1517% +0.6333] (p = 0.57 > 0.05)
       No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.4157 ms 1.4178 ms 1.4200 ms]
       change: [-0.5450% -0.1730% +0.1440] (p = 0.34 > 0.05)
       No change in performance detected.
1-streams/each-1000-bytes/wallclock-time: No change in performance detected.
       time:   [583.43 µs 586.33 µs 591.14 µs]
       change: [-0.8030% -0.1232% +0.7874] (p = 0.79 > 0.05)
       No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high severe
1-streams/each-1000-bytes/simulated-time: No change in performance detected.
       time:   [118.84 ms 119.06 ms 119.29 ms]
       thrpt:  [8.1868 KiB/s 8.2021 KiB/s 8.2174 KiB/s]
change:
       time:   [-0.3842% -0.1054% +0.1788] (p = 0.46 > 0.05)
       thrpt:  [-0.1785% +0.1055% +0.3857]
       No change in performance detected.
1000-streams/each-1-bytes/wallclock-time: Change within noise threshold.
       time:   [12.455 ms 12.495 ms 12.535 ms]
       change: [+0.5750% +1.0337% +1.5138] (p = 0.00 < 0.05)
       Change within noise threshold.
1000-streams/each-1-bytes/simulated-time: No change in performance detected.
       time:   [2.3300 s 2.3335 s 2.3371 s]
       thrpt:  [427.89   B/s 428.54   B/s 429.19   B/s]
change:
       time:   [-0.1636% +0.0439% +0.2554] (p = 0.68 > 0.05)
       thrpt:  [-0.2548% -0.0439% +0.1639]
       No change in performance detected.
1000-streams/each-1000-bytes/wallclock-time: No change in performance detected.
       time:   [50.457 ms 50.677 ms 51.007 ms]
       change: [-0.3663% +0.1543% +0.8349] (p = 0.67 > 0.05)
       No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
1000-streams/each-1000-bytes/simulated-time: No change in performance detected.
       time:   [15.904 s 16.143 s 16.385 s]
       thrpt:  [59.601 KiB/s 60.494 KiB/s 61.404 KiB/s]
change:
       time:   [-2.9559% -0.9114% +1.2738] (p = 0.40 > 0.05)
       thrpt:  [-1.2578% +0.9198% +3.0459]
       No change in performance detected.
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [89.202 ns 89.529 ns 89.869 ns]
       change: [-0.5977% +0.5766% +2.7139] (p = 0.59 > 0.05)
       No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
7 (7.00%) high mild
3 (3.00%) high severe
coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [105.83 ns 106.17 ns 106.52 ns]
       change: [-0.4414% +0.0139% +0.5065] (p = 0.96 > 0.05)
       No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
1 (1.00%) high mild
13 (13.00%) high severe
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [105.14 ns 105.47 ns 105.92 ns]
       change: [-0.6691% -0.2361% +0.2126] (p = 0.30 > 0.05)
       No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
5 (5.00%) high severe
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [90.216 ns 90.336 ns 90.471 ns]
       change: [+0.0041% +0.4465% +0.9284] (p = 0.05 > 0.05)
       No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) high mild
4 (4.00%) high severe
RxStreamOrderer::inbound_frame(): No change in performance detected.
       time:   [108.49 ms 108.66 ms 108.96 ms]
       change: [-0.3594% -0.0422% +0.2915] (p = 0.83 > 0.05)
       No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
9 (9.00%) low mild
1 (1.00%) high severe
sent::Packets::take_ranges: No change in performance detected.
       time:   [4.4552 µs 4.5793 µs 4.7100 µs]
       change: [-4.6729% -1.4920% +1.9998] (p = 0.39 > 0.05)
       No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
transfer/pacing-false/varying-seeds/wallclock-time/run: Change within noise threshold.
       time:   [23.765 ms 23.786 ms 23.810 ms]
       change: [-1.1466% -1.0419% -0.9278] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected.
       time:   [23.941 s 23.941 s 23.941 s]
       thrpt:  [171.09 KiB/s 171.09 KiB/s 171.09 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000]
       No change in performance detected.
transfer/pacing-true/varying-seeds/wallclock-time/run: Change within noise threshold.
       time:   [23.938 ms 23.970 ms 24.018 ms]
       change: [-1.2660% -1.0953% -0.8629] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
transfer/pacing-true/varying-seeds/simulated-time/run: No change in performance detected.
       time:   [23.676 s 23.676 s 23.676 s]
       thrpt:  [173.01 KiB/s 173.01 KiB/s 173.01 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000]
       No change in performance detected.
transfer/pacing-false/same-seed/wallclock-time/run: No change in performance detected.
       time:   [23.838 ms 23.853 ms 23.869 ms]
       change: [-0.2549% -0.0375% +0.1164] (p = 0.75 > 0.05)
       No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected.
       time:   [23.941 s 23.941 s 23.941 s]
       thrpt:  [171.09 KiB/s 171.09 KiB/s 171.09 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000]
       No change in performance detected.
transfer/pacing-true/same-seed/wallclock-time/run: Change within noise threshold.
       time:   [24.206 ms 24.226 ms 24.247 ms]
       change: [-0.2936% -0.1788% -0.0654] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected.
       time:   [23.676 s 23.676 s 23.676 s]
       thrpt:  [173.01 KiB/s 173.01 KiB/s 173.01 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000]
       No change in performance detected.

Download data for profiler.firefox.com or download performance comparison data.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Should Neqo support the HTTP DATAGRAM Capsule?

3 participants