Skip to content

Reduce memory footprint of HTTP/2 connections #112719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 6, 2025

Conversation

MihaZupan
Copy link
Member

Since the logic is shared, this applies to both SocketsHttpHandler and Kestrel.
Possibly addresses dotnet/aspnetcore#60313
cc: @AlgorithmsAreCool, @halter73, @JamesNK for the server side

HPackDecoder allocates 4 buffers:

  • 4 KB _stringOctets, 4 KB _headerNameOctets, 4 KB _headerValueOctets
  • 3 KB DynamicTable

IMO these are oversized for initial sizes:

  • names are way smaller in practice
  • 4 KB is quite large for a single value, except maybe for large cookies
  • if a peer doesn't use dynamic header indexing (e.g. HttpClient), the 3 KB buffer will be unused

I changed these to start very small, letting them grow as needed.
We already have resizing logic, I only modified DynamicTable to also support starting with a non-max-size buffer (it already could resize afterwards).

Looking at HttpClient talking to Kestrel, I expect this should save around

  • 4-11 KB for HttpClient (at least the name buffer, likely a chunk of the value)
    • For cases with few requests (e.g. single long-lived stream) this could be ~14 KB since DynamicTable will stay small
  • 7-14 KB for Kestrel (at least the name and dynamic table, likely a chunk of the value)

@MihaZupan MihaZupan added this to the 10.0.0 milestone Feb 20, 2025
@MihaZupan MihaZupan requested a review from a team February 20, 2025 00:22
@MihaZupan MihaZupan self-assigned this Feb 20, 2025
@Copilot Copilot AI review requested due to automatic review settings February 20, 2025 00:22
Copy link
Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

public const int DefaultStringOctetsSize = 4096;

// This is the initial size. Buffers will be dynamically resized as needed.
public const int DefaultStringOctetsSize = 32;
Copy link
Preview

Copilot AI Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reduced default string octets size of 32 may lead to more frequent buffer resizes if header values typically exceed this length; please ensure that the change is backed by performance benchmarks for realistic workloads.

Copilot uses AI. Check for mistakes.

Copy link
Member Author

@MihaZupan MihaZupan Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could see more intermediate smaller allocations, but that is offset by the fact we're allocating less overall on average.
E.g. we may allocate more for value buffers in the worst case, but we'll allocate less for names. In either case these are temporary per-connection costs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have data on expected string sizes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that hugely varies by server, but I can offer some educated guesses.
The largest headers tend to be used when the client is a browser - Cookie/Set-Cookie, content-security-policy, Report-To, ..., which can reach several KB.

From what I've seen, API endpoints typically use fewer and much shorter headers.
E.g. from a quick test of traffic on my machine over a few minutes:

Largest single header value returned by various APIs
discord.com                                        4198
store.fastly.steamstatic.com                       1761
steamcommunity.com                                 1484
assets.msn.com                                     1236
store.steampowered.com                             1184 // ^ These are essentially browsers
login.microsoftonline.com                          513
ocws.officeapps.live.com                           424
roaming.officeapps.live.com                        416
odc.officeapps.live.com                            393
res.public.onecdn.static.microsoft                 334
api.github.com                                     323
gateway.discord.gg                                 267
updates.discord.com                                259
cdn.discordapp.com                                 259
latency.discord.media                              257
update.googleapis.com                              213
edge.microsoft.com                                 205
arc.msn.com                                        193
api.spotify.com                                    150
status.discord.com                                 145
graph.microsoft.com                                126
app.vssps.visualstudio.com                         117
api.vstsusers.visualstudio.com                     117
aka.ms                                             94
steam-chat.com                                     91
go.microsoft.com                                   88
default.exp-tas.com                                86
management.azure.com                               86
shared.fastly.steamstatic.com                      72
avatars.githubusercontent.com                      66
inputsuggestions.msdxcdn.microsoft.com             65
static.edge.microsoftapp.net                       65
global.rel.tunnels.api.visualstudio.com            61
mihubot.xyz                                        49
targetednotifications-tm.trafficmanager.net        49
cdn.steamstatic.com                                46
binaries.templates.cdn.office.net                  44
client-update.fastly.steamstatic.com               40
settings.visualstudio.microsoft.com                36
api.enterprise.githubcopilot.com                   36
login.live.com                                     36
telemetry.visualstudio.microsoft.com               36
telemetry.enterprise.githubcopilot.com             35
cdn.fastly.steamstatic.com                         29
avatars.steamstatic.com                            29
crash.steampowered.com                             29
self.events.data.microsoft.com                     29
api.steampowered.com                               29
steamcommunity-a.akamaihd.net                      29
vortex.data.microsoft.com                          29
metadata.templates.cdn.office.net                  29
mobile.events.data.microsoft.com                   29
cmp1-fra2.steamserver.net                          29
dealer.spotify.com                                 28

Header names are practically all tiny to no one's surprise.

When talking about HPackDecoder's buffers specifically, it's worth noting that there are effectively two value buffers in use: _headerNameOctets, _headerValueOctets.
The former will be used if a header is split and read across multiple packets, or if huffman encoding was used (HttpClient/Kestel don't use huffman rn). In a bunch of cases it is therefore completely unused.

@JamesNK
Copy link
Member

JamesNK commented Feb 20, 2025

@AlgorithmsAreCool
Copy link
Contributor

This would go a long way towards addressing dotnet/aspnetcore#60313. Probably reducing about 25% of memory footprint per connection.

var newBuffer = new HeaderField[maxSize / HeaderField.RfcOverhead];
Debug.Assert(_count + 1 <= _maxSize / HeaderField.RfcOverhead);

int newBufferSize = Math.Min(Math.Max(4, _buffer.Length * 2), _maxSize / HeaderField.RfcOverhead);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone is using the dynamic table feature, is it likely they'll only have 4 items in the table? Maybe I misunderstand typical use, but that seems small.

Copy link
Member Author

@MihaZupan MihaZupan Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll heavily depend on the use case, but you're right, 4 is likely to be exceeded, it should likely start larger.
E.g. if you sent a single request, this might be the number of response headers for that request.
If you're hitting an API where response headers look the same, it might never grow after that.
If you were to use HttpClient as a crawler, then you're practically guaranteed to either use 0 if the server doesn't use it, or maxSize if it does.

If we wanted to, we could change this to something like newSize = _buffer.Length == 0 ? 16 : maxCapacity.
I just changed the initial capacity to 16 for now.

Copy link
Member

@ManickaP ManickaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just few small questions.
:shipit:

var entry = new HeaderField(staticTableIndex, name, value);
_buffer[_insertIndex] = entry;

if (++_insertIndex == _buffer.Length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No feedback, just I'm curious why change the modulo to this? Is there some advantage to have it like this?

Copy link
Member Author

@MihaZupan MihaZupan Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#112719 (comment)

It's a microoptimization since modulo is so expensive (relatively) that a predicted branch can be cheaper.
We use a similar trick in Queue for example

// Increments the index wrapping it if necessary.
private void MoveNext(ref int index)
{
// It is tempting to use the remainder operator here but it is actually much slower
// than a simple comparison and a rarely taken branch.
// JIT produces better code than with ternary operator ?:
int tmp = index + 1;
if (tmp == _array.Length)
{
tmp = 0;
}
index = tmp;
}

@MihaZupan MihaZupan merged commit d5925b6 into dotnet:main Mar 6, 2025
82 of 84 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Apr 6, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants