
WIP: tls: scope upstream session cache by sni #45982

Draft
dio wants to merge 3 commits into envoyproxy:main from dio:dio/tls-sni-session-cache

dio commented Jul 5, 2026

Member

Commit Message: tls: scope upstream session cache by sni

Additional Description:
This fixes upstream client TLS session cache keying when one ClientContextImpl is used for multiple logical upstream hosts. Previously, client sessions were cached at the context level, so a session learned while connecting with one effective SNI could be offered on a later connection using a different effective SNI.

This is treated as a bug fix because the old cache key was broader than the TLS identity used for the connection. Session resumption should be scoped to the logical server identity represented by the effective SNI, not just to the Envoy client context object that created the connection.
If reviewers consider the max_session_keys semantics change more prominent than the cache-keying fix, I can move the changelog entry to minor behavior changes.

The cache is now scoped by effective SNI using the same precedence as ClientHello SNI selection:

  1. transport socket server_name override
  2. auto_host_sni host hostname
  3. static upstream TLS sni
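
The precedence above can be sketched as a small resolver. This is only an illustration: `SniInputs` and `effectiveSni` are hypothetical names, not the types used in this PR, and treating an empty override or empty hostname as "not set" is an assumption of the sketch.

```cpp
#include <optional>
#include <string>

// Hypothetical inputs; field names are illustrative, not Envoy's real types.
struct SniInputs {
  std::optional<std::string> transport_socket_server_name; // per-connection override
  bool auto_host_sni = false;                              // auto_host_sni enabled
  std::string host_hostname;                               // upstream host hostname
  std::string static_sni;                                  // sni from the upstream TLS context
};

// Returns the effective SNI using the three-step precedence listed above.
// Assumption: empty values are treated as unset and fall through.
std::string effectiveSni(const SniInputs& in) {
  if (in.transport_socket_server_name.has_value() &&
      !in.transport_socket_server_name->empty()) {
    return *in.transport_socket_server_name; // 1. transport socket override
  }
  if (in.auto_host_sni && !in.host_hostname.empty()) {
    return in.host_hostname; // 2. auto_host_sni host hostname
  }
  return in.static_sni; // 3. static upstream TLS sni
}
```

The value returned here is the same string that would key the session cache bucket in the examples that follow in this description.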

For example, with a single upstream TLS context:

  1. connect to a.example.com
    • full handshake
    • cache stores the session under SNI bucket a.example.com
  2. connect to b.example.com
    • does not use the a.example.com cached session
    • full handshake
    • cache stores the session under SNI bucket b.example.com
  3. connect to a.example.com again
    • may reuse the session from the a.example.com bucket
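
The walkthrough above can be modeled with a toy cache keyed by SNI. This is a minimal sketch, not the PR's implementation: `SniScopedSessionCache` is a hypothetical class, and sessions are plain strings standing in for real TLS session objects.

```cpp
#include <deque>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Toy model: sessions are opaque strings rather than real TLS session objects.
class SniScopedSessionCache {
public:
  // Store a session learned on a connection that used `sni`.
  void store(const std::string& sni, std::string session) {
    buckets_[sni].push_front(std::move(session));
  }

  // Return the newest session for `sni`, if any.
  // Buckets belonging to other SNIs are never consulted.
  std::optional<std::string> lookup(const std::string& sni) const {
    auto it = buckets_.find(sni);
    if (it == buckets_.end() || it->second.empty()) {
      return std::nullopt;
    }
    return it->second.front();
  }

private:
  std::unordered_map<std::string, std::deque<std::string>> buckets_;
};
```

In this model, a lookup for b.example.com after storing only an a.example.com session returns nothing, which corresponds to the full handshake in step 2.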

The existing max_session_keys setting now limits sessions within each effective SNI bucket. A fixed internal cap bounds the number of distinct SNI buckets. When the distinct-SNI cap is exceeded, Envoy evicts the least-recently-used SNI bucket, dropping only its cached sessions. Future connections for the evicted SNI still work normally; they perform a full TLS handshake and may populate the cache again.
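
A rough sketch of the two bounds described above: a per-bucket limit and an LRU cap on the number of buckets. `BoundedSniSessionCache` and its constructor parameters are hypothetical, sessions are plain strings, and the choice that only `store` refreshes a bucket's recency is an assumption of this sketch rather than a statement about the PR.

```cpp
#include <cstddef>
#include <deque>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

class BoundedSniSessionCache {
public:
  BoundedSniSessionCache(std::size_t max_session_keys, std::size_t max_sni_buckets)
      : max_session_keys_(max_session_keys), max_sni_buckets_(max_sni_buckets) {}

  void store(const std::string& sni, std::string session) {
    if (max_session_keys_ == 0 || max_sni_buckets_ == 0) {
      return; // Assumption: a zero limit disables caching.
    }
    Bucket& bucket = touch(sni);
    bucket.sessions.push_front(std::move(session));
    if (bucket.sessions.size() > max_session_keys_) {
      bucket.sessions.pop_back(); // Drop the oldest session in this bucket only.
    }
  }

  std::size_t sessionCount(const std::string& sni) const {
    auto it = buckets_.find(sni);
    return it == buckets_.end() ? 0 : it->second.sessions.size();
  }

  std::size_t bucketCount() const { return buckets_.size(); }

private:
  struct Bucket {
    std::deque<std::string> sessions;
    std::list<std::string>::iterator lru_pos;
  };

  // Marks `sni` as most recently used. Creates its bucket if missing,
  // evicting the least-recently-used bucket first when the cap is reached.
  Bucket& touch(const std::string& sni) {
    auto it = buckets_.find(sni);
    if (it != buckets_.end()) {
      lru_.splice(lru_.begin(), lru_, it->second.lru_pos);
      return it->second;
    }
    if (buckets_.size() >= max_sni_buckets_) {
      buckets_.erase(lru_.back());
      lru_.pop_back();
    }
    lru_.push_front(sni);
    Bucket& bucket = buckets_[sni];
    bucket.lru_pos = lru_.begin();
    return bucket;
  }

  std::size_t max_session_keys_;
  std::size_t max_sni_buckets_;
  std::list<std::string> lru_; // Front is the most recently used SNI.
  std::unordered_map<std::string, Bucket> buckets_;
};
```

Evicting a bucket only discards cached sessions; in this model a later `store` for the same SNI simply recreates the bucket, mirroring the "full handshake, then repopulate" behavior described above.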

This behavior is guarded by the default-on runtime feature envoy.reloadable_features.scope_upstream_tls_session_cache_by_sni. Disabling the runtime feature restores the previous context-wide session cache behavior as a rollback path while the guard exists.
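
One way to picture the rollback path is as a change in the cache key only. The helper below is hypothetical and not taken from the PR; it assumes the guard's value has already been read into a boolean and that the legacy context-wide cache can be modeled as a single shared bucket with an empty key.

```cpp
#include <string>

// Hypothetical helper: picks the session cache bucket key.
// With the guard enabled, each effective SNI gets its own bucket.
// With the guard disabled, every connection shares one context-wide bucket,
// modeled here (as an assumption) by the empty key.
std::string sessionCacheKey(bool scope_by_sni_enabled, const std::string& effective_sni) {
  return scope_by_sni_enabled ? effective_sni : std::string();
}
```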

No dynamic module ABI or config API changes are included in this PR.

AI assistance was used to help implement and test this change. I reviewed and understand the submitted code.

Risk Level: Medium

Testing:

  • bazel test --config=clang -c dbg //test/common/tls:ssl_socket_test --test_filter=*ClientSessionCache*
  • PATH=/tmp/envoy-format-venv/bin:/Library/Developer/CommandLineTools/usr/bin:/opt/homebrew/bin:$PATH ASPELL_DICT=tools/spelling/spelling_dictionary.txt tools/local_fix_format.sh -skip-bazel -main
  • git diff --check

Docs Changes: Added changelog fragment.

Release Notes: Added bug fix note for upstream TLS session cache scoping by SNI.

Platform Specific Features: N/A

Issues:
#45962

dio added 2 commits July 4, 2026 22:11
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
@repokitteh-read-only


As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #45982 was opened by dio.


dio changed the title from "tls: scope upstream session cache by sni" to "WIP: tls: scope upstream session cache by sni" Jul 5, 2026
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>