Skip to content

upstream: reject and gracefully handle invalid bind network namespace#45721

Closed
bpalermo wants to merge 1 commit into
envoyproxy:mainfrom
bpalermo:bug/upstream-bind-netns-validation
Closed

upstream: reject and gracefully handle invalid bind network namespace#45721
bpalermo wants to merge 1 commit into
envoyproxy:mainfrom
bpalermo:bug/upstream-bind-netns-validation

Conversation

@bpalermo

Copy link
Copy Markdown

Commit Message

upstream: reject and gracefully handle invalid bind network namespace

Additional Description

Binding an upstream connection to a configured Linux network namespace
(SocketAddress.network_namespace_filepath) could crash Envoy when the namespace did
not exist: DispatcherImpl::createClientConnection returns nullptr on netns failure,
and the result was dereferenced unconditionally in HostImplBase::createConnection. The
listener (LDS) path already rejected such configs gracefully; only the cluster (CDS)
upstream-bind path crashed.

This fixes the crash from two angles:

  • Config rejection (CDS): a cluster whose upstream_bind_config source address
    references a network namespace that cannot be opened is now rejected at config
    admission time, mirroring the listener behavior. Guarded by the runtime feature
    envoy.reloadable_features.reject_invalid_bind_network_namespace so it can be reverted
    on a live instance.
  • Graceful runtime failure: if the namespace becomes unavailable after admission, the
    host connection factory returns a null connection and the TCP/HTTP1/HTTP2/mixed
    connection pools surface this as a FailedToCreateConnection pool failure — matching how
    the HTTP/3 pool already behaves — instead of crashing.

Risk Level

Low — the new rejection is runtime-guarded; the runtime-safety change converts a crash into
an existing graceful failure path.

Testing

Unit tests added: Network::Utility::validateNetworkNamespace (success/open-failure), and
CDS cluster rejection with the runtime guard enabled and disabled.

Docs Changes

N/A

Release Notes

Added a bug-fix changelog fragment.

Platform Specific Features

Network namespace binding is Linux-only; the new validation and code paths are
#if defined(__linux__)-guarded.

Binding an upstream connection to a configured Linux network namespace
(`SocketAddress.network_namespace_filepath`) could crash Envoy when the
namespace did not exist: `createClientConnection` returned nullptr and the
result was dereferenced unconditionally in `HostImplBase::createConnection`.

This change addresses the crash from two angles:

- A cluster whose `upstream_bind_config` source address references a
  network namespace that cannot be opened is now rejected at config
  admission time, mirroring the existing listener (LDS) behavior. This is
  guarded by the runtime feature
  `envoy.reloadable_features.reject_invalid_bind_network_namespace` so it
  can be reverted on a live instance.

- If the namespace becomes unavailable at runtime (after admission), the
  upstream connection now fails gracefully instead of crashing. The host
  connection factory returns a null connection, and the TCP/HTTP1/HTTP2/
  mixed connection pools surface this as a `FailedToCreateConnection`
  pool failure, matching how the HTTP/3 pool already behaves.

Signed-off-by: Bruno Palermo <b@palermo.dev>
@repokitteh-read-only

Copy link
Copy Markdown

Hi @bpalermo, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #45721 was opened by bpalermo.

see: more, trace.

@bpalermo bpalermo requested a deployment to external-contributors June 20, 2026 15:03 — with GitHub Actions Waiting
@repokitteh-read-only

Copy link
Copy Markdown

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

🐱

Caused by: #45721 was opened by bpalermo.

see: more, trace.

@yanavlasov

Copy link
Copy Markdown
Contributor

Thanks for this improvement. Would it be possible to split this PR into two to deal with different concerns: namespace validation at config ingestion time and preventing hard crash when socket can not be bound to a namespace. It will make review easier.

For config validation, I do not think runtime guard is the right approach, since it is temporary in nature (removed after 6 months). I think there are scenarios where namespace might not exist at the time when Envoy parses cluster config, so NACKing it would be wrong at this point. But may exist later when traffic is flowing. Listener socket bind config is different, since it needs to be bound immediately when config is ingested, so it makes sense to check it right away. I wonder if this should be an bool option in the bind config.

/wait-any

@bpalermo

bpalermo commented Jul 4, 2026

Copy link
Copy Markdown
Author

@yanavlasov thanks for the review — that all makes sense. I've split this PR in two:

  • upstream: gracefully handle upstream connection creation failure #45975 — the crash fix only: HostImplBase::createConnection now surfaces a null connection and the TCP/HTTP1/HTTP2/mixed pools turn it into a FailedToCreateConnection pool failure, matching the HTTP/3 pool. No runtime guard, since it only converts a crash into the existing graceful failure path.
  • upstream: add opt-in validation of bind config network namespaces #45976 — the config-time validation, reworked per your suggestion: instead of a runtime guard it adds an opt-in BindConfig.validate_network_namespaces bool (default false), so eager validation is only performed when the user explicitly asks for it. That keeps the "namespace appears after config load" scenario working by default. Happy to adjust the field name/placement if you'd prefer something different.

Closing this PR in favor of the two above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants