-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Description
Title: Expose detailed TLS certificate validation failure reasons in Envoy access logs (beyond generic “CERTIFICATE_VERIFY_FAILED”)
Description:
As an administrator, I would like more detailed insights into TLS handshake failures—particularly during certificate validation—so I can better troubleshoot with upstream service owners.
Currently, when Envoy fails to connect due to a certificate validation issue, the logs and error messages are too generic to act on. For example, we see the following in logs:
upstream connect error or disconnect/reset before headers. reset reason: remote connection failure, transport failure reason: TLS_error:|268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED:TLS_error_end
There is no actionable detail to help identify why the validation failed—e.g., missing SAN, expired cert, wrong issuer, etc.
Expose more specific certificate validation failure details in the logs—preferably through access log metadata, error metadata, or a specific log field—without requiring debug-level logging.
For example:
• “SAN does not match hostname”
• “Certificate expired”
• “Unknown CA”
• “Issuer not trusted”
• etc.
Even exposing a string like the one returned by x509_verify_cert_error_string() would be immensely helpful.
Implementation
From reviewing the Envoy source, this seems feasible:
• The method CertValidator::doVerifyCertChain already calls x509_verify_cert_error_string() during validation.
↳ source/common/tls/cert_validator/default_validator.cc
• The result of this call could be logged or attached to the upstream connection metadata for inclusion in access logs or error tracking.
• Ideally, this would be log-only (not returned to clients) and available under normal logging levels (e.g., info or warn), avoiding the need for full debug logging.
[optional Relevant Links:]
envoy/source/common/tls/cert_validator/default_validator.cc at 38b47e7adae0ff4c4a94c1d841fed0326a26f5b6 · envoyproxy/envoy