Title: Segmentation fault in Envoy 1.32.1
Description:
Envoy should not crash when the hostname of one of the Redis cluster replicas cannot be resolved through DNS. This issue has been triaged by Envoy Security.
Repro steps:
I can consistently reproduce the crash with a cluster configured with static service discovery in which one Redis replica has a hostname that DNS cannot resolve.
Config:
connect_timeout: 0.25s
dns_lookup_family: V4_ONLY
load_assignment:
  cluster_name: {{ instance.name }}
  endpoints:
  {% for endpoint in instance.endpoints %}
  - lb_endpoints:
    {% for port in endpoint.ports %}
    - endpoint:
        address:
          socket_address:
            address: {{ endpoint.address }}
            port_value: {{ port }}
    {% endfor %}
  {% endfor %}
cluster_type:
  name: envoy.clusters.redis
  typed_config:
    "@type": type.googleapis.com/google.protobuf.Struct
    value:
      cluster_refresh_rate: 5s
      cluster_refresh_timeout: 1s
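For anyone trying to reproduce this without the templating layer, the fragment above might render to something like the following. This is only a sketch: the cluster name, hostnames, and port are hypothetical placeholders, not values from the affected deployment, and the report does not say which hostname fails to resolve, so none of them is marked as the failing one.

```yaml
# Hypothetical rendering of the templated cluster fragment.
# cluster_name, addresses, and port_value are placeholders (assumptions).
connect_timeout: 0.25s
dns_lookup_family: V4_ONLY
load_assignment:
  cluster_name: redis-cluster            # placeholder for {{ instance.name }}
  endpoints:
  - lb_endpoints:
    - endpoint:
        address:
          socket_address:
            address: redis-node-0.example.internal   # placeholder hostname
            port_value: 6379                         # placeholder port
  - lb_endpoints:
    - endpoint:
        address:
          socket_address:
            address: redis-node-1.example.internal   # placeholder hostname
            port_value: 6379                         # placeholder port
cluster_type:
  name: envoy.clusters.redis
  typed_config:
    "@type": type.googleapis.com/google.protobuf.Struct
    value:
      cluster_refresh_rate: 5s
      cluster_refresh_timeout: 1s
```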
Logs:
Include the access logs and the Envoy logs.
Note: If there are privacy concerns, sanitize the data prior to
sharing.
Call Stack:
thread #1, name = 'envoy', stop reason = signal SIGSEGV
  * frame #0: 0x000025773f135800
    frame #1: 0x0000564c0c56e67b envoy`Envoy::Extensions::Clusters::Redis::RedisCluster::onClusterSlotUpdate(std::__1::shared_ptr<std::__1::vector<Envoy::Extensions::Clusters::Redis::ClusterSlot, std::__1::allocator<Envoy::Extensions::Clusters::Redis::ClusterSlot> > >&&) + 1435
    frame #2: 0x0000564c0c57c5ee envoy`std::__1::__function::__func<Envoy::Extensions::Clusters::Redis::RedisCluster::RedisDiscoverySession::resolveReplicas(std::__1::shared_ptr<std::__1::vector<Envoy::Extensions::Clusters::Redis::ClusterSlot, std::__1::allocator<Envoy::Extensions::Clusters::Redis::ClusterSlot> > >, unsigned long, std::__1::shared_ptr)::$_11, std::__1::allocator<Envoy::Extensions::Clusters::Redis::RedisCluster::RedisDiscoverySession::resolveReplicas(std::__1::shared_ptr<std::__1::vector<Envoy::Extensions::Clusters::Redis::ClusterSlot, std::__1::allocator<Envoy::Extensions::Clusters::Redis::ClusterSlot> > >, unsigned long, std::__1::shared_ptr)::$_11>, void (Envoy::Network::DnsResolver::ResolutionStatus, std::__1::basic_string_view<char, std::__1::char_traits<char> >, std::__1::list<Envoy::Network::DnsResponse, std::__1::allocator<Envoy::Network::DnsResponse> >&&)>::operator()(Envoy::Network::DnsResolver::ResolutionStatus&&, std::__1::basic_string_view<char, std::__1::char_traits<char> >&&, std::__1::list<Envoy::Network::DnsResponse, std::__1::allocator<Envoy::Network::DnsResponse> >&&) + 958
    frame #3: 0x0000564c0dd9af90 envoy`Envoy::Network::DnsResolverImpl::PendingResolution::finishResolve() + 2544
    frame #4: 0x0000564c0dd99cfa envoy`Envoy::Network::DnsResolverImpl::AddrInfoPendingResolution::onAresGetAddrInfoCallback(int, int, ares_addrinfo*) + 5370
    frame #5: 0x0000564c0dda6b3c envoy`end_hquery + 140
    frame #6: 0x0000564c0ddaeb73 envoy`qcallback + 19
    frame #7: 0x0000564c0dda52e1 envoy`ares_destroy + 97
    frame #8: 0x0000564c0dd979a2 envoy`Envoy::Network::DnsResolverImpl::~DnsResolverImpl() + 50
    frame #9: 0x0000564c0df95da5 envoy`Envoy::Server::InstanceBase::~InstanceBase() + 1669
    frame #10: 0x0000564c0df8e725 envoy`Envoy::Server::InstanceImpl::~InstanceImpl() + 101
    frame #11: 0x0000564c0df482c0 envoy`Envoy::StrippedMainBase::~StrippedMainBase() + 80
    frame #12: 0x0000564c0df49874 envoy`std::__1::default_delete<Envoy::MainCommon>::operator()(Envoy::MainCommon*) const + 84
    frame #13: 0x0000564c0df48c2d envoy`Envoy::MainCommon::main(int, char**, std::__1::function<void (Envoy::Server::Instance&)>) + 157
    frame #14: 0x0000564c0c4ec14c envoy`main + 44
    frame #15: 0x00007f72e9242d90 libc.so.6`___lldb_unnamed_symbol118$$libc.so.6 + 2192
    frame #16: 0x0000564c0c4ec000 envoy
thread #2, stop reason = signal 0
    frame #0: 0x00007f72e933788d libc.so.6`__libc_ifunc_impl_list + 5677
thread #3, stop reason = signal 0
    frame #0: 0x00007f72e933788d libc.so.6`__libc_ifunc_impl_list + 5677
thread #4, stop reason = signal 0
    frame #0: 0x00007f72e933788d libc.so.6`__libc_ifunc_impl_list + 5677
thread #5, stop reason = signal 0
    frame #0: 0x00007f72e933788d libc.so.6`__libc_ifunc_impl_list + 5677