-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Description
If you are reporting any crash or any potential security issue, do not
open an issue in this repo. Please report the issue via emailing
envoy-security@googlegroups.com where the issue will be triaged appropriately.
Title: MULTI EXEC causes crash during mirroring mode
Description:
I've found 2 issues, one of which will cause a segfault and crash Envoy, related to using MULTI EXEC in the redisproxy when using a mirror policy. When using MULTI EXEC in mirrored mode with reads mirrored, Envoy will crash. When using MULTI EXEC with reads not mirrored, the transaction will not be executed on the mirrored node(s).
Repro steps:
The way I reproduced this issue was to set up Envoy in a docker-compose along with a redis standalone node and a redis cluster. The redisproxy was configured with the standalone node as the primary with a request_mirror_policy to the redis cluster. I tested this with both exclude_read_commands:true and exclude_read_commands:false. The behavior of the Envoy proxy differed in both cases causing a crash when the reads were mirrored and simply not executing on the cluster when reads were not mirrored.
For testing this, I sent a couple of redis commands to the proxy:
SET baz 100
MULTI
SET baz 200
EXEC
Config:
request_mirror_policy set
Logs:
For exclude_read_commands:true, the debug logs looked like:
[36mproxy_1 | [0m [2023-03-24 15:13:30.577][17][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["set", "baz", "100"]'
[36mproxy_1 | [0m [2023-03-24 15:13:30.578][17][debug][connection] [source/common/network/connection_impl.cc:939] [C8] connecting to 172.20.0.20:6379
[36mproxy_1 | [0m [2023-03-24 15:13:30.578][17][debug][connection] [source/common/network/connection_impl.cc:958] [C8] connection in progress
[36mproxy_1 | [0m [2023-03-24 15:13:30.578][17][debug][connection] [source/common/network/connection_impl.cc:939] [C9] connecting to 172.20.0.31:6373
[36mproxy_1 | [0m [2023-03-24 15:13:30.579][17][debug][connection] [source/common/network/connection_impl.cc:958] [C9] connection in progress
[36mproxy_1 | [0m [2023-03-24 15:13:30.580][17][debug][connection] [source/common/network/connection_impl.cc:688] [C8] connected
[36mproxy_1 | [0m [2023-03-24 15:13:30.580][17][debug][connection] [source/common/network/connection_impl.cc:688] [C9] connected
[36mproxy_1 | [0m [2023-03-24 15:13:33.025][17][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["multi"]'
[36mproxy_1 | [0m [2023-03-24 15:13:36.953][17][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["set", "baz", "200"]'
[36mproxy_1 | [0m [2023-03-24 15:13:36.953][17][debug][connection] [source/common/network/connection_impl.cc:939] [C10] connecting to 172.20.0.20:6379
[36mproxy_1 | [0m [2023-03-24 15:13:36.953][17][debug][connection] [source/common/network/connection_impl.cc:958] [C10] connection in progress
[36mproxy_1 | [0m [2023-03-24 15:13:36.954][17][debug][connection] [source/common/network/connection_impl.cc:688] [C10] connected
[36mproxy_1 | [0m [2023-03-24 15:13:38.626][17][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["exec"]'
[36mproxy_1 | [0m [2023-03-24 15:13:38.629][17][debug][connection] [source/common/network/connection_impl.cc:139] [C10] closing data_to_write=0 type=1
[36mproxy_1 | [0m [2023-03-24 15:13:38.630][17][debug][connection] [source/common/network/connection_impl.cc:250] [C10] closing socket: 1
[36mproxy_1 | [0m [2023-03-24 15:13:41.876][17][debug][connection] [source/common/network/connection_impl.cc:656] [C6] remote close
[36mproxy_1 | [0m [2023-03-24 15:13:41.876][17][debug][connection] [source/common/network/connection_impl.cc:250] [C6] closing socket: 0
For this case, it appears that the proxy did not try to send the MULTI EXEC to the cluster (172.20.0.31:6373) as it did for the simple SET command. From a quick look at the source code in https://github.com/envoyproxy/envoy/blob/main/source/extensions/filters/network/common/redis/supported_commands.h#L86-L95, I don't think that MULTI is being treated as a write command and is excluded from the mirroring.
For exclude_read_commands:false, the debug logs looked like:
[36mproxy_1 | [0m [2023-03-24 01:26:22.269][16][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["set", "baz", "100"]'
[36mproxy_1 | [0m [2023-03-24 01:26:22.270][16][debug][connection] [source/common/network/connection_impl.cc:939] [C6] connecting to 172.20.0.20:6379
[36mproxy_1 | [0m [2023-03-24 01:26:22.272][16][debug][connection] [source/common/network/connection_impl.cc:958] [C6] connection in progress
[36mproxy_1 | [0m [2023-03-24 01:26:22.274][16][debug][connection] [source/common/network/connection_impl.cc:939] [C7] connecting to 172.20.0.31:6373
[36mproxy_1 | [0m [2023-03-24 01:26:22.274][16][debug][connection] [source/common/network/connection_impl.cc:958] [C7] connection in progress
[36mproxy_1 | [0m [2023-03-24 01:26:22.274][16][debug][connection] [source/common/network/connection_impl.cc:688] [C6] connected
[36mproxy_1 | [0m [2023-03-24 01:26:22.275][16][debug][connection] [source/common/network/connection_impl.cc:688] [C7] connected
[36mproxy_1 | [0m [2023-03-24 01:26:32.267][1][debug][connection] [source/common/network/connection_impl.cc:939] [C8] connecting to 172.20.0.32:6374
[36mproxy_1 | [0m [2023-03-24 01:26:32.268][1][debug][connection] [source/common/network/connection_impl.cc:958] [C8] connection in progress
[36mproxy_1 | [0m [2023-03-24 01:26:32.268][1][debug][connection] [source/common/network/connection_impl.cc:688] [C8] connected
[36mproxy_1 | [0m [2023-03-24 01:26:51.155][16][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["multi"]'
[36mproxy_1 | [0m [2023-03-24 01:26:56.086][16][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["set", "baz", "200"]'
[36mproxy_1 | [0m [2023-03-24 01:26:56.087][16][debug][connection] [source/common/network/connection_impl.cc:939] [C9] connecting to 172.20.0.20:6379
[36mproxy_1 | [0m [2023-03-24 01:26:56.088][16][debug][connection] [source/common/network/connection_impl.cc:958] [C9] connection in progress
[36mproxy_1 | [0m [2023-03-24 01:26:56.089][16][debug][connection] [source/common/network/connection_impl.cc:688] [C9] connected
[36mproxy_1 | [0m [2023-03-24 01:26:57.283][1][debug][connection] [source/common/network/connection_impl.cc:939] [C10] connecting to 172.20.0.33:6375
[36mproxy_1 | [0m [2023-03-24 01:26:57.284][1][debug][connection] [source/common/network/connection_impl.cc:958] [C10] connection in progress
[36mproxy_1 | [0m [2023-03-24 01:26:57.289][1][debug][connection] [source/common/network/connection_impl.cc:688] [C10] connected
[36mproxy_1 | [0m [2023-03-24 01:26:57.890][16][debug][redis] [source/extensions/filters/network/redis_proxy/command_splitter_impl.cc:663] splitting '["exec"]'
[36mproxy_1 | [0m [2023-03-24 01:26:57.894][16][debug][connection] [source/common/network/connection_impl.cc:139] [C9] closing data_to_write=0 type=1
[36mproxy_1 | [0m [2023-03-24 01:26:57.894][16][debug][connection] [source/common/network/connection_impl.cc:250] [C9] closing socket: 1
[36mproxy_1 | [0m [2023-03-24 01:26:57.894][16][critical][assert] [source/common/stats/symbol_table.cc:487] assert failure: search != decode_map_.end(). Details: no such symbol
[36mproxy_1 | [0m [2023-03-24 01:26:57.895][16][critical][backtrace] [./source/server/backtrace.h:104] Caught Aborted, suspect faulting address 0x6500000001
[36mproxy_1 | [0m [2023-03-24 01:26:57.895][16][critical][backtrace] [./source/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[36mproxy_1 | [0m [2023-03-24 01:26:57.895][16][critical][backtrace] [./source/server/backtrace.h:92] Envoy version: 9184b84/1.25.0/Clean/RELEASE/BoringSSL
[36mproxy_1 | [0m [2023-03-24 01:26:57.895][16][critical][backtrace] [./source/server/backtrace.h:96] #0: kernel_rt_sigreturn [0xffff99940870]
[36mproxy_1 | [0m [2023-03-24 01:26:57.896][16][critical][backtrace] [./source/server/backtrace.h:96] #1: abort [0xffff996b5aac]
[36mproxy_1 | [0m [2023-03-24 01:26:57.896][16][critical][backtrace] [./source/server/backtrace.h:98] #2: [0xaaaac109eb5c]
[36mproxy_1 | [0m [2023-03-24 01:26:57.896][16][critical][backtrace] [./source/server/backtrace.h:98] #3: [0xaaaac10a0734]
[36mproxy_1 | [0m [2023-03-24 01:26:57.896][16][critical][backtrace] [./source/server/backtrace.h:98] #4: [0xaaaac109cc10]
[36mproxy_1 | [0m [2023-03-24 01:26:57.896][16][critical][backtrace] [./source/server/backtrace.h:98] #5: [0xaaaac109cf80]
[36mproxy_1 | [0m [2023-03-24 01:26:57.897][16][critical][backtrace] [./source/server/backtrace.h:98] #6: [0xaaaac109d9e8]
[36mproxy_1 | [0m [2023-03-24 01:26:57.897][16][critical][backtrace] [./source/server/backtrace.h:98] #7: [0xaaaac079566c]
[36mproxy_1 | [0m [2023-03-24 01:26:57.897][16][critical][backtrace] [./source/server/backtrace.h:98] #8: [0xaaaac0793c18]
[36mproxy_1 | [0m [2023-03-24 01:26:57.898][16][critical][backtrace] [./source/server/backtrace.h:98] #9: [0xaaaac0793790]
[36mproxy_1 | [0m [2023-03-24 01:26:57.898][16][critical][backtrace] [./source/server/backtrace.h:98] #10: [0xaaaac109bf14]
[36mproxy_1 | [0m [2023-03-24 01:26:57.904][16][critical][backtrace] [./source/server/backtrace.h:98] #11: [0xaaaabf7f20e4]
[36mproxy_1 | [0m [2023-03-24 01:26:57.905][16][critical][backtrace] [./source/server/backtrace.h:98] #12: [0xaaaabf7e970c]
[36mproxy_1 | [0m [2023-03-24 01:26:57.905][16][critical][backtrace] [./source/server/backtrace.h:98] #13: [0xaaaabf7f012c]
[36mproxy_1 | [0m [2023-03-24 01:26:57.906][16][critical][backtrace] [./source/server/backtrace.h:98] #14: [0xaaaabf7ef414]
[36mproxy_1 | [0m [2023-03-24 01:26:57.906][16][critical][backtrace] [./source/server/backtrace.h:98] #15: [0xaaaabf7e9288]
[36mproxy_1 | [0m [2023-03-24 01:26:57.906][16][critical][backtrace] [./source/server/backtrace.h:98] #16: [0xaaaabf7ea244]
[36mproxy_1 | [0m [2023-03-24 01:26:57.907][16][critical][backtrace] [./source/server/backtrace.h:98] #17: [0xaaaac0e64bc4]
[36mproxy_1 | [0m [2023-03-24 01:26:57.907][16][critical][backtrace] [./source/server/backtrace.h:98] #18: [0xaaaac0dfe928]
[36mproxy_1 | [0m [2023-03-24 01:26:57.907][16][critical][backtrace] [./source/server/backtrace.h:98] #19: [0xaaaac0dfca24]
[36mproxy_1 | [0m [2023-03-24 01:26:57.907][16][critical][backtrace] [./source/server/backtrace.h:98] #20: [0xaaaac0df31f0]
[36mproxy_1 | [0m [2023-03-24 01:26:57.907][16][critical][backtrace] [./source/server/backtrace.h:98] #21: [0xaaaac0df4320]
[36mproxy_1 | [0m [2023-03-24 01:26:57.908][16][critical][backtrace] [./source/server/backtrace.h:98] #22: [0xaaaac1123d54]
[36mproxy_1 | [0m [2023-03-24 01:26:57.908][16][critical][backtrace] [./source/server/backtrace.h:98] #23: [0xaaaac11225f4]
[36mproxy_1 | [0m [2023-03-24 01:26:57.908][16][critical][backtrace] [./source/server/backtrace.h:98] #24: [0xaaaac077e1d4]
[36mproxy_1 | [0m [2023-03-24 01:26:57.908][16][critical][backtrace] [./source/server/backtrace.h:98] #25: [0xaaaac1129e78]
[36mproxy_1 | [0m [2023-03-24 01:26:57.909][16][critical][backtrace] [./source/server/backtrace.h:96] #26: start_thread [0xffff9980f624]
[36mproxy_1 | [0m ConnectionImpl 0x14843f3dcd80, connecting: 0, bind_error: 0, state(): Closed, read_buffer_limit_: 1048576
[36mproxy_1 | [0m socket_:
[36mproxy_1 | [0m ListenSocketImpl 0x14843f388400, transport_protocol_:
[36mproxy_1 | [0m connection_info_provider_:
[36mproxy_1 | [0m ConnectionInfoSetterImpl 0x14843f36fd58, remote_address_: 172.20.0.20:6379, direct_remote_address_: 172.20.0.20:6379, local_address_: 172.20.0.10:49342, server_name_:
[36menvoy_proxy_1 exited with code 133
[0m
I was not able to debug this very well as I could not get Envoy to build on my machine. I also suspect that if the MULTI EXEC is treated as a write command and is sent in the 1st scenario, we would see this error there as well.
Call Stack:
If the Envoy binary is crashing, a call stack is required.
Should be included above