Skip to content

Commit

Permalink
test: Work around malloc corruption bug during OpenSSL cleanup.
Browse files Browse the repository at this point in the history
Once in a while, in release mode only, this test will display this
symptom:

```
...
record_id=163 severity=trace time="2023-Oct-03 16:22:53.911616" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=164 severity=trace time="2023-Oct-03 16:22:53.911802" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=165 severity=warning time="2023-Oct-03 16:22:53.912043" name="http_client" url="http://127.0.0.1:8001" msg="Client destroyed while request is still active!"
[       OK ] HttpTest.TestResponseBody (202 ms)
[----------] 1 test from HttpTest (202 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (202 ms total)
[  PASSED  ] 1 test.
corrupted double-linked list
Aborted (core dumped)
```

The backtrace reveals that it happens at the very very end, when exit
handlers are called:

```
 Program terminated with signal SIGABRT, Aborted.
 #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:44
 44	./nptl/pthread_kill.c: No such file or directory.
 (gdb) bt
 #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:44
 #1  __pthread_kill_internal (signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:78
 #2  __GI___pthread_kill (threadid=139805181667136, signo=signo@entry=6)
     at ./nptl/pthread_kill.c:89
 #3  0x00007f26ee375476 in __GI_raise (sig=sig@entry=6)
     at ../sysdeps/posix/raise.c:26
 #4  0x00007f26ee35b7f3 in __GI_abort () at ./stdlib/abort.c:79
 #5  0x00007f26ee3bc6f6 in __libc_message (action=action@entry=do_abort,
     fmt=fmt@entry=0x7f26ee50eb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
 #6  0x00007f26ee3d3d7c in malloc_printerr (
     str=str@entry=0x7f26ee50c72e "corrupted double-linked list")
     at ./malloc/malloc.c:5664
 #7  0x00007f26ee3d484c in unlink_chunk (p=<optimized out>,
     av=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:1635
 #8  0x00007f26ee3d49e9 in malloc_consolidate (
     av=av@entry=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:4780
 #9  0x00007f26ee3d5f20 in _int_free (av=0x7f26ee54cc80 <main_arena>,
     p=0x561b9a7adae0, have_lock=<optimized out>) at ./malloc/malloc.c:4674
 #10 0x00007f26ee3d84d3 in __GI___libc_free (mem=<optimized out>)
     at ./malloc/malloc.c:3391
 #11 0x00007f26eeb2017d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #12 0x00007f26eeb44d0d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #13 0x00007f26eeb1b1d5 in CRYPTO_free_ex_data ()
    from /lib/x86_64-linux-gnu/libcrypto.so.3
 #14 0x00007f26eeb13d1f in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #15 0x00007f26eeb1d929 in OPENSSL_cleanup ()
    from /lib/x86_64-linux-gnu/libcrypto.so.3
 #16 0x00007f26ee378495 in __run_exit_handlers (status=0,
     listp=0x7f26ee54c838 <__exit_funcs>,
     run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
     at ./stdlib/exit.c:113
 #17 0x00007f26ee378610 in __GI_exit (status=<optimized out>)
     at ./stdlib/exit.c:143
 #18 0x00007f26ee35cd97 in __libc_start_call_main (
     main=main@entry=0x561b9a0c0f70 <main(int, char**)>, argc=argc@entry=2,
     argv=argv@entry=0x7ffe48d637c8)
     at ../sysdeps/nptl/libc_start_call_main.h:74
 #19 0x00007f26ee35ce40 in __libc_start_main_impl (
     main=0x561b9a0c0f70 <main(int, char**)>, argc=2, argv=0x7ffe48d637c8,
     init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
     stack_end=0x7ffe48d637b8) at ../csu/libc-start.c:392
 #20 0x0000561b9a0c1a35 in _start ()
```

It is unknown what causes the corruption, and the problem only happens
in release mode with sanitizers disabled, so it's very hard to
investigate. But although the root cause isn't known, it's believed to
happen when the body has not been completely consumed, and the program
exits. Since this "don't-consume -> then exit" scenario is very
unlikely in production, work around it by making sure both handlers
have run before exiting, instead of only one of them. I tested this
for hundreds of runs, and it worked. Previously it would fail every
15-30 runs or so.

This also has the added benefit of not accidentally skipping the test
conditionals inside the body handler.

Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech>
  • Loading branch information
kacf committed Oct 3, 2023
1 parent e1dadd4 commit db9ddc7
Showing 1 changed file with 11 additions and 4 deletions.
15 changes: 11 additions & 4 deletions common/http_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -591,6 +591,8 @@ TEST(HttpTest, TestMissingRequestBody) {
}

TEST(HttpTest, TestResponseBody) {
int count = 0;

TestEventLoop loop;

http::ServerConfig server_config;
Expand All @@ -601,7 +603,7 @@ TEST(HttpTest, TestResponseBody) {
[](http::ExpectedIncomingRequestPtr exp_req) {
ASSERT_TRUE(exp_req) << exp_req.error().String();
},
[&loop](http::ExpectedIncomingRequestPtr exp_req) {
[&loop, &count](http::ExpectedIncomingRequestPtr exp_req) {
ASSERT_TRUE(exp_req) << exp_req.error().String();

auto result = exp_req.value()->MakeResponse();
Expand All @@ -611,9 +613,11 @@ TEST(HttpTest, TestResponseBody) {
resp->SetHeader("Content-Length", to_string(BodyOfXes::TARGET_BODY_SIZE));
resp->SetBodyReader(make_shared<BodyOfXes>());
resp->SetStatusCodeAndMessage(200, "Success");
resp->AsyncReply([&loop](error::Error err) {
resp->AsyncReply([&loop, &count](error::Error err) {
ASSERT_EQ(error::NoError, err);
loop.Stop();
if (++count >= 2) {
loop.Stop();
}
});
});

Expand All @@ -636,7 +640,7 @@ TEST(HttpTest, TestResponseBody) {
body_writer->SetUnlimited(true);
resp->SetBodyWriter(body_writer);
},
[&received_body](http::ExpectedIncomingResponsePtr exp_resp) {
[&received_body, &loop, &count](http::ExpectedIncomingResponsePtr exp_resp) {
ASSERT_TRUE(exp_resp) << exp_resp.error().String();

vector<uint8_t> expected_body;
Expand All @@ -652,6 +656,9 @@ TEST(HttpTest, TestResponseBody) {
received_body.begin(), received_body.end(), expected_body.begin())
.first
- received_body.begin());
if (++count >= 2) {
loop.Stop();
}
});

loop.Run();
Expand Down

0 comments on commit db9ddc7

Please sign in to comment.