Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More work on image deployment #12

Merged
merged 6 commits into from
Jan 19, 2016
Merged

More work on image deployment #12

merged 6 commits into from
Jan 19, 2016

Conversation

kacf
Copy link
Member

@kacf kacf commented Jan 18, 2016

No description provided.

@kacf
Copy link
Member Author

kacf commented Jan 18, 2016

@pasinskim, @maciejmrowiec: If you want to take a look.

@kacf
Copy link
Member Author

kacf commented Jan 18, 2016

@maciejmrowiec: I don't understand what's wrong? https://travis-ci.com/mendersoftware/mender/builds/19939337

@maciejmrowiec
Copy link
Contributor

you can just this one for reference: https://github.com/mendersoftware/artifacts

(Almost) the same as using "static" in C.
Also add the testRunnerMulti to test for several commands in a row.
For this it was necessary to extend the testRunner framework to also
check command lines, since we need to make sure it configures U-Boot
correctly.

Also fix a bug in the testRunnerMulti version where the list was never
exhausted because we were modifying a copy.
- Switch away from the mock based partition testing and do the mocking
  at the command level instead.
- Add lots of tests to get the coverage up.
maciejmrowiec added a commit that referenced this pull request Jan 19, 2016
More work on image deployment
@maciejmrowiec maciejmrowiec merged commit 22b93bc into mendersoftware:master Jan 19, 2016
@kacf kacf deleted the rootfs_development branch January 20, 2016 06:52
GregorioDiStefano pushed a commit that referenced this pull request Jun 20, 2016
Port POC with some cleanups
kacf added a commit that referenced this pull request Mar 21, 2019
Document how to opt-out of LZMA Artifact compression
kacf added a commit that referenced this pull request Oct 4, 2023
Once in a while, in release mode only, this test will display this
symptom:

```
...
record_id=163 severity=trace time="2023-Oct-03 16:22:53.911616" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=164 severity=trace time="2023-Oct-03 16:22:53.911802" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=165 severity=warning time="2023-Oct-03 16:22:53.912043" name="http_client" url="http://127.0.0.1:8001" msg="Client destroyed while request is still active!"
[       OK ] HttpTest.TestResponseBody (202 ms)
[----------] 1 test from HttpTest (202 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (202 ms total)
[  PASSED  ] 1 test.
corrupted double-linked list
Aborted (core dumped)
```

The backtrace reveals that it happens at the very very end, when exit
handlers are called:

```
 Program terminated with signal SIGABRT, Aborted.
 #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:44
 44	./nptl/pthread_kill.c: No such file or directory.
 (gdb) bt
 #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:44
 #1  __pthread_kill_internal (signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:78
 #2  __GI___pthread_kill (threadid=139805181667136, signo=signo@entry=6)
     at ./nptl/pthread_kill.c:89
 #3  0x00007f26ee375476 in __GI_raise (sig=sig@entry=6)
     at ../sysdeps/posix/raise.c:26
 #4  0x00007f26ee35b7f3 in __GI_abort () at ./stdlib/abort.c:79
 #5  0x00007f26ee3bc6f6 in __libc_message (action=action@entry=do_abort,
     fmt=fmt@entry=0x7f26ee50eb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
 #6  0x00007f26ee3d3d7c in malloc_printerr (
     str=str@entry=0x7f26ee50c72e "corrupted double-linked list")
     at ./malloc/malloc.c:5664
 #7  0x00007f26ee3d484c in unlink_chunk (p=<optimized out>,
     av=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:1635
 #8  0x00007f26ee3d49e9 in malloc_consolidate (
     av=av@entry=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:4780
 #9  0x00007f26ee3d5f20 in _int_free (av=0x7f26ee54cc80 <main_arena>,
     p=0x561b9a7adae0, have_lock=<optimized out>) at ./malloc/malloc.c:4674
 #10 0x00007f26ee3d84d3 in __GI___libc_free (mem=<optimized out>)
     at ./malloc/malloc.c:3391
 #11 0x00007f26eeb2017d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #12 0x00007f26eeb44d0d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #13 0x00007f26eeb1b1d5 in CRYPTO_free_ex_data ()
    from /lib/x86_64-linux-gnu/libcrypto.so.3
 #14 0x00007f26eeb13d1f in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #15 0x00007f26eeb1d929 in OPENSSL_cleanup ()
    from /lib/x86_64-linux-gnu/libcrypto.so.3
 #16 0x00007f26ee378495 in __run_exit_handlers (status=0,
     listp=0x7f26ee54c838 <__exit_funcs>,
     run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
     at ./stdlib/exit.c:113
 #17 0x00007f26ee378610 in __GI_exit (status=<optimized out>)
     at ./stdlib/exit.c:143
 #18 0x00007f26ee35cd97 in __libc_start_call_main (
     main=main@entry=0x561b9a0c0f70 <main(int, char**)>, argc=argc@entry=2,
     argv=argv@entry=0x7ffe48d637c8)
     at ../sysdeps/nptl/libc_start_call_main.h:74
 #19 0x00007f26ee35ce40 in __libc_start_main_impl (
     main=0x561b9a0c0f70 <main(int, char**)>, argc=2, argv=0x7ffe48d637c8,
     init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
     stack_end=0x7ffe48d637b8) at ../csu/libc-start.c:392
 #20 0x0000561b9a0c1a35 in _start ()
```

It is unknown what causes the corruption, and the problem only happens
in release mode with sanitizers disabled, so it's very hard to
investigate. But although the root cause isn't known, it's believed to
happen when the body has not been completely consumed, and the program
exits. Since this "don't-consume -> then exit" scenario is very
unlikely in production, work around it by making sure both handlers
have run before exiting, instead of only one of them. I tested this
for hundreds of runs, and it worked. Previously it would fail every
15-30 runs or so.

This also has the added benefit of not accidentally skipping the test
conditionals inside the body handler.

Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants