Implement proper partition size detection. #15

kacf · 2016-01-20T13:34:01Z

We need to go outside the comfort zone and do some low-level OS
probing with syscall and unsafe pointers to achieve this.


kacf · 2016-01-20T13:50:00Z

@maciejmrowiec: Now it fails because go vet says "possible misuse of unsafe.Pointer". I know why it's complaining, and it's because the syscall API is coded in a stupid way. But it's the only way to use it, so we should probably turn this warning off in the testing with -unsafeptr=false.

maciejmrowiec · 2016-01-20T14:26:56Z

From the syscall package documentation:

NOTE: This package is locked down. Code outside the standard Go repository should be migrated to use the corresponding package in the golang.org/x/sys repository.
That is also where updates required by new systems or versions should be applied. See https://golang.org/s/go1.4-syscall for more information.

maciejmrowiec · 2016-01-20T14:28:26Z

https://golang.org/pkg/syscall/

kacf · 2016-01-21T06:54:29Z

I know, but it's the exact same interface. The syscall package is not deprecated; "locked down" just means that they are not accepting any more specialized syscalls. But we are not using those anyway, we are using the generic one. I could migrate it, but I don't really see the point. It won't help with this error message, and it just means we get an extra "go get" dependency.

maciejmrowiec · 2016-01-21T09:46:15Z

@kacf
Right, I'll add that exception to the build system.
Stay tuned!

kacf · 2016-01-21T09:54:20Z

Thanks!

maciejmrowiec · 2016-01-21T10:02:08Z

#16

maciejmrowiec · 2016-01-21T10:52:19Z

@kacf rebase to get the new config

kacf · 2016-01-22T13:21:36Z

Something buggy with this PR. Opening a new one.

Docs & travis

Once in a while, in release mode only, this test will display this symptom:

```
...
record_id=163 severity=trace time="2023-Oct-03 16:22:53.911616" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=164 severity=trace time="2023-Oct-03 16:22:53.911802" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=165 severity=warning time="2023-Oct-03 16:22:53.912043" name="http_client" url="http://127.0.0.1:8001" msg="Client destroyed while request is still active!"
[ OK ] HttpTest.TestResponseBody (202 ms)
[----------] 1 test from HttpTest (202 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (202 ms total)
[ PASSED ] 1 test.
corrupted double-linked list
Aborted (core dumped)
```

The backtrace reveals that it happens at the very very end, when exit handlers are called:

```
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=139805181667136) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=139805181667136, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007f26ee375476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007f26ee35b7f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007f26ee3bc6f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f26ee50eb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6 0x00007f26ee3d3d7c in malloc_printerr (str=str@entry=0x7f26ee50c72e "corrupted double-linked list") at ./malloc/malloc.c:5664
#7 0x00007f26ee3d484c in unlink_chunk (p=<optimized out>, av=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:1635
#8 0x00007f26ee3d49e9 in malloc_consolidate (av=av@entry=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:4780
#9 0x00007f26ee3d5f20 in _int_free (av=0x7f26ee54cc80 <main_arena>, p=0x561b9a7adae0, have_lock=<optimized out>) at ./malloc/malloc.c:4674
#10 0x00007f26ee3d84d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
#11 0x00007f26eeb2017d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
#12 0x00007f26eeb44d0d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
#13 0x00007f26eeb1b1d5 in CRYPTO_free_ex_data () from /lib/x86_64-linux-gnu/libcrypto.so.3
#14 0x00007f26eeb13d1f in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
#15 0x00007f26eeb1d929 in OPENSSL_cleanup () from /lib/x86_64-linux-gnu/libcrypto.so.3
#16 0x00007f26ee378495 in __run_exit_handlers (status=0, listp=0x7f26ee54c838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:113
#17 0x00007f26ee378610 in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:143
#18 0x00007f26ee35cd97 in __libc_start_call_main (main=main@entry=0x561b9a0c0f70 <main(int, char**)>, argc=argc@entry=2, argv=argv@entry=0x7ffe48d637c8) at ../sysdeps/nptl/libc_start_call_main.h:74
#19 0x00007f26ee35ce40 in __libc_start_main_impl (main=0x561b9a0c0f70 <main(int, char**)>, argc=2, argv=0x7ffe48d637c8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe48d637b8) at ../csu/libc-start.c:392
#20 0x0000561b9a0c1a35 in _start ()
```

It is unknown what causes the corruption, and the problem only happens in release mode with sanitizers disabled, so it's very hard to investigate. But although the root cause isn't known, it's believed to happen when the body has not been completely consumed, and the program exits.

Since this "don't-consume -> then exit" scenario is very unlikely in production, work around it by making sure both handlers have run before exiting, instead of only one of them. I tested this for hundreds of runs, and it worked. Previously it would fail every 15-30 runs or so.

This also has the added benefit of not accidentally skipping the test conditionals inside the body handler.

Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech>

Implement proper partition size detection.

530f13c

We need to go outside the comfort zone and do some low-level OS probing with syscall and unsafe pointers to achieve this.

kacf closed this Jan 22, 2016

kacf mentioned this pull request Jan 22, 2016

Implement proper partition size detection. #25

Merged

GregorioDiStefano pushed a commit that referenced this pull request Jun 20, 2016

Merge pull request #15 from maciejmrowiec/master

b5e40f8

Docs & travis
