Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement proper partition size detection. #15

Closed
wants to merge 1 commit into from
Closed

Implement proper partition size detection. #15

wants to merge 1 commit into from

Conversation

kacf
Copy link
Member

@kacf kacf commented Jan 20, 2016

We need to go outside the comfort zone and do some low level OS
probing with syscall and unsafe pointers to achieve this.

We need to go outside the comfort zone and do some low level OS
probing with syscall and unsafe pointers to achieve this.
@kacf
Copy link
Member Author

kacf commented Jan 20, 2016

@maciejmrowiec: Now it fails because go vet says "possible misuse of unsafe.Pointer". I know why it's complaining, and it's because the syscall API is coded in a stupid way. But it's the only way to use it, so we should probably turn this warning off in the testing with -unsafeptr=false.

@maciejmrowiec
Copy link
Contributor

from syscall package documentation:

NOTE: This package is locked down. Code outside the standard Go repository should be migrated to use the corresponding package in the golang.org/x/sys repository.
That is also where updates required by new systems or versions should be applied. See https://golang.org/s/go1.4-syscall for more information.

@maciejmrowiec
Copy link
Contributor

@kacf
Copy link
Member Author

kacf commented Jan 21, 2016

I know, but it's the exact same interface. The syscall module is not deprecated, locked down just means that they are not accepting any more specialized syscalls. But we are not using those anyway, we are using the generic one. I could migrate it, but I don't really see the point. It won't help for this error message, and it just means we get an extra "go get" dependency.

@maciejmrowiec
Copy link
Contributor

@kacf
Right, I'll add that exception to the build system.
Stay tuned!

@kacf
Copy link
Member Author

kacf commented Jan 21, 2016

Thanks!

@maciejmrowiec
Copy link
Contributor

#16

@maciejmrowiec
Copy link
Contributor

@kacf rebase to get the new config

@kacf
Copy link
Member Author

kacf commented Jan 22, 2016

Something buggy with this PR. Opening a new one.

@kacf kacf closed this Jan 22, 2016
GregorioDiStefano pushed a commit that referenced this pull request Jun 20, 2016
kacf added a commit that referenced this pull request Oct 4, 2023
Once in a while, in release mode only, this test will display this
symptom:

```
...
record_id=163 severity=trace time="2023-Oct-03 16:22:53.911616" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=164 severity=trace time="2023-Oct-03 16:22:53.911802" name="http_client" url="http://127.0.0.1:8001" msg="Read 16384 bytes of body data from stream."
record_id=165 severity=warning time="2023-Oct-03 16:22:53.912043" name="http_client" url="http://127.0.0.1:8001" msg="Client destroyed while request is still active!"
[       OK ] HttpTest.TestResponseBody (202 ms)
[----------] 1 test from HttpTest (202 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (202 ms total)
[  PASSED  ] 1 test.
corrupted double-linked list
Aborted (core dumped)
```

The backtrace reveals that it happens at the very very end, when exit
handlers are called:

```
 Program terminated with signal SIGABRT, Aborted.
 #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:44
 44	./nptl/pthread_kill.c: No such file or directory.
 (gdb) bt
 #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:44
 #1  __pthread_kill_internal (signo=6, threadid=139805181667136)
     at ./nptl/pthread_kill.c:78
 #2  __GI___pthread_kill (threadid=139805181667136, signo=signo@entry=6)
     at ./nptl/pthread_kill.c:89
 #3  0x00007f26ee375476 in __GI_raise (sig=sig@entry=6)
     at ../sysdeps/posix/raise.c:26
 #4  0x00007f26ee35b7f3 in __GI_abort () at ./stdlib/abort.c:79
 #5  0x00007f26ee3bc6f6 in __libc_message (action=action@entry=do_abort,
     fmt=fmt@entry=0x7f26ee50eb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
 #6  0x00007f26ee3d3d7c in malloc_printerr (
     str=str@entry=0x7f26ee50c72e "corrupted double-linked list")
     at ./malloc/malloc.c:5664
 #7  0x00007f26ee3d484c in unlink_chunk (p=<optimized out>,
     av=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:1635
 #8  0x00007f26ee3d49e9 in malloc_consolidate (
     av=av@entry=0x7f26ee54cc80 <main_arena>) at ./malloc/malloc.c:4780
 #9  0x00007f26ee3d5f20 in _int_free (av=0x7f26ee54cc80 <main_arena>,
     p=0x561b9a7adae0, have_lock=<optimized out>) at ./malloc/malloc.c:4674
 #10 0x00007f26ee3d84d3 in __GI___libc_free (mem=<optimized out>)
     at ./malloc/malloc.c:3391
 #11 0x00007f26eeb2017d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #12 0x00007f26eeb44d0d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #13 0x00007f26eeb1b1d5 in CRYPTO_free_ex_data ()
    from /lib/x86_64-linux-gnu/libcrypto.so.3
 #14 0x00007f26eeb13d1f in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3
 #15 0x00007f26eeb1d929 in OPENSSL_cleanup ()
    from /lib/x86_64-linux-gnu/libcrypto.so.3
 #16 0x00007f26ee378495 in __run_exit_handlers (status=0,
     listp=0x7f26ee54c838 <__exit_funcs>,
     run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
     at ./stdlib/exit.c:113
 #17 0x00007f26ee378610 in __GI_exit (status=<optimized out>)
     at ./stdlib/exit.c:143
 #18 0x00007f26ee35cd97 in __libc_start_call_main (
     main=main@entry=0x561b9a0c0f70 <main(int, char**)>, argc=argc@entry=2,
     argv=argv@entry=0x7ffe48d637c8)
     at ../sysdeps/nptl/libc_start_call_main.h:74
 #19 0x00007f26ee35ce40 in __libc_start_main_impl (
     main=0x561b9a0c0f70 <main(int, char**)>, argc=2, argv=0x7ffe48d637c8,
     init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
     stack_end=0x7ffe48d637b8) at ../csu/libc-start.c:392
 #20 0x0000561b9a0c1a35 in _start ()
```

It is unknown what causes the corruption, and the problem only happens
in release mode with sanitizers disabled, so it's very hard to
investigate. But although the root cause isn't known, it's believed to
happen when the body has not been completely consumed, and the program
exits. Since this "don't-consume -> then exit" scenario is very
unlikely in production, work around it by making sure both handlers
have run before exiting, instead of only one of them. I tested this
for hundreds of runs, and it worked. Previously it would fail every
15-30 runs or so.

This also has the added benefit of not accidentally skipping the test
conditionals inside the body handler.

Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants