
Handle memory allocation failures gracefully #155

Merged
merged 22 commits into UpstandingHackers:master from fix-alloc-failures
Dec 6, 2015

Conversation

pesco

@pesco pesco commented Oct 30, 2015

I'm opening this as an early request for comments.

Most code in hammer is written as if memory allocations could never fail. In particular, the result of h_arena_malloc isn't checked anywhere. This only works if the OS will always overcommit (e.g. Linux sysctl vm.overcommit_memory=1) and there are no resource limits in place (e.g. ulimit -v).

Issue #133 reported basically the same issue but focused on the system_allocator. PR #136 purports to fix it but ignores the CF backends and adds more potential NULL pointers to arena allocator code. It did add some longjmp-based "exception" handling to RVM compilation, though.

I propose to extend the longjmp approach to arena allocations such that h_arena_malloc never returns on failure but either performs a longjmp or exits. A longjmp target is set up and attached to the HArena data structure in a top-level function and low-level code can stay as it is. I've implemented this for the regex backend to demonstrate.

In addition, unchecked call sites of h_new, mm__->alloc, h_new_arena, and the like have to be identified and NULL handling added where appropriate.

@pesco
Author

pesco commented Oct 31, 2015

Some Travis builds fail on the following warning:

$ gcc --version
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
[...]
gcc -o build/opt/src/backends/regex.o -c -std=gnu99 -Wall -Wextra -Werror -Wno-unused-parameter -Wno-attributes -Wno-unused-variable -O3 build/opt/src/backends/regex.c
build/opt/src/backends/regex.c: In function 'h_rvm_run__m':
build/opt/src/backends/regex.c:55:12: error: variable 'heads_n' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
build/opt/src/backends/regex.c:56:6: error: variable 'heads_p' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]

This does not appear on my workstation with GCC 5.2.1 and it only seems to happen when the BINDINGS environment variable is set. My guess is that something in the bindings pre-install process (where some apt-get install calls happen) silently upgrades or somehow changes GCC.

The cause of the warning (which I can't reproduce) seems to be that the variables heads_n and heads_p get swapped in one code path. So I don't think it actually constitutes a problem (swapping is not really clobbering).

PS: The C++ binding seems to use GCC even on the Clang build variant; is that supposed to happen?

@pesco
Author

pesco commented Nov 14, 2015

My guess is that something in the bindings pre-install process (where some apt-get install calls happen) silently upgrades or somehow changes GCC.

No, it happens with gcc -O3. For some reason, the Travis builds with bindings are optimized while BINDINGS=none makes a debug build. The problem doesn't appear with clang -O3.

I reproduced the issue with GCC 4.2 on OpenBSD. It doesn't appear with GCC 5.2 on Linux.

@abiggerhammer
Member

Oh hey, it's building cleanly now. Are we done here, or is there more still to be done with the CF backends?

@pesco
Author

pesco commented Nov 16, 2015

There is still work to do: the CF backends, but especially Packrat (we need that sorted for DNP3), and lastly the NULL checks on any allocs that are not arena allocs.

I'm looking for an all-clear to go on with this relatively invasive change. I did it for regex at first precisely because that's the code I've never touched before (=> maximum invasion).

@abiggerhammer
Member

Full speed ahead!

Rationale: If memory allocation fails in the inner parse and we
longjump up the stack, the temporary arena will be missed and leak.

NB: This change means that any allocations done by the continuation
(in the form of new parsers, probably) will persist for the
lifetime of the parse result. Beware of wasting too much memory
this way! The bind continuation should generally keep dynamic
allocations to a minimum.
add h_alloc() which calls errx() on failure and use it for all basic allocation

Rationale: "Basic allocation" refers to things outside of parsing proper,
mostly initialization. If such allocations fail, the system is globally
memory-starved, from which it will likely not recover by returning failure.
In this case, terminating the process is in fact the most robust strategy as
it may mean the difference between a permanent hang and a temporary crash.
@pesco
Author

pesco commented Nov 30, 2015

So, longjmps in the other backends are still TODO but I have switched all non-arena allocation to the crash-and-burn strategy of error handling. Here is what I wrote in the commit message:

add h_alloc() which calls errx() on failure and use it for all basic allocation

Rationale: "Basic allocation" refers to things outside of parsing proper,
mostly initialization. If such allocations fail, the system is globally
memory-starved from which it will likely not recover by returning failure.
In this case, terminating the process is in fact the most robust strategy as
it may mean the difference between a permanent hang and a temporary crash.

So the idea is that parsers, anything else that gets built with h_new, initial creation of arenas with h_new_arena, etc. terminate the process on failure, while arena-based allocations during the parse (using h_arena_malloc) make h_parse return NULL.

NB: We characterized the latter during our last conference call as the user getting to dynamically discover the bounds of his input language as dictated by memory.

@pesco
Author

pesco commented Nov 30, 2015

The above commits should make h_parse return NULL for all backends when it runs out of memory during the parse. I haven't gotten around to actually triggering it yet, though, because I fail at ulimit or something.

The iterative variants are still TODO as they are not quite as trivial (but should be easy enough).

@pesco
Author

pesco commented Dec 2, 2015

All done, request merge.

Did the test properly this time around (with a mock allocator that selectively returns NULL).
I'd still appreciate a comment on the rationale for 3fc56a0 which I quoted above.

@@ -530,8 +537,6 @@ static HCountedArray *llk_parse_chunk_(HLLkState *s, const HParser* parser,
return seq;

no_parse:
Member
Best to clear the exception handler here; that way we can maintain the invariant that returning to the exception handler is always valid.

Author

Good catch. See d28f182.

@thequux
Member

thequux commented Dec 6, 2015

LGTM

thequux pushed a commit that referenced this pull request Dec 6, 2015
Handle memory allocation failures gracefully
@thequux thequux merged commit 41b890c into UpstandingHackers:master Dec 6, 2015
@pesco pesco deleted the fix-alloc-failures branch December 7, 2015 17:12