Skip to content

Handle malloc failures #68

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 27, 2022
Merged

Conversation

mortie
Copy link
Contributor

@mortie mortie commented Oct 13, 2020

This PR adds handling for malloc failures. I don't quite know how one would add sane test cases to cover it, but I did test it manually by replaceing malloc with a malloc implementation which returns NULL 10% of the time, and all of the existing test cases run fine, with no memory errors or memory leaks. The only problem was one place in smoke.c which uses the return value of te_interp without checking for errors.

The approach to error checking is the most basic one; just add a whole bunch of if (ret == NULL) { ...; return NULL; } everywhere. It's not the prettiest, but it works. The alternative would be to setjmp at the beginning of te_compile, then longjmp out of new_expr. It would probably be faster, but setjmp and longjmp is honestly really scary, with a lot of space for accidentally stepping into undefined behavior (for example, all variables which are modified between the setjmp and the longjmp must be volatile). A longjmp-based approach would probably be faster though.


The reason behind doing this isn't just that I think malloc failures should be handled in general. I also think this is a possible approach for custom allocators such as allocators which allocate from a pool of static memory. Here's how I imagine a viable path to statically allocated tinyexpr:

  • Once malloc failures are handled gracefully, add TE_MALLOC and TE_FREE macros which default to malloc and free but can be overwritten at compile time.
  • I can then add -DTE_MALLOC=static_te_malloc -DTE_FREE=static_te_free to my tinyexpr.c compile options.
  • I can then define those functions like this in my own code:
static _Alignas(8) char buffer[1024];
static size_t allocs = 0;
static size_t offset = 0;

void *static_te_malloc(size_t count) {
    while (count % 8 != 0) count += 1; // Align
    if (offset + count >= sizeof(buffer)) {
        return NULL; // This will be handled properly by tinyexpr
    }

    void *ptr = buffer + offset;
    offset += count;
    allocs += 1;
    return ptr;
}

void static_te_free(void *ptr) {
    if (ptr == NULL) return;
    allocs -= 1;
    if (allocs == 0) {
        offset = 0; // All memory freed
    }
}

I personally find this to be a cleaner, simpler and more flexible solution than adding explicit support for using a static memory pool to tinyexpr itself.

I decided to not include those macros in this PR, because it's kind of orthogonal, and I think handling malloc failures has value regardless of whether or not you agree with my idea for supporting custom allocators.


Obviously, adding extra ifs everywhere has a performance cost (though only at te_compile time, not te_eval time). Here are times I got from running hyperfine ./bench.orig ./bench.new (where bench.new is the benchmarking suite with this patch):

Benchmark #1: ./bench.orig
  Time (mean ± σ):     17.170 s ±  0.178 s    [User: 17.164 s, System: 0.004 s]
  Range (min … max):   16.818 s … 17.485 s    10 runs

Benchmark #2: ./bench.new
  Time (mean ± σ):     18.342 s ±  1.088 s    [User: 18.341 s, System: 0.001 s]
  Range (min … max):   17.937 s … 21.431 s    10 runs

Summary
  './bench.orig' ran
    1.07 ± 0.06 times faster than './bench.new'

Here's the output from the unmodified benchmark suite on my machine:

martin@ubun ~/src/tinyexpr master $ ./bench.orig
Expression: a+5
native  5.0045e+11        260ms   384mfps
interp  5.0045e+11        621ms   161mfps
138.85% longer

Expression: 5+a+5
native  5.0095e+11        224ms   446mfps
interp  5.0095e+11        984ms   101mfps
339.29% longer

Expression: abs(a+5)
native  5.0045e+11        222ms   450mfps
interp  5.0045e+11        798ms   125mfps
259.46% longer

Expression: sqrt(a^1.5+a^2.5)
native  4.4443e+12       3323ms    30mfps
interp  4.4443e+12       4696ms    21mfps
41.32% longer

Expression: a+(5*2)
native  5.0095e+11        223ms   448mfps
interp  5.0095e+11        637ms   156mfps
185.65% longer

Expression: (a+5)*2
native  1.0009e+12        225ms   444mfps
interp  1.0009e+12        981ms   101mfps
336.00% longer

Expression: (1/(a+1)+2/(a+2)+3/(a+3))
native  5.2226e+05        297ms   336mfps
interp  5.2226e+05       3241ms    30mfps
991.25% longer

And the benchmark suite with this patch:

martin@ubun ~/src/tinyexpr master $ ./bench.new
Expression: a+5
native  5.0045e+11        259ms   386mfps
interp  5.0045e+11        698ms   143mfps
169.50% longer

Expression: 5+a+5
native  5.0095e+11        223ms   448mfps
interp  5.0095e+11       1169ms    85mfps
424.22% longer

Expression: abs(a+5)
native  5.0045e+11        223ms   448mfps
interp  5.0045e+11        892ms   112mfps
300.00% longer

Expression: sqrt(a^1.5+a^2.5)
native  4.4443e+12       3340ms    29mfps
interp  4.4443e+12       4671ms    21mfps
39.85% longer

Expression: a+(5*2)
native  5.0095e+11        224ms   446mfps
interp  5.0095e+11        758ms   131mfps
238.39% longer

Expression: (a+5)*2
native  1.0009e+12        223ms   448mfps
interp  1.0009e+12       1150ms    86mfps
415.70% longer

Expression: (1/(a+1)+2/(a+2)+3/(a+3))
native  5.2226e+05        327ms   305mfps
interp  5.2226e+05       3834ms    26mfps
1072.48% longer

@mortie
Copy link
Contributor Author

mortie commented Oct 13, 2020

This is largely an alternative to #57. If someone who needs static memory (@stawiski? @RallyTronics?) could have a look at my custom allocators suggestion and see if it would work for them, that would be awesome.

@codeplea
Copy link
Owner

codeplea commented Oct 14, 2020

Good work, and really nice job on the write-up!

I like your approach to the memory checks - honestly this is how it should have been done from the start.

My only concern is that it adds a lot of lines of code. One of the selling points of this library is that it's small.

Would you consider replacing some/most of the checks with a macro?

I'm thinking

            if (insert == NULL) {
                te_free(p);
                te_free(ret);
                return NULL;
            }

could become something like

CHECK_NULL(insert, te_free(p); te_free(ret););

It might not work for all cases, but I think it could still make it a lot more concise.

@mortie
Copy link
Contributor Author

mortie commented Oct 14, 2020

That's a good idea. It takes this patch from a +168/-7 to a +90/-7.

https://github.com/codeplea/tinyexpr/pull/68/files#diff-9dd905150ef73f001f71a847631eaf1c3c8cf53b7cb0ec921a200620fdf9cd04R375-R383 This was the only place I couldn't use the CHECK_NULL macro. However, if we changed new_expr to always 0-initialize all parameters up to arity, it could be replaced with CHECK_NULL(ret->parameters[i], te_free(ret)), because te_free would be able to handle it. What do you think?

@mortie mortie force-pushed the feature/handle-malloc-error branch from a2311af to c8d5b43 Compare October 14, 2020 08:20
@mortie
Copy link
Contributor Author

mortie commented Oct 14, 2020

Actually, nevermind. I didn't realize new_expr already zero-initializes the entire expression, including the parameters. I force-pushed a version which uses CHECK_NULL in the ret->parameters[i] case too, turning this into a +83/-7.

@tylov
Copy link

tylov commented Oct 23, 2020

This approach can be interesting for applications allocation large blocks of data, but it is highly debatable whether there is a need for this kind of resource checking on such small amount of data we talk about here.

If malloc fails in this library, your system probably either already bogged down to a halt, and there is small chances to recover anyway. On my 16GB mem windows machine I can currently allocate 45 GB with malloc (either in one block or in 1GB blocks), and resrouce monitor tells me I already use 9GBs.

I personal feel these things should be managed by the OS in some way, and not by each and every small library.

@mortie
Copy link
Contributor Author

mortie commented Oct 25, 2020

If malloc fails in this library, your system probably either already bogged down to a halt, and there is small chances to recover anyway.

That's true in the very limited case of desktop/server Linux systems with overcommit enabled. It's not true for embedded, not true for Linux without overcommit, not for some other operating systems, not even true for Linux with overcommit enabled but with ulimits in place.

Plus, even if it was 100% true in all cases, part of the reasoning for this patch is to support other allocation schemes such as allocating from one static buffer. It doesn't matter that malloc won't fail if you replace the allocator with one which allocates from a static 8k buffer. This is what the second part of the PR description was about.

What could be interesting though is an option for disabling checked malloc, if that's actually a need people have. One could add an option to replace the CHECK_NULL macro with something like an empty #define CHECK_NULL(...). I personally wouldn't consider the small code size and performance benefits to be worth it, but maybe other people have different considerations.

@tylov
Copy link

tylov commented Oct 25, 2020

I looked over this PR again, and the changes and overhead is minimal as you also claim. Thumbs up from me.

@jussike
Copy link

jussike commented Sep 27, 2022

@codeplea Could you merge this PR?

@codeplea codeplea merged commit 373d2d3 into codeplea:master Sep 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants