Add sys_heap heap allocator #23941
Conversation
On Mon, 30 Mar 2020, Andy Ross wrote:
Note also that this doesn't include @npitre's optimization series, since I don't know if he has time to work on that at the moment. Those too can go in on top of this once merged.
I should be able to find some time to look at it, and port my changes to it.
Time is plentiful these days.
All checks are passing now. checkpatch (informational only, not a failure)
Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.
The development work caught up with the older PR, so now this contains the k_mem_pool compatibility layer, selectable via CONFIG_MEM_POOL_HEAP_BACKEND. It's implemented on top of a very thin "k_heap" layer, which itself does nothing but add synchronization to the sys_heap underlayment. Please re-review the new stuff if you looked at this earlier.
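For reviewers skimming the diff, the layering amounts to something like this sketch (illustrative only, not the actual patch; the real k_heap in this PR also supports blocking with a wait queue, which is omitted here):

```c
#include <kernel.h>
#include <sys/sys_heap.h>

/* Illustrative only: k_heap as "sys_heap plus a spinlock". */
struct k_heap_sketch {
	struct sys_heap heap;
	struct k_spinlock lock;
};

static void *k_heap_alloc_sketch(struct k_heap_sketch *h, size_t bytes)
{
	k_spinlock_key_t key = k_spin_lock(&h->lock);
	void *mem = sys_heap_alloc(&h->heap, bytes);

	k_spin_unlock(&h->lock, key);
	return mem;
}
```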
Lots of test failures with native_posix and native_posix_64. Is that expected?
Fix whack-a-mole regressions (see the last patch in the series) -- I expanded the default size heuristic for the mem_pool backend at the last minute and broke some tests that actually relied on small heaps filling up exactly as they did before. The new allocator... failed to fail. @npitre no, it's for sure all supposed to be passing.
Hm... that timeout in CI seems spurious. It looks like the emulator hung, maybe? (This is the riscv "renode" platform, not qemu). I'm no renode expert, but I can't reproduce locally. It seems to work fine, though surprisingly slowly (it takes 20-30 seconds to finish on my 5 GHz box!).
From the log, it looks like there are some build failures on x86_64. Maybe some constants need a UL suffix?
I have genuinely no idea what that is. This test builds just fine in the PR. CONFIG_SRAM_SIZE is a value in kb, and you can only overflow that in 32 bits if we expose over 4G in memory space, which we clearly don't. Is this some variant config builder it's trying out and ignoring or something? I mean, yeah, that kconfig value should be cast to a size_t for this particular line, but that's definitely not an error we see in practice. But regardless, that's not flagged as an error by CI anyway. What seems to be happening at the bottom of the log is that the renode process is stuck, or not being killed. @nashif : I'm guessing this test is actually timing out (renode seems really slow, and it's a CPU-bound test) and maybe the "kill the test" logic in sanitycheck isn't working for that simulator?
Rebase and add a longer timeout to the lib/heap test, guessing that's what the issue with CI is.
Hm... looks like CI is stuck on that same spot. @nashif, when you get a chance, could you look at this and see if you can divine what the issue is? Maybe it's the version of renode? (I'm on the 1.9.0 fedora package.) I might as well add the size_t cast while I'm at it, though those errors in the log don't reproduce in-tree, so I'm not sure where they're coming from.
OK, that did it. Shippable is showing PASS now. Unfortunately there's a new failure in "nits". I don't know what that is. It's not the doc check, license check, or checkpatch; the link just goes to our coding style guidelines, and nothing seems to have posted a description of what went wrong to this PR...
I applied my optimization patches on top of this and pushed them to #24249.
Please feel free to incorporate them into your series.
include/kernel.h (Outdated)

@@ -4790,6 +4790,11 @@ void k_heap_free(struct k_heap *h, void *mem);
 * If the pool is to be accessed outside the module where it is defined, it
 * can be declared via
 *
 * @note Note: when CONFIG_MEM_POOL_HEAP_BACKEND is enabled, the
No need for the "Note:" here
@andyross Can you add a section in the documentation about all of this?
We're getting off on a tangent here, but we're not adding that to slabs. That's not what slabs are used for. Slabs are pools of equally sized kernel objects; our implementation uses a simple linked list stored within the unused blocks. The typical use case is for each slab member to be some kernel object of a particular type. Once you start adding support for contiguous regions, this instead turns into a general purpose allocator with a minimum block size of a page, and the people who use slabs for their intended design purpose will not appreciate the footprint and overhead this entails.
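For context, the intended slab use case looks roughly like this (a minimal sketch; the struct and sizes are made up for illustration, while K_MEM_SLAB_DEFINE and k_mem_slab_alloc are the existing Zephyr APIs):

```c
#include <kernel.h>

/* A pool of fixed-size objects of one type: no splitting, no merging,
 * just an O(1) pop from the slab's free list.
 */
struct my_msg {
	uint32_t id;
	uint8_t payload[28];
};

K_MEM_SLAB_DEFINE(msg_slab, sizeof(struct my_msg), 16, 4);

struct my_msg *msg_get(void)
{
	void *mem;

	/* K_NO_WAIT: return NULL immediately if the slab is exhausted */
	if (k_mem_slab_alloc(&msg_slab, &mem, K_NO_WAIT) != 0) {
		return NULL;
	}
	return mem;
}
```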
On Thu, 9 Apr 2020, Andrew Boie wrote:
We're getting off on a tangent here, but we're not adding that to slabs. That's not what slabs are used for. Slabs are pools of equally sized kernel objects, our implementation uses a simple linked list stored within unused blocks. The typical use-case is each slab member to be some kernel object of a particular type.
I know. It just seemed a better fit if you need to frequently allocate 3
contiguous pages. But I didn't think it through. Sorry for the noise.
The existing mem_pool implementation has been an endless source of frustration. It's had alignment bugs. It's had racy behavior. It's never been particularly fast. It's outrageously complicated to configure statically. And while its fragmentation resistance and overhead on small blocks is good, its space efficiency has always been very poor due to the four-way buddy scheme.

This patch introduces sys_heap. It's a more or less conventional segregated fit allocator with power-of-two buckets. It doesn't expose its level structure to the user at all, simply taking an arbitrarily aligned pointer to memory. It stores all metadata inside the heap region. It allocates and frees by simple pointer and not block ID. Static initialization is trivial, and runtime initialization is only a few cycles to format and add one block to a list header.

It has excellent space efficiency. Chunks can be split arbitrarily in 8 byte units. Overhead is only four bytes per allocated chunk (eight bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array of 2-word bucket headers. There are no coarse alignment restrictions on blocks; they can be split and merged (in units of 8 bytes) arbitrarily.

It has good fragmentation resistance. Freed blocks are always immediately merged with adjacent free blocks. Allocations are attempted from a sample of the smallest bucket that might fit, falling back rapidly to the smallest block guaranteed to fit. Split memory remaining in the chunk is always returned immediately to the heap for other allocation.

It has excellent performance with firmly bounded runtime. All operations are constant time (though there is a search of the smallest bucket that has a compile-time-configurable upper bound; setting this to extreme values results in an effectively linear search of the list), objectively fast (about a hundred instructions), and amenable to locked operation. No more need for fragile lock relaxation trickery.

It also contains an extensive validation and stress test framework, something that was sorely lacking in the previous implementation.

Note that sys_heap is not a compatible API with sys_mem_pool and k_mem_pool. Partial wrappers for those (now-)legacy APIs will appear later, and a deprecation strategy needs to be chosen.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
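For reviewers who haven't opened the header yet, basic usage looks roughly like this (a minimal sketch; the buffer size is arbitrary):

```c
#include <sys/sys_heap.h>

/* The heap is formatted in place inside caller-provided memory;
 * all metadata lives within this region.
 */
static char heap_mem[2048];
static struct sys_heap heap;

void heap_demo(void)
{
	sys_heap_init(&heap, heap_mem, sizeof(heap_mem));

	void *p = sys_heap_alloc(&heap, 100);	/* NULL on failure */

	if (p != NULL) {
		/* Freed blocks merge immediately with adjacent free blocks */
		sys_heap_free(&heap, p);
	}
}
```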
Use the white box validation and test rig added as part of the sys_heap work. Add a layer that puts hashed cookies into the blocks to detect corruption, check the validity state after every operation, and enumerate a few different usage patterns:

+ Small heap, "real world" allocation where the heap is about half full and most allocations succeed.

+ Small heap, "fragmentation runaway" scenario where most allocations start failing, but the heap must remain consistent.

+ Big heap. We can't test this with the same exhaustive coverage (many re/allocations for every byte of storage) for performance reasons, but we do what we can.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
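The cookie idea, roughly (a simplified sketch of the test strategy, not the actual test code; the hash function here is made up for illustration):

```c
#include <stdint.h>
#include <sys/sys_heap.h>
#include <sys/__assert.h>

/* Per-byte cookie derived from the block address, so corruption by
 * another allocation is detectable when the block is re-checked.
 */
static uint8_t cookie(void *p, size_t i)
{
	return (uint8_t)(((uintptr_t)p * 2654435761U) >> (i % 24));
}

void alloc_and_check(struct sys_heap *h, size_t sz)
{
	uint8_t *mem = sys_heap_alloc(h, sz);

	/* Validate heap consistency after every operation */
	__ASSERT(sys_heap_validate(h), "heap corrupt after alloc");

	if (mem != NULL) {
		for (size_t i = 0; i < sz; i++) {
			mem[i] = cookie(mem, i);
		}
		for (size_t i = 0; i < sz; i++) {
			__ASSERT(mem[i] == cookie(mem, i), "data corrupt");
		}
		sys_heap_free(h, mem);
		__ASSERT(sys_heap_validate(h), "heap corrupt after free");
	}
}
```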
Struct definitions contain no inlines that depend on other code, so they should live early in the include tree. Upcoming refactoring needs this to break header dependency cycles. The kernel_structs.h header was designed for exactly this purpose. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Almost all of the k_mem_pool API is implemented in terms of three lower level primitives: K_MEM_POOL_DEFINE(), k_mem_pool_alloc() and k_mem_pool_free_id(). These are themselves implemented on top of the lower level sys_mem_pool abstraction. Make this layering explicit by splitting the low level out into its own files: mempool_sys.c/h. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
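As an illustration of that layering, a convenience wrapper like k_mem_pool_malloc() can be written purely in terms of the primitives (a simplified sketch; the real code tracks the block descriptor slightly differently):

```c
#include <kernel.h>

/* Simplified sketch: allocate extra space to stash the block
 * descriptor, so a later free can recover it from the pointer alone.
 */
static void *mem_pool_malloc_sketch(struct k_mem_pool *pool, size_t size)
{
	struct k_mem_block block;

	if (k_mem_pool_alloc(pool, &block, size + sizeof(block),
			     K_NO_WAIT) != 0) {
		return NULL;
	}
	*(struct k_mem_block *)block.data = block;
	return (char *)block.data + sizeof(block);
}
```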
This adds a k_heap data structure, a synchronized wrapper around a sys_heap memory allocator. As of this patch, it is an alternative implementation to k_mem_pool with somewhat better efficiency and performance and more conventional (and convenient) behavior. Note that this commit involves some header motion to break dependencies. The declaration for struct k_spinlock moves to kernel_structs.h, and a bunch of includes were trimmed. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
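Usage of the new layer looks roughly like this (a sketch; the timeout argument reflects the signature in this PR series and could change in later releases):

```c
#include <kernel.h>

/* Statically define a 1 KiB kernel heap (size chosen arbitrarily) */
K_HEAP_DEFINE(my_heap, 1024);

void k_heap_demo(void)
{
	/* K_NO_WAIT: fail immediately with NULL rather than blocking */
	void *p = k_heap_alloc(&my_heap, 64, K_NO_WAIT);

	if (p != NULL) {
		k_heap_free(&my_heap, p);
	}
}
```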
Add a shim layer implementing the legacy k_mem_pool APIs backed by a k_heap instead of the original implementation. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
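Conceptually the shim is a thin forwarding layer, something like this sketch (field names and error handling are illustrative, not the actual shim code):

```c
#include <kernel.h>
#include <errno.h>

/* Illustrative only: with CONFIG_MEM_POOL_HEAP_BACKEND, the legacy
 * entry points just forward into the k_heap wrapper.
 */
static int mem_pool_alloc_shim(struct k_heap *h, struct k_mem_block *block,
			       size_t size, int32_t timeout)
{
	block->data = k_heap_alloc(h, size, timeout);
	return (block->data == NULL) ? -ENOMEM : 0;
}
```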
The original k_mem_pool tests were a mix of code that tests routine allocator behavior, the synchronization layer above that, and a significant amount of code that made low-level assumptions about the specific memory layout of the original allocator, which doesn't run out of memory in exactly the same way. Adjust the expectations as needed for the backend. A few test cases were skipped if they were too specific. Most have been generalized (for example, iteratively allocating to use up all memory instead of assuming that it will be empty after N allocations). Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Legacy code can switch back to the original implementation where it needs it, but we don't want new code to be unintentionally dependent on the behavior of the older allocator. The new one is a better general purpose choice. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The k_heap backend is now the default for mem_pool, so duplicate these tests across that config so we continue to have coverage for the older code. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
These five tests (mbox_api, mheap_api_concept, msgq_api, pipe_api and queue) all had test cases where they needed a mem_pool allocation to FAIL. And they are all written to assume the behavior of the original allocator and not the more general k_heap code, which actually succeeds in a bunch of these cases:

* Even a very small heap saves enough metadata memory for the very small minimum block size, and this can be re-used as an allocation. So you can't assume a small heap is full.

* Calculating the number of blocks based on "num_blocks * max size / minimum size" and allocating them does not fill the heap, because the conservative metadata reservation leaves some space left over.

So these have all been modified to "fill" a heap by iteratively allocating until failure, as sketched below.

Also, this fixes a benign overrun bug in mbox. The test code would insert a "big" message by reading past the end of the small message buffer. This didn't fail because it happened to be part of an array of messages, and the other messages defined there contained the memory being read. But still.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
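The fill-until-failure pattern, roughly (a sketch written against k_heap for brevity; the actual tests go through the legacy k_mem_pool calls, and MAX_BLOCKS/BLOCK_SIZE are arbitrary test parameters):

```c
#include <kernel.h>

#define MAX_BLOCKS 64
#define BLOCK_SIZE 64

static void *blocks[MAX_BLOCKS];

/* Fill the heap by allocating until failure, instead of assuming
 * some computed N allocations exhaust it.
 */
size_t fill_heap(struct k_heap *h)
{
	size_t n = 0;

	while (n < MAX_BLOCKS &&
	       (blocks[n] = k_heap_alloc(h, BLOCK_SIZE, K_NO_WAIT)) != NULL) {
		n++;
	}
	return n;	/* the next BLOCK_SIZE allocation is known to fail */
}
```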
CONFIG_SRAM_SIZE is a kconfig value, which is an int (in units of kb), but math on it to produce a memory buffer size needs to be done in size_t precision; otherwise we could overflow in 32 bits on 64 bit platforms with >4G of memory. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
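The shape of the fix (a one-line sketch; the macro name is made up for illustration):

```c
#include <stddef.h>

/* Promote the int-valued kconfig symbol to size_t before scaling it
 * to bytes, so the multiply can't overflow 32 bits on large 64 bit
 * targets.  (Macro name is illustrative.)
 */
#define SRAM_BYTES ((size_t)CONFIG_SRAM_SIZE * 1024U)
```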
The renode emulator is REALLY slow on this test: what completes in 20 seconds on qemu takes 4-10 minutes on renode. That's causing trouble in CI. And this is a CPU-bound unit test of library code, where we have coverage for riscv32 via qemu anyway; there's no value to having better platform emulation here. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Some of the ARC platforms aren't consistent between kconfig and their linker scripts as to the size of memory, add a special case. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Rebase, fix up the "@note Note" bit.
Any chance to get this in before it needs another rebase?
The following PRs zephyrproject-rtos#23941 zephyrproject-rtos#23601 were merged using the old boilerplate inclusion. This commit updates those tests to use find_package(Zephyr). Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>
This has been sitting around without much controversy. Let's resurrect it and get it into the tree. Note that this is just the bare heap implementation under a new API. I'll push the k_mem_pool wrapper separately (which, while it can implement the full API, can't really be a 100% compatible replacement because of the way the memory layout of the original pool was documented). It can live via a kconfig switch for a release before deprecating the original.
Note also that this doesn't include @npitre's optimization series, since I don't know if he has time to work on that at the moment. Those too can go in on top of this once merged.