-
Notifications
You must be signed in to change notification settings - Fork 14k
[libc] Use best-fit binary trie to make malloc logarithmic #106259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4d676ff
to
cbab5b2
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
3cc6937
to
957140c
Compare
221cbcd
to
d6d5b6a
Compare
@llvm/pr-subscribers-libc Author: Daniel Thornburgh (mysterymath) ChangesThis reworks the free store implementation in libc's malloc to use a dlmalloc-style binary trie of circularly linked FIFO free lists. This data structure can be maintained in logarithmic time, but it still permits a relatively small implementation compared to other logarithmic-time ordered maps. The implementation doesn't do the various bitwise tricks or optimizations used in actual dlmalloc; it instead optimizes for (relative) readability and minimum code size. Specific optimization can be added as necessary given future profiling. Patch is 51.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/106259.diff 13 Files Affected:
diff --git a/libc/config/linux/x86_64/entrypoints.txt b/libc/config/linux/x86_64/entrypoints.txt
index dd658af3bfb674..42d6c5d85a7997 100644
--- a/libc/config/linux/x86_64/entrypoints.txt
+++ b/libc/config/linux/x86_64/entrypoints.txt
@@ -202,6 +202,7 @@ set(TARGET_LIBC_ENTRYPOINTS
libc.src.stdlib.aligned_alloc
libc.src.stdlib.calloc
libc.src.stdlib.free
+ libc.src.stdlib.freelist_malloc
libc.src.stdlib.malloc
libc.src.stdlib.realloc
diff --git a/libc/src/__support/block.h b/libc/src/__support/block.h
index 96021b99587c87..e1f6726b27f08e 100644
--- a/libc/src/__support/block.h
+++ b/libc/src/__support/block.h
@@ -174,16 +174,32 @@ class Block {
return inner_size - sizeof(prev_) + BLOCK_OVERHEAD;
}
- /// @returns The number of usable bytes inside the block.
+ /// @returns The number of usable bytes inside the block were it to be
+ /// allocated.
size_t inner_size() const {
if (!next())
return 0;
return inner_size(outer_size());
}
+ /// @returns The number of usable bytes inside a block with the given outer
+ /// size were it to be allocated.
static size_t inner_size(size_t outer_size) {
// The usable region includes the prev_ field of the next block.
- return outer_size - BLOCK_OVERHEAD + sizeof(prev_);
+ return inner_size_free(outer_size) + sizeof(prev_);
+ }
+
+ /// @returns The number of usable bytes inside the block if it remains free.
+ size_t inner_size_free() const {
+ if (!next())
+ return 0;
+ return inner_size_free(outer_size());
+ }
+
+ /// @returns The number of usable bytes inside a block with the given outer
+ /// size if it remains free.
+ static size_t inner_size_free(size_t outer_size) {
+ return outer_size - BLOCK_OVERHEAD;
}
/// @returns A pointer to the usable space inside this block.
diff --git a/libc/src/__support/freelist.h b/libc/src/__support/freelist.h
index a54cf953fe7ab6..0525f0f62b9229 100644
--- a/libc/src/__support/freelist.h
+++ b/libc/src/__support/freelist.h
@@ -1,4 +1,4 @@
-//===-- Interface for freelist_malloc -------------------------------------===//
+//===-- Interface for freelist --------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -9,199 +9,107 @@
#ifndef LLVM_LIBC_SRC___SUPPORT_FREELIST_H
#define LLVM_LIBC_SRC___SUPPORT_FREELIST_H
-#include "src/__support/CPP/array.h"
-#include "src/__support/CPP/cstddef.h"
-#include "src/__support/CPP/new.h"
-#include "src/__support/CPP/span.h"
-#include "src/__support/fixedvector.h"
-#include "src/__support/macros/config.h"
+#include "block.h"
namespace LIBC_NAMESPACE_DECL {
-using cpp::span;
-
-/// Basic [freelist](https://en.wikipedia.org/wiki/Free_list) implementation
-/// for an allocator. This implementation buckets by chunk size, with a list
-/// of user-provided buckets. Each bucket is a linked list of storage chunks.
-/// Because this freelist uses the added chunks themselves as list nodes, there
-/// is a lower bound of `sizeof(FreeList.FreeListNode)` bytes for chunks which
-/// can be added to this freelist. There is also an implicit bucket for
-/// "everything else", for chunks which do not fit into a bucket.
-///
-/// Each added chunk will be added to the smallest bucket under which it fits.
-/// If it does not fit into any user-provided bucket, it will be added to the
-/// default bucket.
-///
-/// As an example, assume that the `FreeList` is configured with buckets of
-/// sizes {64, 128, 256, and 512} bytes. The internal state may look like the
-/// following:
-///
-/// @code{.unparsed}
-/// bucket[0] (64B) --> chunk[12B] --> chunk[42B] --> chunk[64B] --> NULL
-/// bucket[1] (128B) --> chunk[65B] --> chunk[72B] --> NULL
-/// bucket[2] (256B) --> NULL
-/// bucket[3] (512B) --> chunk[312B] --> chunk[512B] --> chunk[416B] --> NULL
-/// bucket[4] (implicit) --> chunk[1024B] --> chunk[513B] --> NULL
-/// @endcode
+/// A circularly-linked FIFO list storing free Blocks. All Blocks on a list
+/// are the same size.
///
-/// Note that added chunks should be aligned to a 4-byte boundary.
-template <size_t NUM_BUCKETS = 6> class FreeList {
+/// Allocating free blocks in FIFO order maximizes the amount of time before a
+/// free block is reused. This in turn maximizes the number of opportunities for
+/// it to be coalesced with an adjacent block, which tends to reduce heap
+/// fragmentation.
+class FreeList {
public:
- // Remove copy/move ctors
- FreeList(const FreeList &other) = delete;
- FreeList(FreeList &&other) = delete;
- FreeList &operator=(const FreeList &other) = delete;
- FreeList &operator=(FreeList &&other) = delete;
-
- /// Adds a chunk to this freelist.
- bool add_chunk(cpp::span<cpp::byte> chunk);
-
- /// Finds an eligible chunk for an allocation of size `size`.
- ///
- /// @note This returns the first allocation possible within a given bucket;
- /// It does not currently optimize for finding the smallest chunk.
- ///
- /// @returns
- /// * On success - A span representing the chunk.
- /// * On failure (e.g. there were no chunks available for that allocation) -
- /// A span with a size of 0.
- cpp::span<cpp::byte> find_chunk(size_t size) const;
-
- template <typename Cond> cpp::span<cpp::byte> find_chunk_if(Cond op) const;
-
- /// Removes a chunk from this freelist.
- bool remove_chunk(cpp::span<cpp::byte> chunk);
-
- /// For a given size, find which index into chunks_ the node should be written
- /// to.
- constexpr size_t find_chunk_ptr_for_size(size_t size, bool non_null) const;
-
- struct FreeListNode {
- FreeListNode *next;
- size_t size;
- };
-
- constexpr void set_freelist_node(FreeListNode &node,
- cpp::span<cpp::byte> chunk);
-
- constexpr explicit FreeList(const cpp::array<size_t, NUM_BUCKETS> &sizes)
- : chunks_(NUM_BUCKETS + 1, 0), sizes_(sizes.begin(), sizes.end()) {}
-
-private:
- FixedVector<FreeList::FreeListNode *, NUM_BUCKETS + 1> chunks_;
- FixedVector<size_t, NUM_BUCKETS> sizes_;
-};
-
-template <size_t NUM_BUCKETS>
-constexpr void FreeList<NUM_BUCKETS>::set_freelist_node(FreeListNode &node,
- span<cpp::byte> chunk) {
- // Add it to the correct list.
- size_t chunk_ptr = find_chunk_ptr_for_size(chunk.size(), false);
- node.size = chunk.size();
- node.next = chunks_[chunk_ptr];
- chunks_[chunk_ptr] = &node;
-}
+ class Node {
+ public:
+ /// @returns The block containing this node.
+ Block<> *block() const {
+ return const_cast<Block<> *>(Block<>::from_usable_space(this));
+ }
-template <size_t NUM_BUCKETS>
-bool FreeList<NUM_BUCKETS>::add_chunk(span<cpp::byte> chunk) {
- // Check that the size is enough to actually store what we need
- if (chunk.size() < sizeof(FreeListNode))
- return false;
+ /// @returns The inner size of blocks in the list containing this node.
+ size_t size() const { return block()->inner_size(); }
- FreeListNode *node = ::new (chunk.data()) FreeListNode;
- set_freelist_node(*node, chunk);
+ private:
+ // Circularly linked pointers to adjacent nodes.
+ Node *prev;
+ Node *next;
+ friend class FreeList;
+ };
- return true;
-}
+ constexpr FreeList() : FreeList(nullptr) {}
+ constexpr FreeList(Node *begin) : begin_(begin) {}
-template <size_t NUM_BUCKETS>
-template <typename Cond>
-span<cpp::byte> FreeList<NUM_BUCKETS>::find_chunk_if(Cond op) const {
- for (FreeListNode *node : chunks_) {
- while (node != nullptr) {
- span<cpp::byte> chunk(reinterpret_cast<cpp::byte *>(node), node->size);
- if (op(chunk))
- return chunk;
+ bool empty() const { return !begin_; }
- node = node->next;
- }
+ /// @returns The inner size of blocks in the list.
+ size_t size() const {
+ LIBC_ASSERT(begin_ && "empty lists have no size");
+ return begin_->size();
}
- return {};
-}
+ /// @returns The first node in the list.
+ Node *begin() { return begin_; }
-template <size_t NUM_BUCKETS>
-span<cpp::byte> FreeList<NUM_BUCKETS>::find_chunk(size_t size) const {
- if (size == 0)
- return span<cpp::byte>();
+ /// @returns The first block in the list.
+ Block<> *front() { return begin_->block(); }
- size_t chunk_ptr = find_chunk_ptr_for_size(size, true);
+ /// Push a block to the back of the list.
+ /// The block must be large enough to contain a node.
+ void push(Block<> *block);
- // Check that there's data. This catches the case where we run off the
- // end of the array
- if (chunks_[chunk_ptr] == nullptr)
- return span<cpp::byte>();
+ /// Push an already-constructed node to the back of the list.
+ /// This allows pushing derived node types with additional data.
+ void push(Node *node);
- // Now iterate up the buckets, walking each list to find a good candidate
- for (size_t i = chunk_ptr; i < chunks_.size(); i++) {
- FreeListNode *node = chunks_[static_cast<unsigned short>(i)];
+ /// Pop the first node from the list.
+ void pop();
- while (node != nullptr) {
- if (node->size >= size)
- return span<cpp::byte>(reinterpret_cast<cpp::byte *>(node), node->size);
+ /// Remove an arbitrary node from the list.
+ void remove(Node *node);
- node = node->next;
- }
- }
+private:
+ Node *begin_;
+};
- // If we get here, we've checked every block in every bucket. There's
- // nothing that can support this allocation.
- return span<cpp::byte>();
+LIBC_INLINE void FreeList::push(Block<> *block) {
+ LIBC_ASSERT(!block->used() && "only free blocks can be placed on free lists");
+ LIBC_ASSERT(block->inner_size_free() >= sizeof(FreeList) &&
+ "block too small to accomodate free list node");
+ push(new (block->usable_space()) Node);
}
-template <size_t NUM_BUCKETS>
-bool FreeList<NUM_BUCKETS>::remove_chunk(span<cpp::byte> chunk) {
- size_t chunk_ptr = find_chunk_ptr_for_size(chunk.size(), true);
-
- // Check head first.
- if (chunks_[chunk_ptr] == nullptr)
- return false;
-
- FreeListNode *node = chunks_[chunk_ptr];
- if (reinterpret_cast<cpp::byte *>(node) == chunk.data()) {
- chunks_[chunk_ptr] = node->next;
- return true;
+LIBC_INLINE void FreeList::push(Node *node) {
+ if (begin_) {
+ LIBC_ASSERT(Block<>::from_usable_space(node)->outer_size() ==
+ begin_->block()->outer_size() &&
+ "freelist entries must have the same size");
+ // Since the list is circular, insert the node immediately before begin_.
+ node->prev = begin_->prev;
+ node->next = begin_;
+ begin_->prev->next = node;
+ begin_->prev = node;
+ } else {
+ begin_ = node->prev = node->next = node;
}
-
- // No? Walk the nodes.
- node = chunks_[chunk_ptr];
-
- while (node->next != nullptr) {
- if (reinterpret_cast<cpp::byte *>(node->next) == chunk.data()) {
- // Found it, remove this node out of the chain
- node->next = node->next->next;
- return true;
- }
-
- node = node->next;
- }
-
- return false;
}
-template <size_t NUM_BUCKETS>
-constexpr size_t
-FreeList<NUM_BUCKETS>::find_chunk_ptr_for_size(size_t size,
- bool non_null) const {
- size_t chunk_ptr = 0;
- for (chunk_ptr = 0u; chunk_ptr < sizes_.size(); chunk_ptr++) {
- if (sizes_[chunk_ptr] >= size &&
- (!non_null || chunks_[chunk_ptr] != nullptr)) {
- break;
- }
+LIBC_INLINE void FreeList::pop() { remove(begin_); }
+
+LIBC_INLINE void FreeList::remove(Node *node) {
+ LIBC_ASSERT(begin_ && "cannot remove from empty list");
+ if (node == node->next) {
+ LIBC_ASSERT(node == begin_ &&
+ "a self-referential node must be the only element");
+ begin_ = nullptr;
+ } else {
+ node->prev->next = node->next;
+ node->next->prev = node->prev;
+ if (begin_ == node)
+ begin_ = node->next;
}
-
- return chunk_ptr;
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/__support/freelist_heap.h b/libc/src/__support/freelist_heap.h
index 6c860d039553ab..7636d431f6bc26 100644
--- a/libc/src/__support/freelist_heap.h
+++ b/libc/src/__support/freelist_heap.h
@@ -12,7 +12,7 @@
#include <stddef.h>
#include "block.h"
-#include "freelist.h"
+#include "freestore.h"
#include "src/__support/CPP/optional.h"
#include "src/__support/CPP/span.h"
#include "src/__support/libc_assert.h"
@@ -30,21 +30,14 @@ using cpp::span;
inline constexpr bool IsPow2(size_t x) { return x && (x & (x - 1)) == 0; }
-static constexpr cpp::array<size_t, 6> DEFAULT_BUCKETS{16, 32, 64,
- 128, 256, 512};
-
-template <size_t NUM_BUCKETS = DEFAULT_BUCKETS.size()> class FreeListHeap {
+class FreeListHeap {
public:
using BlockType = Block<>;
- using FreeListType = FreeList<NUM_BUCKETS>;
-
- static constexpr size_t MIN_ALIGNMENT =
- cpp::max(BlockType::ALIGNMENT, alignof(max_align_t));
- constexpr FreeListHeap() : begin_(&_end), end_(&__llvm_libc_heap_limit) {}
+ constexpr FreeListHeap() : begin(&_end), end(&__llvm_libc_heap_limit) {}
constexpr FreeListHeap(span<cpp::byte> region)
- : begin_(region.begin()), end_(region.end()) {}
+ : begin(region.begin()), end(region.end()) {}
void *allocate(size_t size);
void *aligned_allocate(size_t alignment, size_t size);
@@ -54,89 +47,75 @@ template <size_t NUM_BUCKETS = DEFAULT_BUCKETS.size()> class FreeListHeap {
void *realloc(void *ptr, size_t size);
void *calloc(size_t num, size_t size);
- cpp::span<cpp::byte> region() const { return {begin_, end_}; }
+ cpp::span<cpp::byte> region() const { return {begin, end}; }
private:
void init();
void *allocate_impl(size_t alignment, size_t size);
- span<cpp::byte> block_to_span(BlockType *block) {
+ span<cpp::byte> block_to_span(Block<> *block) {
return span<cpp::byte>(block->usable_space(), block->inner_size());
}
- bool is_valid_ptr(void *ptr) { return ptr >= begin_ && ptr < end_; }
+ bool is_valid_ptr(void *ptr) { return ptr >= begin && ptr < end; }
- bool is_initialized_ = false;
- cpp::byte *begin_;
- cpp::byte *end_;
- FreeListType freelist_{DEFAULT_BUCKETS};
+ cpp::byte *begin;
+ cpp::byte *end;
+ bool is_initialized = false;
+ FreeStore free_store;
};
-template <size_t BUFF_SIZE, size_t NUM_BUCKETS = DEFAULT_BUCKETS.size()>
-class FreeListHeapBuffer : public FreeListHeap<NUM_BUCKETS> {
- using parent = FreeListHeap<NUM_BUCKETS>;
- using FreeListNode = typename parent::FreeListType::FreeListNode;
-
+template <size_t BUFF_SIZE> class FreeListHeapBuffer : public FreeListHeap {
public:
- constexpr FreeListHeapBuffer()
- : FreeListHeap<NUM_BUCKETS>{buffer}, buffer{} {}
+ constexpr FreeListHeapBuffer() : FreeListHeap{buffer}, buffer{} {}
private:
cpp::byte buffer[BUFF_SIZE];
};
-template <size_t NUM_BUCKETS> void FreeListHeap<NUM_BUCKETS>::init() {
- LIBC_ASSERT(!is_initialized_ && "duplicate initialization");
- auto result = BlockType::init(region());
- BlockType *block = *result;
- freelist_.add_chunk(block_to_span(block));
- is_initialized_ = true;
+inline void FreeListHeap::init() {
+ LIBC_ASSERT(!is_initialized && "duplicate initialization");
+ auto result = Block<>::init(region());
+ Block<> *block = *result;
+ free_store.set_range({0, cpp::bit_ceil(block->inner_size())});
+ free_store.insert(block);
+ is_initialized = true;
}
-template <size_t NUM_BUCKETS>
-void *FreeListHeap<NUM_BUCKETS>::allocate_impl(size_t alignment, size_t size) {
+inline void *FreeListHeap::allocate_impl(size_t alignment, size_t size) {
if (size == 0)
return nullptr;
- if (!is_initialized_)
+ if (!is_initialized)
init();
- // Find a chunk in the freelist. Split it if needed, then return.
- auto chunk =
- freelist_.find_chunk_if([alignment, size](span<cpp::byte> chunk) {
- BlockType *block = BlockType::from_usable_space(chunk.data());
- return block->can_allocate(alignment, size);
- });
+ size_t request_size =
+ alignment <= alignof(max_align_t) ? size : size + alignment - 1;
- if (chunk.data() == nullptr)
+ Block<> *block = free_store.remove_best_fit(request_size);
+ if (!block)
return nullptr;
- freelist_.remove_chunk(chunk);
- BlockType *chunk_block = BlockType::from_usable_space(chunk.data());
- LIBC_ASSERT(!chunk_block->used());
+ LIBC_ASSERT(block->can_allocate(alignment, size) &&
+ "block should always be large enough to allocate at the correct "
+ "alignment");
- // Split that chunk. If there's a leftover chunk, add it to the freelist
- auto block_info = BlockType::allocate(chunk_block, alignment, size);
+ auto block_info = Block<>::allocate(block, alignment, size);
if (block_info.next)
- freelist_.add_chunk(block_to_span(block_info.next));
+ free_store.insert(block_info.next);
if (block_info.prev)
- freelist_.add_chunk(block_to_span(block_info.prev));
- chunk_block = block_info.block;
-
- chunk_block->mark_used();
+ free_store.insert(block_info.prev);
- return chunk_block->usable_space();
+ block_info.block->mark_used();
+ return block_info.block->usable_space();
}
-template <size_t NUM_BUCKETS>
-void *FreeListHeap<NUM_BUCKETS>::allocate(size_t size) {
- return allocate_impl(MIN_ALIGNMENT, size);
+inline void *FreeListHeap::allocate(size_t size) {
+ return allocate_impl(alignof(max_align_t), size);
}
-template <size_t NUM_BUCKETS>
-void *FreeListHeap<NUM_BUCKETS>::aligned_allocate(size_t alignment,
- size_t size) {
+inline void *FreeListHeap::aligned_allocate(size_t alignment, size_t size) {
// The alignment must be an integral power of two.
if (!IsPow2(alignment))
return nullptr;
@@ -148,38 +127,37 @@ void *FreeListHeap<NUM_BUCKETS>::aligned_allocate(size_t alignment,
return allocate_impl(alignment, size);
}
-template <size_t NUM_BUCKETS> void FreeListHeap<NUM_BUCKETS>::free(void *ptr) {
+inline void FreeListHeap::free(void *ptr) {
cpp::byte *bytes = static_cast<cpp::byte *>(ptr);
LIBC_ASSERT(is_valid_ptr(bytes) && "Invalid pointer");
- BlockType *chunk_block = BlockType::from_usable_space(bytes);
- LIBC_ASSERT(chunk_block->next() && "sentinel last block cannot be freed");
- LIBC_ASSERT(chunk_block->used() && "The block is not in-use");
- chunk_block->mark_free();
+ Block<> *block = Block<>::from_usable_space(bytes);
+ LIBC_ASSERT(block->next() && "sentinel last block cannot be freed");
+ LIBC_ASSERT(block->used() && "double free");
+ block->mark_free();
// Can we combine with the left or right blocks?
- BlockType *prev_free = chunk_block->prev_free();
- BlockType *next = chunk_block->next();
+ Block<> *prev_free = block->prev_free();
+ Block<> *next = block->next();
if (prev_free != nullptr) {
- // Remove from freelist and merge
- freelist_.remove_chunk(block_to_span(prev_free));
- chunk_block = prev_free;
- chunk_block->merge_next();
+ // Remove from free store and merge.
+ free_store.remove(prev_free);
+ block = prev_free;
+ block->merge_next();
}
if (!next->used()) {
- freelist_.remove_chunk(block_to_span(next));
- chunk_block->merge_next();
+ free_store.remove(next);
+ block->merge_next();
}
// Add back to the freelist
- freelist_.add_chunk(block_to_span(chunk_block));
+ free_store.insert(block);
}
// Follows constract of the C standard realloc() function
// If ptr is free'd, will return nullptr.
-template <size_t NUM_BUCKETS>
-void *FreeListHeap<NUM_BUCKETS>::realloc(void *ptr, size_t size) {
+inline void *FreeListHeap::realloc(void *ptr, size_t size) {
if (size == 0) {
free(ptr);
return nullptr;
@@ -194,10 +172,10 @@ void *FreeListHeap<NUM_BUCKETS>::realloc(void *ptr, size_t size) {
if (!is_valid_ptr(bytes))
return nullptr;
- BlockType *chunk_block = BlockType::from_usable_space(bytes);
- if (!chunk_block->used())
+ Block<> *block = Block<>::from_usable_space(bytes);
+ if (!block->used())
return nullptr;
- size_t old_size = chunk_block->inner_size();
+ size_t old_size = block->inner_size();
// Do nothing and return ptr if the required memory size is smaller than
// the current size.
@@ -214,15 +192,14 @@ void *FreeListHeap<NUM_BUCKETS>::realloc(void *ptr, size...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine overall, I haven't done a deep review of the logic but I did comment on a few style things.
libc/src/__support/freelist_heap.h
Outdated
BlockType *block = *result; | ||
freelist_.add_chunk(block_to_span(block)); | ||
is_initialized_ = true; | ||
inline void FreeListHeap::init() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inline
-> LIBC_INLINE
in multiple places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
libc/src/__support/freetrie.h
Outdated
@@ -0,0 +1,253 @@ | |||
//===-- Interface for freetrie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: formatting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
using LIBC_NAMESPACE::cpp::array; | ||
using LIBC_NAMESPACE::cpp::byte; | ||
using LIBC_NAMESPACE::cpp::span; | ||
namespace LIBC_NAMESPACE_DECL { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In General we try to avoid putting the tests inside of the LIBC_NAMESPACE
to ensure that we're getting our implementation of a function instead of one pulled from the system. I think it's probably fine to do using
here since it's only testing internals, but I'd like to avoid the namespace LIBC_NAMESPACE_DECL
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Most of my comments are high level; let me know if you want to discuss more.
libc/src/__support/freelist.h
Outdated
class Node { | ||
public: | ||
/// @returns The block containing this node. | ||
Block<> *block() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally prefer to have a overrides like const Block<>* block() const
and Block<>* block()
in order to better preserve constiness through various calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
public: | ||
// Remove copy/move ctors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend keeping these unless you have a reason not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually makes sense to trivially copy a FreeList or a FreeTrie, since this is a non-owning reference to the data structure. Their semantics are essentially that of a Node pointer.
libc/src/__support/freelist.h
Outdated
// If we get here, we've checked every block in every bucket. There's | ||
// nothing that can support this allocation. | ||
return span<cpp::byte>(); | ||
LIBC_INLINE void FreeList::push(Block<> *block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were previously in the header file because they were templated. Do they need to still be inlined? If so, could you provide some rationale?
The same applies to other previously templated classes as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly it was just expedience. I do think it makes sense to ensure visibility for anything that's overwhelmingly likely to be inlined, particularly anything that's used once, since embedded mallocs are quite often implemented as a single file, and that allows maximal inlining. TU separation is mostly a matter of organizational convenience, since the files involved are fairly small.
Still, I've gone through and broken out functions that are likely not to be inlined when optimizing for size (basically anything not small and used more than once). We can reexamine with specific performance numbers if we need to.
libc/src/__support/freelist.h
Outdated
return true; | ||
LIBC_INLINE void FreeList::push(Node *node) { | ||
if (begin_) { | ||
LIBC_ASSERT(Block<>::from_usable_space(node)->outer_size() == |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, while I'm thinking about it: The code that the Block type was taken from templated for various reasons, but it seems like everything here just uses the defaults. This is probably for another PR, but should Block be made non-templated? It possibly could still be compile-time configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally agree; this is something that I wanted to do as a standalone NFC change.
@@ -1,4 +1,4 @@ | |||
//===-- Interface for freelist_malloc -------------------------------------===// | |||
//===-- Interface for freelist --------------------------------------------===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking: there isn't a reusable intrusive forward list implementation already available, is there? The code this was originally taken is replacing this bespoke implementation with a more generic one, and it might make sense to do the same here if it's available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't find any accessible from libc, but thanks; it was good to check.
libc/src/__support/freetrie.h
Outdated
/// and upper subtries. | ||
class Node : public FreeList::Node { | ||
/// Return an abitrary leaf in the subtrie. | ||
Node &leaf(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debatable: Since this is only needed by FreeTrie::remove
, and FreeTrie
is already a friend, I might suggesting getting rid of this method and moving that code directly into remove
. It's just feels like an odd bit of interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call; there's already a comment in remove
about why a leaf is selected, and the implementation is quite obvious (and just 2 lines) given that context.
@@ -28,23 +30,23 @@ using LIBC_NAMESPACE::freelist_heap; | |||
// made in tests leak and aren't free'd. This is fine for the purposes of this | |||
// test file. | |||
#define TEST_FOR_EACH_ALLOCATOR(TestCase, BufferSize) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is leftover from the code it was taken from, which has several different allocators. Perhaps this macro isn't needed here, and regular test fixtures would work just as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still is trying two variants: the global heap and one instantiated locally. I couldn't immediately think of a way to rewrite it; maybe parameterized tests would work? Not sure whether that would end up more readable though.
libc/src/__support/freelist_heap.h
Outdated
return block->can_allocate(alignment, size); | ||
}); | ||
size_t request_size = | ||
alignment <= alignof(max_align_t) ? size : size + alignment - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible integer overflow here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
libc/src/__support/freelist_heap.h
Outdated
@@ -214,15 +193,14 @@ void *FreeListHeap<NUM_BUCKETS>::realloc(void *ptr, size_t size) { | |||
return new_ptr; | |||
} | |||
|
|||
template <size_t NUM_BUCKETS> | |||
void *FreeListHeap<NUM_BUCKETS>::calloc(size_t num, size_t size) { | |||
LIBC_INLINE void *FreeListHeap::calloc(size_t num, size_t size) { | |||
void *ptr = allocate(num * size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible integer overflow here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
libc/src/__support/freelist_heap.h
Outdated
@@ -214,15 +193,14 @@ void *FreeListHeap<NUM_BUCKETS>::realloc(void *ptr, size_t size) { | |||
return new_ptr; | |||
} | |||
|
|||
template <size_t NUM_BUCKETS> | |||
void *FreeListHeap<NUM_BUCKETS>::calloc(size_t num, size_t size) { | |||
LIBC_INLINE void *FreeListHeap::calloc(size_t num, size_t size) { | |||
void *ptr = allocate(num * size); | |||
if (ptr != nullptr) | |||
LIBC_NAMESPACE::inline_memset(ptr, 0, num * size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'me not at all sure here, but would something like std::fill
that could operate on words instead of bytes be faster here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use libc's optimized memset implementation, so it should work a word at a time whenever appropriate, or even using vector instructions when available.
21815a8
to
f270a29
Compare
f270a29
to
6f40a7f
Compare
I wasn't sufficiently confident that this worked correctly, so I wrote a fuzzer test. Thankfully, this found quite a few issues, so I've fixed those and included the fuzzer. Hooray for property based testing! |
This prevents a conflict with the Linux system endian.h when built in overlay mode for CPP files in __support. This issue appeared in PR llvm#106259.
This prevents a conflict with the Linux system endian.h when built in overlay mode for CPP files in __support. This issue appeared in PR #106259.
6f40a7f
to
5b94391
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments, I didn't notice anything wrong with it but I'm not super confident that I understood it fully.
#include "src/string/memory_utils/inline_memmove.h" | ||
#include "src/string/memory_utils/inline_memset.h" | ||
|
||
using namespace LIBC_NAMESPACE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: generally in tests we recommend not doing using namespace LIBC_NAMESPACE
so it's clear what's coming from LLVM-libc.
This is also useful for differential fuzzing where we compare our internal implementation with the public one. Not a big deal here since everything is coming from LLVM-libc, but something to keep in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done.
LIBC_ASSERT(!block->used() && | ||
"only free blocks can be placed on free lists"); | ||
LIBC_ASSERT(block->inner_size_free() >= sizeof(FreeList) && | ||
"block too small to accomodate free list node"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would making this fail in non-debug mode be useful? This seems like the sort of thing that would be useful to catch at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should never occur at runtime. push
is only called by the FreeTrie
's push
, which has even larger size requirements, and FreeStore
's insert
, which has a check to make it a no-op if the block is too small. Blocks that are too small can never effectively be added to the free store, since there's no way to track them. But their smallness would also mess with the free list selection logic in FreeStore
, so the case is handled at that level (i.e., there's no need to keep track of a freelist that intrinsically must be empty).
128, 256, 512}; | ||
|
||
template <size_t NUM_BUCKETS = DEFAULT_BUCKETS.size()> class FreeListHeap { | ||
class FreeListHeap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like FreelistHeapAlloc
or similar to make it clear this is an allocator. "Heap" is what we call the memory we allocate outside of the stack, but it's also the name of a generic data structure.
libc/src/__support/freetrie.h
Outdated
/// @returns The lower half of the size range. | ||
LIBC_INLINE SizeRange lower() const { return {min, width / 2}; } | ||
|
||
/// @returns The lower half of the size range. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: fix this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks.
a0c4c39
to
88e2a17
Compare
AFAICT the buildkite issue is a flake with the new |
This reworks the free store implementation in libc's malloc to use a dlmalloc-style binary trie of circularly linked FIFO free lists. This data structure can be maintained in logarithmic time, but it still permits a relatively small implementation compared to other logarithmic-time ordered maps. The implementation doesn't do the various bitwise tricks or optimizations used in actual dlmalloc; it instead optimizes for (relative) readability and minimum code size. Specific optimization can be added as necessary given future profiling.
These were found almost immediately, and some of them were quite tricky. Hooray for fuzzing!
88e2a17
to
b332573
Compare
The CI flake has been fixed, and it looks green. Friendly ping; anything left to do on this one? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks!
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/104/builds/10960 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/71/builds/10949 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/131/builds/10797 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/188/builds/7051 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/196/builds/1310 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/183/builds/6530 Here is the relevant piece of the build log for the reference
|
This reworks the free store implementation in libc's malloc to use a dlmalloc-style binary trie of circularly linked FIFO free lists. This data structure can be maintained in logarithmic time, but it still permits a relatively small implementation compared to other logarithmic-time ordered maps.
The implementation doesn't do the various bitwise tricks or optimizations used in actual dlmalloc; it instead optimizes for (relative) readability and minimum code size. Specific optimization can be added as necessary given future profiling.