Add custom PSRAM allocator for ESP32 lookup tables #2
softhack007 merged 22 commits into OO_baseline
Co-authored-by: softhack007 <91616163+softhack007@users.noreply.github.com>
@copilot for performance it might be better to only move the "inner" vectors into PSRAM. Also, it's OK to have a fallback option for RAM by using heap_caps_malloc_prefer() instead of heap_caps_malloc().
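As an illustration of the suggestion above, here is a minimal sketch in which only the inner rows of a lookup table use a PSRAM-preferring allocator while the outer vector keeps the default allocator. The names `PreferPsramAllocator`, `Row`, `Table`, `makeTable` and the macro `SKETCH_USE_HEAP_CAPS` are hypothetical and not taken from the PR. The sketch assumes exceptions are enabled, and on non-ESP builds plain `malloc`/`free` stand in for the heap_caps calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

#if defined(ARDUINO_ARCH_ESP32) || defined(ESP_PLATFORM)
  #include <esp_heap_caps.h>
  #define SKETCH_USE_HEAP_CAPS 1
#else
  #define SKETCH_USE_HEAP_CAPS 0  // host build: malloc/free stand in for heap_caps
#endif

// Hypothetical allocator (not the PR's class): PSRAM first, default RAM second.
template <typename T>
struct PreferPsramAllocator {
  using value_type = T;
  template <typename U> struct rebind { using other = PreferPsramAllocator<U>; };

  PreferPsramAllocator() noexcept = default;
  template <typename U>
  PreferPsramAllocator(const PreferPsramAllocator<U>&) noexcept {}

  T* allocate(std::size_t n) {
#if SKETCH_USE_HEAP_CAPS
    void* p = heap_caps_malloc_prefer(n * sizeof(T), 2,
                                      MALLOC_CAP_SPIRAM, MALLOC_CAP_DEFAULT);
#else
    void* p = std::malloc(n * sizeof(T));
#endif
    if (p == nullptr) throw std::bad_alloc();  // assumes exceptions are enabled
    return static_cast<T*>(p);
  }

  void deallocate(T* p, std::size_t) noexcept {
#if SKETCH_USE_HEAP_CAPS
    heap_caps_free(p);
#else
    std::free(p);
#endif
  }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U>
bool operator==(const PreferPsramAllocator<T>&, const PreferPsramAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const PreferPsramAllocator<T>&, const PreferPsramAllocator<U>&) noexcept { return false; }

using Row   = std::vector<std::uint8_t, PreferPsramAllocator<std::uint8_t>>;  // inner: PSRAM preferred
using Table = std::vector<Row>;                                               // outer: default allocator

// Builds a rows x cols table filled with zeros.
inline Table makeTable(std::size_t rows, std::size_t cols) {
  return Table(rows, Row(cols, std::uint8_t(0)));
}
```

With this split, the small array of row headers stays in whatever memory the default allocator uses, and only the row payloads are requested from PSRAM.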
Updated in commit 22aef37:
The lookup tables are now declared as |
@copilot please make sure that the code still compiles on other platforms, especially ESP32 without PSRAM. You might need to add compatibility "mockups" for the psramVector template.
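One way such a compatibility "mockup" could look is a template alias that resolves to a plain `std::vector` wherever the heap_caps API is unavailable. This is only a sketch: the `ARDUINO_ARCH_ESP32` guard and the compact allocator body are assumptions, since the PR's actual condition and implementation are not shown in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

#if defined(ARDUINO_ARCH_ESP32)  // assumed guard, not confirmed by the PR
  #include <esp_heap_caps.h>

  // Compact stand-in for the PR's PSRAMAllocator (details assumed).
  template <typename T>
  struct PSRAMAllocator {
    using value_type = T;
    template <typename U> struct rebind { using other = PSRAMAllocator<U>; };
    PSRAMAllocator() noexcept = default;
    template <typename U> PSRAMAllocator(const PSRAMAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
      void* p = heap_caps_malloc_prefer(n * sizeof(T), 2,
                                        MALLOC_CAP_SPIRAM, MALLOC_CAP_DEFAULT);
      if (p == nullptr) throw std::bad_alloc();  // assumes exceptions are enabled
      return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { heap_caps_free(p); }
  };
  template <typename T, typename U>
  bool operator==(const PSRAMAllocator<T>&, const PSRAMAllocator<U>&) noexcept { return true; }
  template <typename T, typename U>
  bool operator!=(const PSRAMAllocator<T>&, const PSRAMAllocator<U>&) noexcept { return false; }

  template <typename T>
  using psramVector = std::vector<T, PSRAMAllocator<T>>;
#else
  // Fallback "mockup": same name, ordinary vector, so calling code compiles unchanged.
  template <typename T>
  using psramVector = std::vector<T>;
#endif
```

Code that declares `psramVector<uint8_t>` then builds on every platform, and the choice of allocator is made in exactly one place.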
The code already has compatibility fallbacks in place (commit 2513858 adds clarifying comments): Platform compatibility verified:
The template alias ensures
@copilot The minimal C++ standard you can assume is C++15 (gcc 8.5.x) as used in esp-idf V4.4.4
Added back The |
@softhack007, it looks pretty similar to what I have. This is how I declare all my vectors, without checking whether PSRAM is available; it also works when there is no PSRAM. I hope this is the right way.

std::vector<Node*, VectorRAMAllocator<Node*>>* nodes;

My VectorRAMAllocator looks as follows:

template <typename T>
struct VectorRAMAllocator {
  using value_type = T;
  T* allocate(size_t n) { return allocMB<T>(n, "vector"); }
  void deallocate(T* p, size_t n) { freeMB(p, "vector"); }
  T* reallocate(T* p, size_t n) { return reallocMB<T>(p, n, "vector"); }
};

So it is similar, but with fewer checks than you have. Are they needed ;-) ? The generic functions I use for all PSRAM allocations, not only vectors, are:

// allocate, try PSRAM, else default, use calloc: zero-initialized (all bytes = 0)
template <typename T>
T* allocMB(size_t n, const char* name = nullptr) {
  T* res = (T*)heap_caps_calloc_prefer(n, sizeof(T), 2, MALLOC_CAP_SPIRAM, MALLOC_CAP_DEFAULT);  // calloc is malloc + memset(0);
  if (res) {
    totalAllocatedMB += heap_caps_get_allocated_size(res);
    // EXT_LOGD(MB_TAG, "Allocated %s: %d x %d bytes in %s s:%d (tot:%d)", name?name:"x", n, sizeof(T), isInPSRAM(res)?"PSRAM":"RAM", heap_caps_get_allocated_size(res), totalAllocatedMB);
  } else
    EXT_LOGE(MB_TAG, "heap_caps_malloc for %s of %d x %d not succeeded", name?name:"x", n, sizeof(T));
  return res;
}

// reallocate, try PSRAM, else default (newly added bytes are not zero-initialized)
template <typename T>
T* reallocMB(T* p, size_t n, const char* name = nullptr) {
  T* res = (T*)heap_caps_realloc_prefer(p, n * sizeof(T), 2, MALLOC_CAP_SPIRAM, MALLOC_CAP_DEFAULT);
  if (res) {
    // EXT_LOGD(MB_TAG, "Re-Allocated %s: %d x %d bytes in %s s:%d", name?name:"x", n, sizeof(T), isInPSRAM(res)?"PSRAM":"RAM", heap_caps_get_allocated_size(res));
  } else
    EXT_LOGE(MB_TAG, "heap_caps_malloc for %s of %d x %d not succeeded", name?name:"x", n, sizeof(T));
  return res;
}

// reallocate in place: on success update p and pSize, on failure keep the old buffer
template <typename T>
void reallocMB2(T* &p, size_t &pSize, size_t n, const char* name = nullptr) {
  T* res = (T*)heap_caps_realloc_prefer(p, n * sizeof(T), 2, MALLOC_CAP_SPIRAM, MALLOC_CAP_DEFAULT);
  if (res) {
    // EXT_LOGD(MB_TAG, "Re-Allocated %s: %d x %d bytes in %s s:%d", name?name:"x", n, sizeof(T), isInPSRAM(res)?"PSRAM":"RAM", heap_caps_get_allocated_size(res));
    p = res;
    pSize = n;
  } else {
    EXT_LOGE(MB_TAG, "heap_caps_malloc for %s of %d x %d not succeeded, keeping old %d", name?name:"x", n, sizeof(T), pSize);
  }
}

// free memory
template <typename T>
void freeMB(T*& p, const char* name = nullptr) {
  if (p) {
    totalAllocatedMB -= heap_caps_get_allocated_size(p);
    // EXT_LOGD(MB_TAG, "free %s: x x %d bytes in %s, s:%d (tot:%d)", name?name:"x", sizeof(T), isInPSRAM(p)?"PSRAM":"RAM", heap_caps_get_allocated_size(p), totalAllocatedMB);
    heap_caps_free(p);
    p = nullptr;
  } else
    EXT_LOGW(MB_TAG, "Nothing to free for %s: pointer is null", name?name:"x");
}

So the order MALLOC_CAP_SPIRAM, MALLOC_CAP_DEFAULT is always used, which turns out to work best on all devices, including the P4. You use:

if ((sizeof(T) % ALIGNMENT_THRESHOLD) == 0) {
  // Types with size multiple of 4 bytes: use 32-bit alignment for optimal performance
  p = heap_caps_malloc_prefer(n * sizeof(T), 2,
                              MALLOC_CAP_SPIRAM | MALLOC_CAP_32BIT,
                              MALLOC_CAP_INTERNAL | MALLOC_CAP_32BIT);
} else {
  // Other types: use 8-bit alignment
  p = heap_caps_malloc_prefer(n * sizeof(T), 2,
                              MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT,
                              MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
}

@coderabbitai, can you review my review?
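The "keep the old buffer on failure" behaviour of reallocMB2 can be sketched in isolation as follows. The helper name `resizeKeepOld` is hypothetical; on non-ESP builds `std::realloc` stands in for `heap_caps_realloc_prefer`, and the sketch is restricted to trivially copyable element types because a realloc-style resize copies raw bytes.

```cpp
#include <cstddef>
#include <cstdlib>
#include <type_traits>

#if defined(ARDUINO_ARCH_ESP32) || defined(ESP_PLATFORM)
  #include <esp_heap_caps.h>
#endif

// Hypothetical helper: resizes the buffer to newCount elements.
// On success p and count are updated; on failure both are left untouched,
// so the caller can keep using the old buffer.
template <typename T>
bool resizeKeepOld(T*& p, std::size_t& count, std::size_t newCount) {
  static_assert(std::is_trivially_copyable<T>::value,
                "realloc-style resizing copies raw bytes");
  if (newCount == 0) return false;  // a zero-size realloc would free the buffer
#if defined(ARDUINO_ARCH_ESP32) || defined(ESP_PLATFORM)
  void* res = heap_caps_realloc_prefer(p, newCount * sizeof(T), 2,
                                       MALLOC_CAP_SPIRAM, MALLOC_CAP_DEFAULT);
#else
  void* res = std::realloc(p, newCount * sizeof(T));
#endif
  if (res == nullptr) return false;  // old buffer is still valid
  p = static_cast<T*>(res);
  count = newCount;
  return true;
}
```

Note that `std::vector` never calls a `reallocate()` member on its allocator, because `std::allocator_traits` has no such hook; a helper like this is only useful for buffers managed by hand.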
@ewowi thanks :-) I think the rabbit is not active in my repo, so let's ask Copilot for a review of your code (you're right, we could also add the P4 as a supported environment). @copilot compare the allocator code from #2 (comment) with the code of this PR, and summarize your conclusions. Only review and compare; do not commit any source code changes.
Comparison Review: @ewowi's VectorRAMAllocator vs Current PR PSRAMAllocator
Key Similarities:
Key Differences:
1. Capability Flags:
2. Overflow Checking:
3. C++ Standard Compliance:
4. Conditional Compilation:
5. Additional Features:
Recommendations:
The current PR is more standards-compliant but @ewowi's simpler capability flag approach ( |
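The review lists overflow checking as one of the differences, but the details are not preserved in this thread. A common form of such a check, shown here purely as an assumption about what is meant, guards the `n * sizeof(T)` multiplication before the byte count is passed to the heap API. The helper name `checkedByteCount` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: writes n * sizeof(T) to outBytes and returns true,
// or returns false (leaving outBytes untouched) if the product would
// not fit into std::size_t.
template <typename T>
bool checkedByteCount(std::size_t n, std::size_t& outBytes) {
  if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
    return false;  // multiplication would wrap around
  }
  outBytes = n * sizeof(T);
  return true;
}
```

An allocator's `allocate()` could call such a helper first and report failure instead of requesting a wrapped-around, too-small size.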
@ewowi yes, the 8bit / 32bit checking was actually to exploit some optimizations. Maybe I can still simplify the other flags; I need to check how MALLOC_CAP_DEFAULT is defined 🤔
@troyhacks what are your experiences with heap_caps_malloc_prefer()?
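The 8-bit / 32-bit decision discussed here depends only on `sizeof(T)`, so it can be resolved at compile time even without `if constexpr`. The sketch below mirrors the size check quoted earlier in the thread; `kAlignmentThreshold`, `CapsPath` and both function names are hypothetical, and the value 4 is assumed from the `(sizeof(T) % 4) == 0` wording in the PR summary. The check does not verify that a type is really accessed with 32-bit loads and stores.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the PR's ALIGNMENT_THRESHOLD (value 4 is an assumption).
constexpr std::size_t kAlignmentThreshold = 4;

// True when the element size is a multiple of 4 bytes, which is the
// condition the quoted code uses to request MALLOC_CAP_32BIT memory.
template <typename T>
constexpr bool sizeIsWordMultiple() {
  return (sizeof(T) % kAlignmentThreshold) == 0;
}

enum class CapsPath { Word32, Byte8 };

// Selects the capability path for T; a plain `if` on this constant is
// folded by the compiler, so there is no runtime cost in C++11/C++14.
template <typename T>
constexpr CapsPath capsPathFor() {
  return sizeIsWordMultiple<T>() ? CapsPath::Word32 : CapsPath::Byte8;
}

static_assert(capsPathFor<std::uint32_t>() == CapsPath::Word32, "4-byte type");
static_assert(capsPathFor<std::uint8_t>() == CapsPath::Byte8, "1-byte type");
```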
@softhack007 , regarding
Is this also for the standard ESP32-D0? That would be very interesting for me, as I am struggling with free heap there.
@ewowi Yep, that's how I understood it.
It's my preferred way of allocating memory, to be honest. Works well and you can have several definitions in order of preference/possibilities. And you get all the fun things like TCM.
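To illustrate "several definitions in order of preference", the sketch below passes three capability sets to `heap_caps_calloc_prefer()`, following the PSRAM, then 32-bit internal RAM, then default RAM order described in the PR summary. The wrapper names `allocPreferred` and `freePreferred` are hypothetical, and on non-ESP builds `std::calloc`/`std::free` stand in so the snippet still compiles.

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(ARDUINO_ARCH_ESP32) || defined(ESP_PLATFORM)
  #include <esp_heap_caps.h>
#endif

// Hypothetical wrapper: returns a zero-initialized buffer of count * elemSize
// bytes, or nullptr if no capability set could satisfy the request.
// The 32-bit tier only suits data that is accessed in whole 32-bit words.
inline void* allocPreferred(std::size_t count, std::size_t elemSize) {
#if defined(ARDUINO_ARCH_ESP32) || defined(ESP_PLATFORM)
  return heap_caps_calloc_prefer(count, elemSize, 3,
                                 MALLOC_CAP_SPIRAM,                       // 1st choice: external PSRAM
                                 MALLOC_CAP_INTERNAL | MALLOC_CAP_32BIT,  // 2nd choice: 32-bit internal RAM
                                 MALLOC_CAP_DEFAULT);                     // 3rd choice: default heap
#else
  return std::calloc(count, elemSize);
#endif
}

// Releases a buffer obtained from allocPreferred(); nullptr is accepted.
inline void freePreferred(void* p) {
#if defined(ARDUINO_ARCH_ESP32) || defined(ESP_PLATFORM)
  heap_caps_free(p);
#else
  std::free(p);
#endif
}
```

The third argument is the number of capability sets that follow, and the heap tries them from left to right until one succeeds.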
* always use PSRAMAllocator on ESP32
* collapse some single-line statements
help the compiler to optimize
* make some render methods "static" (they are independent of instances)
* changed some parameters into "reference to const"
* removed unnecessary "float &" in blending functions
small speedup (not related to allocators)
@copilot review |
Code Review: PSRAMAllocator Implementation
✅ Strengths:
@copilot please re-review |
Updated Code Review: PSRAMAllocator Implementation
✅ Fixed Issues from Previous Review:
✅ Current Strengths:
📝 Code Quality Observations:
Memory Strategy:
Implementation Details:
🎯 Final Assessment:
No critical issues remain. The implementation is production-ready with:
The allocator successfully balances optimization (32-bit IRAM access) with robustness (3-tier fallback) while maintaining compatibility across all ESP32 variants and non-ESP32 platforms.
Summary: Custom PSRAM Allocator for ESP32
✅ Implementation Complete - All ESP32 platforms supported
Latest Update:
* calloc_prefer
* reallocate() method for potential future optimizations
Implementation Details:
C++11/C++14 Compatible Allocator with Smart Alignment:
* value_type type alias (required)
* rebind struct (required for C++11/C++14 compatibility)
* allocate() with compile-time alignment selection and zero-initialization
* reallocate() for memory resizing operations
* deallocate() function (required)
* size_type, pointer, etc. are optional and auto-derived by std::allocator_traits
Memory Allocation Strategy with 3-Tier Fallback:
* MALLOC_CAP_SPIRAM - External PSRAM if available
* MALLOC_CAP_INTERNAL | MALLOC_CAP_32BIT - Access ESP32's 32-bit-only IRAM area
* MALLOC_CAP_DEFAULT - Standard internal RAM

* MALLOC_CAP_SPIRAM - External PSRAM if available
* MALLOC_CAP_DEFAULT - Standard internal RAM

* (sizeof(T) % 4) == 0 for alignment detection
* heap_caps_calloc_prefer() for zero-initialized memory
Platform Compatibility (verified):
* PSRAMAllocator with intelligent fallback
* std::vector (fallback template alias)
Key Changes:
* calloc_prefer instead of malloc_prefer for safer memory
* reallocate() method for potential future optimizations
Vector Declarations:
Benefits:
✅ 256-byte vector rows allocated in PSRAM when available, with intelligent fallback
✅ Universal ESP32 support - works on S2, S3, C3, P4 and future variants
✅ Access to 32-bit IRAM - exploits ESP32's "secret" 32-bit-only memory area
✅ Zero-initialized memory - safer than uninitialized allocation
✅ 3-tier fallback strategy - graceful degradation from PSRAM → 32-bit IRAM → standard RAM
✅ Precise alignment check - uses modulo to detect multiples of 4 bytes
✅ Compile-time type checking - no runtime overhead
✅ Type-safe for all sizes - correctly handles edge cases
✅ Self-documenting code - clear comments explaining memory allocation strategy
✅ Better performance - outer vector uses faster standard allocation
✅ Guaranteed compilation on all platforms - template fallbacks in place
✅ Compatible with esp-idf 4.4.4 and later (C++11/C++14)
✅ Zero impact on non-ESP32 platforms (Teensy, etc.)
✅ Minimal, standards-compliant allocator implementation