Poor performance of malloc #185
Comments
First, it should be possible to link in a different malloc implementation on top of libretro. If you define your own malloc, that should replace the libretro one. What system version did you measure on? IIRC Apple rewrote their Memory Manager implementation once or twice ("Modern Memory Manager"). But yes, directly using a modern allocator for classic Mac OS might not be ideal because the modern world has different tradeoffs. Modern allocators tend to optimize for good multi-threading performance. And if a modern allocator grabs a few megabytes of virtual address space from the OS upon initialization, an old Mac might object to that :-). Parameters will need to be tweaked. Also, one needs to be careful about using two allocators at the same time (imagine I'd start by sticking your own copy of |
Thanks! I'll try |
First, here are some things I tried that didn't work. Below the break, what did work. TL;DR: dlmalloc is hundreds of times faster for me. I tried adding dlmalloc.c to my tester program but it failed at link time because
I tried recompiling libretro without its
I tried replacing malloc.c with dlmalloc.c in libretro's CMakeLists.txt but it failed because other parts of libretro need
I tried following the guidance in gcc/newlib/libc/include/reent.h regarding whether or not to specify
Finally, wanting to get something working even if it was messy, I returned to standard Retro68 and used the linker's
With dlmalloc, my timing tester program from earlier showed 0 or 1 tick for each test. Trying something a bit more demanding, I added dlmalloc to the app I'm developing using a third-party C++11 library. With dlmalloc, the time taken to have that library parse a test document (in Mini vMac 37.03, Macintosh II emulation, 32⨉ speed (512 MHz), autoslow off, background on, System 7.1) is now 40 ticks (⅔ second) no matter how many documents I open simultaneously. With Retro68's default
I told dlmalloc to use a pagesize of 4K which is its default. dlmalloc calls |
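To illustrate why an allocator that asks the system for memory in page-sized pieces can be much faster than one system call per object, here is a minimal sketch. It is not dlmalloc's algorithm: `ChunkArena` is a hypothetical bump allocator that never frees individual blocks, `std::malloc` stands in for the underlying system allocator (`NewPtr` on classic Mac OS), and the 4096-byte chunk size simply echoes the 4K page size mentioned above.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical bump allocator: small requests are carved out of large chunks,
// so the underlying allocator is called once per chunk instead of once per
// object. Individual blocks are never freed; everything is released when the
// arena is destroyed.
class ChunkArena {
public:
    explicit ChunkArena(std::size_t chunkSize = 4096)
        : chunkSize_(chunkSize), used_(0), systemCalls_(0) {}

    ~ChunkArena() {
        for (void* chunk : chunks_) std::free(chunk);
    }

    ChunkArena(const ChunkArena&) = delete;
    ChunkArena& operator=(const ChunkArena&) = delete;

    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        if (size == 0) size = 1;
        size = (size + align - 1) / align * align;  // round up to the alignment
        if (size > chunkSize_) return nullptr;      // sketch: no oversized blocks

        if (chunks_.empty() || used_ + size > chunkSize_) {
            // Stand-in for the expensive system call.
            void* chunk = std::malloc(chunkSize_);
            if (chunk == nullptr) return nullptr;
            chunks_.push_back(chunk);
            ++systemCalls_;
            used_ = 0;
        }
        void* block = static_cast<char*>(chunks_.back()) + used_;
        used_ += size;
        return block;
    }

    // Number of times the underlying allocator was asked for memory.
    std::size_t systemCalls() const { return systemCalls_; }

private:
    std::size_t chunkSize_;
    std::size_t used_;              // bytes handed out from the newest chunk
    std::size_t systemCalls_;
    std::vector<void*> chunks_;
};
```

With these assumptions, 150 allocations of 32 bytes each fit in two 4K chunks, so the underlying allocator is called twice rather than 150 times. A real allocator such as dlmalloc also has to reuse freed blocks, which this sketch leaves out.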
In my project I use a third-party C++ library. I create an instance of its main class for each document I open. It in turn creates a bunch of other objects that it manages. (It parses my document into an internal representation.) I noticed that each subsequent document I open takes longer, even if the documents are the same. I believe the problem is that objects are ultimately created using `malloc`, and libretro implements `malloc` using `NewPtr`, which appears to have poor performance.
I created a test program which demonstrates the poor performance of `NewPtr`. This program defines a class. When an object of that class is constructed it creates 150 new pointers, and they're disposed of when the object is destructed. The `main` function times how long it takes to allocate each of 10 objects of this class; the results show that each object takes more time to allocate than the previous one. (There are additional tests in the program which I've commented out. For reasons I don't understand, when the program writes to a file, the performance of those extra tests is much worse than when it writes to the console.)
I found a mention in old Mozilla documentation that they have several memory allocator implementations, one of which uses `NewPtr` which can aid in debugging but is slow.
Is there a possibility to improve the situation by having libretro use a different implementation of `malloc`, perhaps using jemalloc, mimalloc, or tcmalloc?
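For readers who want to reproduce the measurement, here is a rough, portable sketch of a test program along the lines described above. It is not the original tester: the class name `PointerBag`, the 32-byte block size, and the use of `std::chrono::steady_clock` are assumptions made for illustration, and the sketch assumes the objects stay alive while later ones are constructed so that the heap keeps growing.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <vector>

const int kBlockCount = 150;         // "150 new pointers" from the description
const std::size_t kBlockSize = 32;   // assumed; the issue does not state a size

// Hypothetical stand-in for the class in the test program: it allocates its
// blocks on construction and disposes of them on destruction.
class PointerBag {
public:
    PointerBag() {
        blocks_.reserve(kBlockCount);
        for (int i = 0; i < kBlockCount; ++i) {
            blocks_.push_back(std::malloc(kBlockSize));
        }
    }

    ~PointerBag() {
        for (void* block : blocks_) std::free(block);
    }

    PointerBag(const PointerBag&) = delete;
    PointerBag& operator=(const PointerBag&) = delete;

    // Number of blocks that were successfully allocated.
    std::size_t liveBlocks() const {
        std::size_t count = 0;
        for (void* block : blocks_) {
            if (block != nullptr) ++count;
        }
        return count;
    }

private:
    std::vector<void*> blocks_;
};

// Constructs objectCount objects one after another, keeping all of them
// alive, and returns the construction time of each in microseconds.
std::vector<long long> timeConstructions(int objectCount) {
    std::vector<std::unique_ptr<PointerBag>> alive;
    alive.reserve(objectCount);
    std::vector<long long> micros;
    micros.reserve(objectCount);
    for (int i = 0; i < objectCount; ++i) {
        auto start = std::chrono::steady_clock::now();
        alive.emplace_back(new PointerBag());
        auto stop = std::chrono::steady_clock::now();
        micros.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
                .count());
    }
    return micros;
}
```

Calling `timeConstructions(10)` from `main` and printing the results would show whether each construction takes longer than the previous one. On classic Mac OS one would presumably time with `TickCount()` instead of `steady_clock`, since the figures in this thread are given in ticks.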