mem_profile is an ownership-aware memory profiler, designed for direct measurement of objects and their members.
What does this mean?
When a program is executed with mem_profile, we can see the number of bytes allocated by types within the program.
Above is some sample output, produced by the analyzer. This output comes from a
profile of examples/src/objects.cpp.
This is a toy example, but mem_profile is designed for use with complex workloads, and has been tested on projects involving hundreds or even thousands of translation units.
For instance, here's an output of type-level stats from profiling an FTXUI demo application:
And here's output from profiling CMake:
mem_profile has three components:
- A compiler plugin for Clang, which embeds type information in the binary and annotates destructor calls.
- The profiler runtime, which tracks allocations and deallocations, and stores the output to a data file.
- An analyzer, which allows querying the resulting datafile for type information.
flowchart TD
    mem_profile --> plugin
    mem_profile --> profiler
    mem_profile --> analyzer
    plugin["Clang Plugin (libmp_plugin)"]
    profiler["profiler (libmp_runtime)"]
    analyzer["analyzer (mp_reader)"]
    The output data file is named malloc_stats.json, however this can be
configured with the MEM_PROFILE_OUT environment variable:
env MEM_PROFILE_OUT=my_stats.json ...
mem_profile can be built with cmake:
cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release
cmake --build build
cmake --install build --prefix ./installIf the build process complains that it cannot find clang, you may need to
provide a path to clang via CMAKE_INSTALL_PREFIX:
cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/path/to/clang
cmake --build build
cmake --install build --prefix ./installIn this example, I just installed it in a folder ./install in the current
directory, however you may install it at whatever location is convenient (eg,
/usr/local).
mp_reader is currently implemented
in python, and it can be installed with either pip or pipx.
If available, I recommend pipx, which will install it as a standalone
executable that you can run, independent of existing python environments.
pipx install mp_readerThe simplest way to get started is to build and run an example manually. There is a quick and easy way to do this process automatically from CMake, however for users wishing to build from something other than CMake, it's good to understand the process.
These steps are:
- Build your program with the plugin. To do this, you need to add two flags to
your build: -fplugin=/path/to/plugin, and--include=/path/to/mp_hook_prelude.h.
- Execute our program using LD_PRELOADto load the runtime
- Use mp_readerto query the resulting data file
clang++ -fplugin=install/lib/libmp_plugin.so    \
    --include=install/include/mp_hook_prelude.h \
    -Og examples/src/objects.cpp -o objects
env LD_PRELOAD=install/lib/libmp_runtime.so ./objects
mp_reader stats malloc_stats.jsonOn macos, you will need to change .so to .dylib, and LD_PRELOAD will
become DYLD_INSERT_LIBRARIES:
/opt/homebrew/opt/llvm/bin/clang++ \
    -fplugin=install/lib/libmp_plugin.dylib \
    --include=install/include/mp_hook_prelude.h \
    -Og examples/src/objects.cpp -o objects
env DYLD_INSERT_LIBRARIES=install/lib/libmp_runtime.dylib ./objects
mp_reader stats malloc_stats.jsonBecause of the annotations inserted by the compiler plugin, you can build with compiler optimizations enabled.
If you care about getting stack traces, I recommend -Og, however, the profiler
will work even at -O3.
Once you've installed mem_profile, you will be able to use mem_profile from
CMake. This simplifies the process of building with the plugin, and enables use
of mem_profile in complex projects.
Add these lines to your CMake:
# If you do this, I recommend putting it behind a flag
# (unless doing it as a one-off)
if(BUILD_WITH_MEM_PROFILE)
   find_package(mem_profile REQUIRED)
   link_libraries(mp::mp_build_with_plugin)
endif()Running the profiler remains the same:
# linux
env LD_PRELOAD=/usr/local/lib/libmp_runtime.so /path/to/your/application
# macos
env DYLD_INSERT_LIBRARIES=/usr/local/lib/libmp_runtime.dylib /path/to/your/applicationAs before, the profiler will output malloc_stats.json, which you can analyze
with mp_reader.
Because the compiler plugin is written for LLVM clang, ensure that you are building with clang.
It is also possible to build using the plugin, without modifying the
CMakeLists.txt. To do this, add the compiler flags discussed in "Running The
Profiler Manually". For example, if you've installed the plugin to
/usr/local/lib, you can do this with
-DCMAKE_CXX_FLAGS="-fplugin=/usr/local/lib/libmp_plugin.so --include=/usr/local/include/mp_hook_prelude.h -Og"(Or .dylib for macos).
Ensure that you are building with clang.
mem_profile can measure the memory usage of a lambda. Because lambdas have no named fields, all fields are unnamed.
using bytes   = std::vector<std::byte>;
using floats  = std::vector<float>;
using doubles = std::vector<double>;
int main() {
    auto myLambda = [a = bytes(10),   //
                     b = floats(100), //
                     c = doubles(1000)]() {};
}mem_profile can tell you how much memory is allocated over the lifetime of a program by each member of a c-style array, in a parent type:
using bytes = std::vector<std::byte>;
struct array3 {
    bytes elems[3];
};
int main() {
    array3 values{
        bytes(10),
        bytes(100),
        bytes(1000),
    };
}Examples - beman project - inplace vector
mem_profile can measure allocations in members allocated inside an
inplace_vector (arriving in C++26):
using bytes = std::vector<std::byte>;
int main() {
    beman::inplace_vector<bytes, 100> v;
    v.push_back(bytes(10));
    v.push_back(bytes(20));
    v.push_back(bytes());
    v.push_back(bytes(100));
}As v[2] didn't own any memory, it is omitted during output by mp_reader.
Breakdown by member for cmState, which holds a significant portion of the
state for a running instance of CMake. This profile was taken when running the
configuration step.
Examples - FTXUI
When profiling ftxui_example_html_like, we can see that ftxui::Screen
inherits from ftxui::Image, and almost off of the memory it owns belongs to
ftxui::Image:
All the instances of ftxui::Node are instances of ftxui::Border, which
inherits from Node:
// For reference, here is the charset for normal border:
class Border : public Node {
 public:
  Border(Elements children,
         BorderStyle style,
         std::optional<Color> foreground_color = std::nullopt)
    // ...
};Most of the memory of ftxui::flexbox_helper::Global is taken up by it's
blocks:
It is my hope that one day, running mem_profile will be quick, easy, and
painless, although at the current moment this project is young, and may contain
sharp edges.
This section will describe common issues encountered when running the profiler; the potential causes; and how to diagnose and fix these issues.
If no malloc_stats.json appears after running your program with the plugin,
ensure that the program exited gracefully.
If you use Ctrl+C to kill a program (or your program segfaults, or
std::terminate is called), no cleanup will occur and mem_profile will not have
the opportunity to generate malloc_stats.json.
You may sometimes get a malloc_stats.json with no type information (only
stacktraces).
Allocation Statistics by Type:
    9,614 allocs    3,072,024 bytes  <untyped>
       0 objects    3,860,984 bytes  <total>
If this occurs, there are a couple causes:
- 
You ran the profiler on a program that wasn't built with the plugin. (eg, if you run the profiler on a prebuilt program, there are no type annotations) 
- 
[MacOS only] You built with AppleClang, instead of LLVM clang. UNFORTUNATELY, AppleClang will silently ignore the plugin. If this occurs, clear the build directory, and build again with LLVM Clang: env CC=/opt/homebrew/opt/llvm/bin/clang CXX=/opt/homebrew/opt/llvm/bin/clang++ cmake -B build -G Ninja ...
- 
[MacOS/ARM platforms] You built with no optimization. The compiler plugin works by injecting annotations / type information at the beginning of the destructor stack frame. However, with -O0, the stack sometimes becomes so polluted that the profiler cannot find destructor annotations at runtime. (For performance reasons, it only searches the first couple entries in the stackframe for large frames, rather than scanning the entire frame).I have only observed this issue on MacOS, but it's possible that it also exists on ARM linux. I recommend enabling at least -O1or-Ogin order to fix it.
In any case, ensure that you are using clang, and that you are building with the mem_profile compiler plugin.
If you set env MEM_PROFILE_PRINT_BODY=1 when building, the plugin will display
the body of any destructor calls which it rewrites.
This will allow you to confirm that the plugin is running, and working as intended.
cmake --build build --target clean
env MEM_PROFILE_PRINT_BODY=1 cmake --build buildThis tends to be rather verbose. You can limit output to just the names of
rewritten destructors using MEM_PROFILE_PRINT_NAME=1:
cmake --build build --target clean
env MEM_PROFILE_PRINT_NAME=1 cmake --build buildSee the issues page for a full list of issues people have filed.
Three significant issues which I am currently in the process of resolving are:











