Current Behavior
I am testing Memray's capabilities to identify memory leaks in Python programs that use C extensions. I have a program that uses a library with a known memory leak originating in its extension code, and I am running Memray in several configurations to find the one that shows the leak most prominently.
When I run my program in memray --native mode for 1 hour, my capture file is about 2GB without aggregated mode, or 6MB in aggregated mode (yay for this feature!).
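For context, a rough sketch of the commands involved (the script name and output path are placeholders for my actual pipeline):

  memray run --native -o profile.bin my_pipeline.py              # ~2GB capture file
  memray run --native --aggregate -o profile.bin my_pipeline.py  # ~6MB capture file
  memray table profile.bin                                       # produces the table report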
When I create a table report, it has about 25k rows, yet many of the rows are duplicates. I converted the HTML report to a CSV file and then did further aggregation to remove duplicates by adding together allocations that happened in the same place.
This reduced the number of rows to ~200; when I also removed the Thread_ID column and aggregated the entries again, I ended up with only 50 rows and meaningful information about my leak.
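A minimal sketch of that post-processing, assuming the column names from the CSV sample below (the file name is hypothetical):

  import pandas as pd

  # Table report exported to CSV (hypothetical file name)
  df = pd.read_csv("memray_table.csv")

  # Merge rows that share thread, allocator, and location,
  # summing sizes and allocation counts (~25k rows -> ~200)
  by_thread = (
      df.groupby(["Thread_ID", "Allocator", "Location"], as_index=False)
        .agg({"Size": "sum", "Allocations": "sum"})
  )

  # Drop Thread_ID and aggregate again (~200 rows -> ~50)
  by_location = (
      df.groupby(["Allocator", "Location"], as_index=False)
        .agg({"Size": "sum", "Allocations": "sum"})
  )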
Here are sample entries from the table that demonstrate the duplication:
Thread_ID,Size,Allocator,Allocations,Location
0x1,192,malloc,1,operator new(unsigned long) at <unknown>:0
0x1,56,malloc,1,operator new(unsigned long) at <unknown>:0
...
0x1,944,malloc,1,_PyMem_RawMalloc at Objects/obmalloc.c:99
0x1,944,malloc,1,_PyMem_RawMalloc at Objects/obmalloc.c:99
...
0x25 (fn_api_status_handler),328,realloc,1,upb_Arena_InitSlow at <unknown>:0
0x38 (Thread-30 (_run)),328,realloc,1,upb_Arena_InitSlow at <unknown>:0
0x1c (Thread-6 (_run)),328,realloc,1,upb_Arena_InitSlow at <unknown>:0
...
0x25 (fn_api_status_handler),71,malloc,1,<unknown> at <unknown>:0
0x25 (fn_api_status_handler),87,malloc,1,<unknown> at <unknown>:0
...
In my case the application creates several ephemeral threads, so the thread ID info is not meaningful; it would be nice to have an option to exclude Thread_ID from the report. But even that aside, it seems that allocations happening at the same location should be added together, summing their Size and Allocations instead. For example, the two identical _PyMem_RawMalloc rows above would collapse into a single row with Size 1888 and Allocations 2.
I have also tried using --trace-python-allocators, which didn't have a meaningful impact on this behavior.
I also verified that duplicate entries appear in the table report when not using --native mode. The duplication still happens, but at a smaller ratio: there, post-processing resulted in only a 3x reduction in the number of rows.
Expected Behavior
In the table report, there should be only one row for each (Thread_ID, Size, Allocator, Location) tuple.
Bonus: provide an option to exclude Thread_ID from the report.
Steps To Reproduce
My setup is somewhat involved: I am profiling a data processing pipeline, also discussed in #852.
I am happy to attach the collected profiles if that helps. If a repro is required for a meaningful investigation, let me know.
Memray Version
1.19.1
Python Version
3.10
Operating System
Linux
Anything else?
No response