Description
In #48937 it was found that my gen0 budget was 32MiB. Investigating further, I believe it may even be as high as 64MiB, which would explain the erratic Gen0 collection times I'm seeing.
System configuration:
- AMD Ryzen 3950x
I wrote a simple app following the implementations that the GC uses on x86 (as far as I can tell) for both Linux and Windows:
- Linux:

runtime/src/coreclr/gc/unix/gcenv.unix.cpp, lines 880 to 899 at 1955928:

```cpp
size_t cacheLevel = 0;
size_t cacheSize = 0;
size_t size;

#ifdef _SC_LEVEL1_DCACHE_SIZE
    size = (size_t)sysconf(_SC_LEVEL1_DCACHE_SIZE);
    UPDATE_CACHE_SIZE_AND_LEVEL(size, 1)
#endif

#ifdef _SC_LEVEL2_CACHE_SIZE
    size = (size_t)sysconf(_SC_LEVEL2_CACHE_SIZE);
    UPDATE_CACHE_SIZE_AND_LEVEL(size, 2)
#endif

#ifdef _SC_LEVEL3_CACHE_SIZE
    size = (size_t)sysconf(_SC_LEVEL3_CACHE_SIZE);
    UPDATE_CACHE_SIZE_AND_LEVEL(size, 3)
#endif

#ifdef _SC_LEVEL4_CACHE_SIZE
    size = (size_t)sysconf(_SC_LEVEL4_CACHE_SIZE);
    UPDATE_CACHE_SIZE_AND_LEVEL(size, 4)
#endif
```

- Windows:
runtime/src/coreclr/gc/windows/gcenv.windows.cpp, lines 405 to 435 at 1955928:

```cpp
DWORD nEntries = 0;

// Try to use GetLogicalProcessorInformation API and get a valid pointer to the SLPI array if successful.
// Returns NULL if API not present or on failure.
SYSTEM_LOGICAL_PROCESSOR_INFORMATION *pslpi = GetLPI(&nEntries);

if (pslpi == NULL)
{
    // GetLogicalProcessorInformation not supported or failed.
    goto Exit;
}

// Crack the information. Iterate through all the SLPI array entries for all processors in system.
// Will return the greatest of all the processor cache sizes or zero.
{
    size_t last_cache_size = 0;

    for (DWORD i = 0; i < nEntries; i++)
    {
        if (pslpi[i].Relationship == RelationCache)
        {
            if (last_cache_size < pslpi[i].Cache.Size)
            {
                last_cache_size = pslpi[i].Cache.Size;
                cache_level = pslpi[i].Cache.Level;
            }
        }
    }
    cache_size = last_cache_size;
}
```
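For reference, a standalone version of the Windows-side query might look like the sketch below. This is not the author's repro app: the GC's `GetLPI` helper is replaced with the standard two-call `GetLogicalProcessorInformation` pattern.

```c
// Standalone sketch of the Windows-side query. The GC's GetLPI helper is
// replaced with the standard two-call GetLogicalProcessorInformation pattern.
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    DWORD len = 0;
    GetLogicalProcessorInformation(NULL, &len); // first call: get required buffer size

    SYSTEM_LOGICAL_PROCESSOR_INFORMATION* slpi = malloc(len);
    if (slpi == NULL || !GetLogicalProcessorInformation(slpi, &len))
        return 1;

    DWORD maxSize = 0;
    int level = 0;
    for (DWORD i = 0; i < len / sizeof(*slpi); i++)
    {
        // Each RelationCache entry describes a single cache. On a multi-CCX part
        // the L3 entries are per-CCX, so the maximum found here is the per-CCX size.
        if (slpi[i].Relationship == RelationCache && slpi[i].Cache.Size > maxSize)
        {
            maxSize = slpi[i].Cache.Size;
            level = slpi[i].Cache.Level;
        }
    }

    printf("L%d cache: %lu bytes (%lu MiB)\n", level, maxSize, maxSize >> 20);
    free(slpi);
    return 0;
}
```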
...and found that it outputs a cache size of 16MiB on Windows but 64MiB on Linux.
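To reproduce the Linux number without the full repro app, a minimal standalone version of the sysconf path might look like this (a sketch; `UPDATE_CACHE_SIZE_AND_LEVEL` is expanded inline as a keep-the-largest update, per my reading of the macro):

```c
// Standalone sketch of the Linux-side sysconf query. UPDATE_CACHE_SIZE_AND_LEVEL
// is expanded inline as a keep-the-largest update, per my reading of the macro.
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    size_t cacheSize = 0;
    int cacheLevel = 0;
    long size;

#ifdef _SC_LEVEL1_DCACHE_SIZE
    size = sysconf(_SC_LEVEL1_DCACHE_SIZE);
    if (size > 0 && (size_t)size > cacheSize) { cacheSize = (size_t)size; cacheLevel = 1; }
#endif
#ifdef _SC_LEVEL2_CACHE_SIZE
    size = sysconf(_SC_LEVEL2_CACHE_SIZE);
    if (size > 0 && (size_t)size > cacheSize) { cacheSize = (size_t)size; cacheLevel = 2; }
#endif
#ifdef _SC_LEVEL3_CACHE_SIZE
    size = sysconf(_SC_LEVEL3_CACHE_SIZE);
    if (size > 0 && (size_t)size > cacheSize) { cacheSize = (size_t)size; cacheLevel = 3; }
#endif

    // On a 3950X this reports the package-wide 64MiB L3 total, not the 16MiB per-CCX slice.
    printf("L%d cache: %zu bytes (%zu MiB)\n", cacheLevel, cacheSize, cacheSize >> 20);
    return 0;
}
```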
I believe the 64MiB value to be incorrectly chosen for this system given the CPU topology:

*(CPU topology screenshot: the 3950X has 64MiB of L3 in total, but it is split across four CCXs of 16MiB each, so no single core sees more than 16MiB.)*

This makes sense, since _SC_LEVEL3_CACHE_SIZE returns the total L3 size across all CCXs rather than the per-CCX size.
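This is also easy to confirm from the shell, since glibc's getconf exposes the same variable; on this system it should print 67108864 (64MiB):

```
$ getconf LEVEL3_CACHE_SIZE
67108864
```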
On other platforms, the GC queries /sys/devices/system/cpu/cpu0/cache/index*/size to determine the cache size: https://github.com/filipnavara/runtime/blob/1955928833e178392f3a40ac1509f0d4a6ca7632/src/coreclr/gc/unix/gcenv.unix.cpp#L901-L935
...which results in more reasonable values:

```
$ cat /sys/devices/system/cpu/cpu0/cache/index0/size
32K
$ cat /sys/devices/system/cpu/cpu0/cache/index1/size
32K
$ cat /sys/devices/system/cpu/cpu0/cache/index2/size
512K
$ cat /sys/devices/system/cpu/cpu0/cache/index3/size
16384K
```
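For completeness, here is a minimal sketch of reading those sysfs files directly, in the same spirit as the fallback path linked above. It assumes the "&lt;number&gt;K" format shown in the output and keeps error handling minimal:

```c
// Sketch: read cpu0's cache sizes from sysfs, in the same spirit as the GC's
// fallback path. Assumes the "<number>K" format shown above; minimal error handling.
#include <stdio.h>

int main(void)
{
    for (int index = 0; ; index++)
    {
        char path[96];
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu0/cache/index%d/size", index);

        FILE* f = fopen(path, "r");
        if (f == NULL)
            break; // no more cache levels for this CPU

        long kib;
        if (fscanf(f, "%ldK", &kib) == 1)
            printf("index%d: %ld KiB\n", index, kib); // index3 -> 16384 KiB (per-CCX L3) on the 3950X
        fclose(f);
    }
    return 0;
}
```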
Reproduction Steps
I'm not sure how to extract the Gen0 budget from the GC, so I wrote an app that uses the same method as the GC to determine cache size: https://github.com/smoogipoo/CacheSizeTest
It can be run on both Windows and Linux, but it must be run on a multi-CCX CPU such as the Ryzen 3950x.
Expected behavior
The cache size on Linux should be 16MiB.
Actual behavior
The cache size on Linux is 64MiB.
Regression?
No response
Known Workarounds
No response
Configuration
No response
Other information
No response