@@ -66,6 +66,60 @@ DWORD SPINLOCKTryAcquire (LONG * lock);
 // //////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
 // Named mutex

+ /*
+ Design
+
+ - On systems that support pthread process-shared robust recursive mutexes, those mutexes are used
+ - On other systems, file locks are used. File locks unfortunately don't have a timeout in the blocking wait call, and I didn't
+ find any other sync object with a timed wait that has the necessary properties, so timed waits are implemented by polling.
+
+ Shared memory files
+ - Session-scoped mutexes (name not prefixed, or prefixed with Local) go in /tmp/coreclr/shm/session<sessionId>/<mutexName>
+ - Globally-scoped mutexes (name prefixed with Global) go in /tmp/coreclr/shm/global/<mutexName>
+ - The file contains shared state and is mmap'ed into the process. See SharedMemorySharedDataHeader and NamedMutexSharedData for
+ the data stored.
+ - Creation and deletion are synchronized using an exclusive file lock on the shm directory
+ - Any process using the shared memory file holds a shared file lock on the shared memory file
+ - Upon creation, if the shared memory file already exists, an exclusive file lock is attempted on it to see if the file data is
+ valid. If no other processes have the mutex open, the file is reinitialized.
+ - Upon releasing the last reference to a mutex in a process, the process tries to get an exclusive lock on the shared memory
+ file to see if any other processes have the mutex open. If not, the file is deleted, along with the session directory if it's
+ empty. The coreclr and shm directories are not deleted.
+ - This allows managing the lifetime of mutex state based on the active processes that have the mutex open. Depending on how a
+ process terminated, the file may still be left over in the tmp directory. I haven't found anything that can be done about
+ that.
+
+ Lock files when using file locks
+ - In addition to the shared memory file, another file is needed for the actual synchronization file lock, since the file lock
+ on the shared memory file is used for lifetime purposes.
+ - These files go in /tmp/coreclr/lockfiles/session<sessionId>|global/<mutexName>
+ - The file is empty and is used only for file locks
+
+ Process data
+ - See SharedMemoryProcessDataHeader and NamedMutexProcessData for the data stored
+ - Per mutex name, there is only one instance of process data, which is ref-counted. The instances are currently stored in a
+ linked list in SharedMemoryManager. It should use a hash table, but of the many hash table implementations that are already
+ there, none seem to be easily usable in the PAL. I'll look into that and fix it later.
+ - The process data refers to the associated shared memory, and knows how to clean up both the process data and shared data
+ - When using file locks for synchronization, a process-local mutex is also created for synchronizing threads, since file locks
+ are owned at the file descriptor level and there is only one open file descriptor in the process per mutex name. The
+ process-local mutex is locked around the file lock, so that only one thread per process is ever trying to flock on a given
+ file descriptor.
+
+ Abandon detection
+ - When a lock is acquired, the process data is added to a linked list on the owning thread
+ - When a thread exits, the list is walked, and each mutex is flagged as abandoned and released
+ - Abrupt process termination is detected by pthread robust mutexes when they are used. When using file locks, the file lock is
+ automatically released by the system. Upon acquiring a lock, the lock owner info in the shared memory is checked to see if the
+ mutex was abandoned.
+
+ Miscellaneous
+ - CreateMutex and OpenMutex both create a new handle for each mutex opened. Each handle just refers to the process data header
+ for the mutex name.
+ - Some of the above features are already available in the PAL, but not in a form that can be used for this purpose. The
+ existing shared memory, naming, and waiting infrastructure is not suitable, and is not used.
+ */
+
 // Temporarily disabling usage of pthread process-shared mutexes on ARM/ARM64 due to functional issues that cannot easily be
 // detected with code due to hangs. See https://github.com/dotnet/runtime/issues/6014.
 #if HAVE_FULLY_FEATURED_PTHREAD_MUTEXES && HAVE_FUNCTIONAL_PTHREAD_ROBUST_MUTEXES && !(defined(HOST_ARM) || defined(HOST_ARM64) || defined(__FreeBSD__))
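The file-lock fallback described in the design comment (polling for timed waits, with a process-local mutex taken around the flock call) might look roughly like the sketch below. This is not the PAL's code: `example_file_mutex`, its functions, the 1 ms polling interval, and the choice to poll the process-local mutex with `pthread_mutex_trylock` are all assumptions made for illustration. The sketch is also non-recursive, unlike the real named mutex.

```cpp
#include <cassert>
#include <cerrno>
#include <ctime>
#include <fcntl.h>
#include <pthread.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical per-process state for one mutex name: one descriptor for the
// empty lock file, plus a process-local mutex that serializes threads so that
// only one thread at a time calls flock on that descriptor.
struct example_file_mutex {
    int lock_fd;
    pthread_mutex_t process_lock;
};

static long long example_now_ms() {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<long long>(ts.tv_sec) * 1000 + ts.tv_nsec / 1000000;
}

// Opens (creating if needed) the lock file. Returns false on failure.
bool example_file_mutex_open(example_file_mutex *m, const char *path) {
    assert(m != nullptr && path != nullptr);
    m->lock_fd = open(path, O_RDWR | O_CREAT, 0600);
    if (m->lock_fd < 0) return false;
    if (pthread_mutex_init(&m->process_lock, nullptr) != 0) {
        close(m->lock_fd);
        return false;
    }
    return true;
}

// Tries to acquire the lock within timeout_ms. flock has no timed blocking
// form, so the non-blocking form is retried until the deadline passes.
bool example_file_mutex_acquire(example_file_mutex *m, int timeout_ms) {
    const long long deadline = example_now_ms() + timeout_ms;
    bool have_local = false;
    for (;;) {
        if (!have_local) {
            have_local = (pthread_mutex_trylock(&m->process_lock) == 0);
        }
        if (have_local) {
            if (flock(m->lock_fd, LOCK_EX | LOCK_NB) == 0) return true;
            if (errno != EWOULDBLOCK && errno != EINTR) break; // unexpected error
        }
        if (example_now_ms() >= deadline) break;
        struct timespec nap = {0, 1000000}; // assumed polling interval: 1 ms
        nanosleep(&nap, nullptr);
    }
    if (have_local) pthread_mutex_unlock(&m->process_lock);
    return false;
}

// Releases the file lock first, then lets other threads in this process try.
void example_file_mutex_release(example_file_mutex *m) {
    flock(m->lock_fd, LOCK_UN);
    pthread_mutex_unlock(&m->process_lock);
}
```

Because flock locks belong to the open file description, a second `open` of the same path contends with the first even inside one process, which is why a single descriptor per mutex name needs the extra process-local mutex to arbitrate between threads.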
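For the pthread path, a minimal sketch of creating a process-shared, robust, recursive mutex and noticing an abandoned owner on acquire is shown below. The function names are hypothetical and this is not the PAL's implementation. In real use the `pthread_mutex_t` would presumably live inside the mmap'ed shared memory file so that other processes can see it; the sketch accepts any storage.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Initializes a mutex with the three properties named in the design comment:
// process-shared, robust, and recursive. Returns 0 or a pthread error code.
int example_robust_mutex_init(pthread_mutex_t *mutex) {
    assert(mutex != nullptr);
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err != 0) return err;
    err = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (err == 0) err = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (err == 0) err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (err == 0) err = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return err;
}

// Locks the mutex. If the previous owner terminated while holding it,
// pthread_mutex_lock returns EOWNERDEAD with the lock held; the mutex is then
// marked consistent and *abandoned is set so the caller can report it.
int example_robust_mutex_lock(pthread_mutex_t *mutex, bool *abandoned) {
    assert(mutex != nullptr && abandoned != nullptr);
    *abandoned = false;
    int err = pthread_mutex_lock(mutex);
    if (err == EOWNERDEAD) {
        *abandoned = true;
        err = pthread_mutex_consistent(mutex);
    }
    return err;
}
```

A caller that sees `*abandoned == true` holds the lock and can decide how to surface the abandoned state. The design comment additionally describes checking lock owner info stored in shared memory, which this sketch does not model.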