Reduce custom operator behavior in the Volatile<T> structure#124244
Reduce custom operator behavior in the Volatile<T> structure#124244davidwrighton wants to merge 4 commits intodotnet:mainfrom
Conversation
- None of the logic changed here is actually significant. The only actual bug here is the behavior of `Module::SetDebuggerInfoBits`, and `Module::EnableEditAndContinue` which now use interlocked operations for situations that can theoretically race, due to the behavior of Reflection.Emit. - The PAL changes are to remove marking fields which are always protected with a critical section for using volatile - The GC changes are mostly around alloc_context_count which has interesting behavior. It is intentionally written using racy logic which is approximately correct. I'm hesitant to actually change the behavior here without much more extensive testing, but we may want to consider whether or not the increased re-order buffer lengths and whatnot in modern CPUS make it more important to consider using atomic increment/decrement here.
There was a problem hiding this comment.
Pull request overview
This PR reduces reliance on Volatile<T> operator overloads across CoreCLR by removing a set of custom operators and updating call sites to use explicit Load()/Store() patterns or interlocked operations where races are possible (notably around shutdown flags and Reflection.Emit-related module flags).
Changes:
- Removed comparison/arithmetic/bitwise/inc/dec operator overloads from
Volatile<T>(VM and GC variants) to narrow its implicit behavior surface area. - Updated CoreCLR VM, PAL, and GC call sites to replace
++/--/|=/*=-style usage with explicitLoad()/Store()or interlocked operations. - Added/used interlocked helpers for module transient flag updates to avoid lost updates in racy Reflection.Emit/debugger scenarios.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/vm/threads.h | Updates profiler evacuation counters to use explicit Load() and manual RMW assignment. |
| src/coreclr/vm/ceemain.cpp | Switches shutdown flag updates to InterlockedOr for atomic bit setting. |
| src/coreclr/vm/ceeload.h | Adds a masked interlocked transient-flag setter and uses interlocked flag setting for EnC enablement. |
| src/coreclr/vm/ceeload.cpp | Implements masked CAS-based transient-flag update and adjusts flag mutations to avoid removed operators. |
| src/coreclr/pal/src/init/pal.cpp | Replaces init_count++ with explicit Load()/Store() increment. |
| src/coreclr/pal/src/include/pal/synchobjects.hpp | Removes Volatile<> from a per-thread lock count field (relies on thread ownership/critical section behavior). |
| src/coreclr/pal/src/include/pal/synchcache.hpp | Removes Volatile<> from fields protected by a mutex and makes Flush private. |
| src/coreclr/pal/src/include/pal/init.h | Adds documentation about the barrier semantics of reading init_count via PALIsInitialized(). |
| src/coreclr/inc/volatile.h | Removes multiple Volatile<T> operator overloads (comparison, arithmetic/bitwise compound ops, inc/dec, const operator&). |
| src/coreclr/gc/gcpriv.h | Documents alloc_context_count as an approximate, racy heuristic. |
| src/coreclr/gc/gc.cpp | Replaces ++/--/+= usage on VOLATILE(...) fields with explicit expressions compatible with reduced Volatile<T> operators. |
| src/coreclr/gc/env/volatile.h | Mirrors the Volatile<T> operator overload removals for the GC environment version. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
jkotas
left a comment
There was a problem hiding this comment.
LGTM, as an incremental improvement.
BTW: I have discussed the idea of switching to std::atomic with Aaron offline. My take:
For better or worse, Volatile<T> tries to be close to what you get from volatile T in C#.
I like the idea of switching to std::atomic in principle. I am not sure whether it would be an improvement in practice. I think std::atomic has poor usability. It has operator T that does not have barrier that makes it easy to miss the barrier. To do the right thing, you have to use g_someFlag.load(std::memory_order_acquire) and g_someFlag.store(std::memory_order_release, value) in typical case that is very wordy. There are also other details like fragile DAC and how portable std::atomic is.
A compromise may be to keep Volatile<T>, but strip all magic from it and require all places to do explicit VolatileLoad(&g_someFlag) and VolatileStore(&someFlag, value).
|
@jkotas, my opinion on |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| for (;;) | ||
| { | ||
| DWORD dwTransientFlags = m_dwTransientFlags; | ||
| DWORD dwNewTransientFlags = (dwTransientFlags & ~dwMask) | (dwFlag & dwMask); |
There was a problem hiding this comment.
| for (;;) | |
| { | |
| DWORD dwTransientFlags = m_dwTransientFlags; | |
| DWORD dwNewTransientFlags = (dwTransientFlags & ~dwMask) | (dwFlag & dwMask); | |
| _ASSERTE((dwFlag & dwMask) == dwFlag); | |
| for (;;) | |
| { | |
| DWORD dwTransientFlags = m_dwTransientFlags; | |
| DWORD dwNewTransientFlags = (dwTransientFlags & ~dwMask) | dwFlag; |
Nit: Should this better be an assert?
Module::SetDebuggerInfoBits, andModule::EnableEditAndContinuewhich now use interlocked operations for situations that can theoretically race, due to the behavior of Reflection.Emit.These are most of the non-rote changes extracted from #124106, for independent review/merge.