In Visual Studio we are hitting a NullReferenceException in ProviderNameForGuidImpl for a very small segment of customers. Copilot's theory of how this happens is below; I don't fully agree with it, because I don't think we ever call this concurrently. For now we avoid the crash with a workaround: we catch the NRE and format the name from the provider GUID.
Copilot Analysis
Description
TraceEvent.ProviderName is documented as "never null", but it can throw a NullReferenceException when accessed
concurrently on events sourced from a TraceLog.
We're seeing this in Visual Studio's Profiler profiling sessions.
Stack Trace
System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.Diagnostics.Tracing.Etlx.TraceLog.ProviderNameForGuidImpl(Guid taskOrProviderGuid)
   at Microsoft.Diagnostics.Tracing.TraceEvent.get_ProviderName()
Root Cause Analysis
The call chain is:
- TraceEvent.ProviderName getter: calls Source.ProviderNameForGuid(guid) via ITraceParserServices
- TraceLog.ProviderNameForGuidImpl: calls AllocLookup(), then ((ITraceParserServices)lookup).ProviderNameForGuid(taskOrProviderGuid)
- AllocLookup(): returns the cached freeLookup field and sets it to null
AllocLookup/FreeLookup use a single freeLookup field with no synchronization:
internal unsafe TraceEventDispatcher AllocLookup()
{
    if (freeLookup == null)                 // Thread B can pass this check while the field is still non-null...
    {
        freeLookup = AddAllTemplatesToDispatcher(new TraceLogEventSource(events));
    }
    TraceEventDispatcher ret = freeLookup;  // ...and then read null here if Thread A cleared the field in between
    freeLookup = null;                      // Thread A clears the field
    return ret;
}
If two threads call ProviderNameForGuidImpl concurrently, one interleaving that yields a null result is:
- Thread B enters AllocLookup and evaluates freeLookup == null as false, because the field is still non-null
- Thread A enters AllocLookup, copies freeLookup into ret, sets freeLookup = null, and returns the dispatcher
- Thread B resumes, skips the allocation, and copies the now-null freeLookup into ret
If Thread B instead runs its null check after Thread A has cleared the field, it simply calls AddAllTemplatesToDispatcher and gets a fresh dispatcher, which is harmless.
In the failing case, the null lookup returned to Thread B is cast to ITraceParserServices and called, producing the NullReferenceException.
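To make the theorized interleaving concrete, here is a minimal sketch that replays it on a single thread so the outcome is deterministic. SingleSlotCache and RaceReplay are hypothetical stand-ins written for this illustration, not TraceEvent code; they only mirror the check, read, and clear steps of the unsynchronized AllocLookup.

```csharp
using System;

// Hypothetical stand-in for the single freeLookup field; not TraceEvent code.
sealed class SingleSlotCache
{
    private object slot = new object();

    // Step 1 of the unsynchronized AllocLookup: the null check.
    public bool SlotIsEmpty() => slot == null;

    // Steps 2 and 3: copy the field, then clear it, without re-checking.
    public object TakeWithoutRecheck()
    {
        object ret = slot;
        slot = null;
        return ret;
    }
}

static class RaceReplay
{
    // Replays the two "threads" in the problematic order on one thread.
    public static object Run()
    {
        var cache = new SingleSlotCache();

        // "Thread B": the check sees a non-null slot, so no allocation is planned.
        bool bMustAllocate = cache.SlotIsEmpty();

        // "Thread A": takes the cached object and clears the slot.
        object a = cache.TakeWithoutRecheck();

        // "Thread B" resumes: it skips allocation and reads the cleared slot.
        object b = bMustAllocate ? new object() : cache.TakeWithoutRecheck();

        Console.WriteLine($"A got null: {a == null}, B got null: {b == null}");
        return b;
    }

    static void Main() => Run();
}
```

This prints `A got null: False, B got null: True`. It only shows that the interleaving is possible in principle; it does not show that Visual Studio actually calls this path from more than one thread.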
Expected Behavior
Per the doc comment: ProviderName should never be null and should not throw. For unresolvable GUIDs, the ProviderName getter already has fallback logic to return "Provider(GUID)" or "UnknownProvider", but this fallback is never reached because the exception occurs in the ProviderNameForGuidImpl call above it.
Suggested Fix
Make AllocLookup/FreeLookup thread-safe, for example using Interlocked.Exchange:
// Requires: using System.Threading;
internal unsafe TraceEventDispatcher AllocLookup()
{
    // Atomically take the cached dispatcher (if any) and clear the field.
    TraceEventDispatcher ret = Interlocked.Exchange(ref freeLookup, null);
    if (ret == null)
    {
        ret = AddAllTemplatesToDispatcher(new TraceLogEventSource(events));
    }
    return ret;
}
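The snippet above covers only AllocLookup. As a rough sketch of how both halves could look, the following models the cache as a generic one-element pool in which taking and returning are each a single Interlocked.Exchange. SingleSlotPool and PoolDemo are hypothetical names for this illustration, and the Free method is an assumption about what a thread-safe FreeLookup might do, not the library's actual implementation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of TraceEvent: a one-element pool.
sealed class SingleSlotPool<T> where T : class
{
    private T slot;
    private readonly Func<T> factory;

    public SingleSlotPool(Func<T> factory) { this.factory = factory; }

    // Atomically take whatever is cached; allocate only if the slot was empty.
    public T Alloc()
    {
        return Interlocked.Exchange(ref slot, null) ?? factory();
    }

    // Atomically publish the instance for reuse; any instance already
    // cached is overwritten and left to the garbage collector.
    public void Free(T item)
    {
        Interlocked.Exchange(ref slot, item);
    }
}

static class PoolDemo
{
    // Hammers the pool from many threads and reports whether Alloc ever returned null.
    public static bool Run()
    {
        var pool = new SingleSlotPool<object>(() => new object());
        bool sawNull = false;

        Parallel.For(0, 100_000, _ =>
        {
            object item = pool.Alloc();
            if (item == null) Volatile.Write(ref sawNull, true);
            pool.Free(item);
        });

        return !sawNull;
    }

    static void Main() =>
        Console.WriteLine(Run() ? "no null lookups" : "null lookup observed");
}
```

Because the exchange removes the instance from the slot in one step, two callers can never receive the same cached object, and a caller that finds the slot empty allocates its own. The trade-off is that under contention several instances may be created and all but one dropped on Free.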
Workaround
We're currently wrapping all TraceEvent.ProviderName accesses in a try-catch:
public static string SafeGetProviderName(this TraceEvent evt)
{
    try
    {
        return evt.ProviderName ?? $"Provider({evt.ProviderGuid})";
    }
    catch (NullReferenceException)
    {
        return $"Provider({evt.ProviderGuid})";
    }
}
Environment
- TraceEvent NuGet version: 3.1.9
- Host application: Visual Studio (Profiler)
- Reproduction rate: 5 occurrences in the last 7 days