Skip to content

Comments

feat: intelligent test failure diagnosis and categorization#4953

Merged
thomhurst merged 2 commits intomainfrom
feat/failure-diagnosis
Feb 19, 2026
Merged

feat: intelligent test failure diagnosis and categorization#4953
thomhurst merged 2 commits intomainfrom
feat/failure-diagnosis

Conversation

@thomhurst
Copy link
Owner

Closes #4895. Adds FailureCategorizer with 7 failure categories.

@claude
Copy link
Contributor

claude bot commented Feb 19, 2026

Code Review

Two confirmed bugs were found, both rooted in the same architectural issue.

Root Cause: The Categorizer Result Is Not Used for Routing

The PR introduces FailureCategorizer.Categorize(e) to classify exceptions, but GetFailureStateProperty computes the category and then ignores it for branching — it continues to use the old independent type-checking logic. This creates two separate classification systems that can disagree:

private static TestNodeStateProperty GetFailureStateProperty(TestContext testContext, Exception e, TimeSpan duration)
{
var category = FailureCategorizer.Categorize(e);
var categoryLabel = FailureCategorizer.GetLabel(category);
if (testContext.Metadata.TestDetails.Timeout != null
&& e is TaskCanceledException or OperationCanceledException or TimeoutException
&& duration >= testContext.Metadata.TestDetails.Timeout.Value)
{
return new TimeoutTestNodeStateProperty($"[{categoryLabel}] Test timed out after {testContext.Metadata.TestDetails.Timeout.Value.TotalMilliseconds}ms");
}
if (e.GetType().Name.Contains("Assertion", StringComparison.InvariantCulture))
{
return new FailedTestNodeStateProperty(e, $"[{categoryLabel}] {e.Message}");
}
return new ErrorTestNodeStateProperty(e, $"[{categoryLabel}] {e.Message}");
}


Bug 1: AggregateException Unwrapping Mismatch

FailureCategorizer.Categorize() unwraps AggregateException to inspect the first inner exception:

{
// Unwrap AggregateException to get the real cause
var ex = exception is AggregateException { InnerExceptions.Count: > 0 } agg
? agg.InnerExceptions[0]
: exception;

But GetFailureStateProperty checks the original unwrapped e directly:

if (testContext.Metadata.TestDetails.Timeout != null
&& e is TaskCanceledException or OperationCanceledException or TimeoutException
&& duration >= testContext.Metadata.TestDetails.Timeout.Value)
{
return new TimeoutTestNodeStateProperty($"[{categoryLabel}] Test timed out after {testContext.Metadata.TestDetails.Timeout.Value.TotalMilliseconds}ms");

Concrete failure scenario: e = AggregateException wrapping TaskCanceledException

  • Categorize(e) → returns Timeout, label = "[Timeout]"
  • e is TaskCanceledException or OperationCanceledException or TimeoutExceptionFALSE (e is still AggregateException)
  • Falls through to ErrorTestNodeStateProperty with message "[Timeout] One or more errors occurred."
  • Result: misleading label "Timeout" on an Error node with the wrong message

Bug 2: Assertion Detection Mismatch

The categorizer matches exception type names containing "Assertion" or "Assert":

// Assertion failures - check by type name to support third-party assertion libraries
if (ex.GetType().Name.Contains("Assertion", StringComparison.Ordinal)
|| ex.GetType().Name.Contains("Assert", StringComparison.Ordinal))
{
return FailureCategory.Assertion;

But GetFailureStateProperty only checks for "Assertion" (not "Assert"):

if (e.GetType().Name.Contains("Assertion", StringComparison.InvariantCulture))
{
return new FailedTestNodeStateProperty(e, $"[{categoryLabel}] {e.Message}");
}

Concrete failure scenario: Exception type AssertFailedException (MSTest), AssertInconclusiveException, or any custom AssertXxx exception:

  • Categorize(e) → matches "Assert", returns Assertion, label = "[Assertion Failure]"
  • e.GetType().Name.Contains("Assertion")FALSE ("AssertFailed" doesn't contain "Assertion")
  • Falls through to ErrorTestNodeStateProperty
  • Result: label says "Assertion Failure" but node state is Error (not Failed), breaking downstream reporters that pattern-match on node state type

Recommended Fix

Use the category enum to drive the branching instead of duplicating type checks. This eliminates both bugs at once since the categorizer is already handling the complex detection logic (including AggregateException unwrapping and the broader assertion detection):

private static TestNodeStateProperty GetFailureStateProperty(TestContext testContext, Exception e, TimeSpan duration)
{
    var category = FailureCategorizer.Categorize(e);
    var categoryLabel = FailureCategorizer.GetLabel(category);

    if (category == FailureCategory.Timeout
        && testContext.Metadata.TestDetails.Timeout != null
        && duration >= testContext.Metadata.TestDetails.Timeout.Value)
    {
        return new TimeoutTestNodeStateProperty($"[{categoryLabel}] Test timed out after {testContext.Metadata.TestDetails.Timeout.Value.TotalMilliseconds}ms");
    }

    if (category == FailureCategory.Assertion)
    {
        return new FailedTestNodeStateProperty(e, $"[{categoryLabel}] {e.Message}");
    }

    return new ErrorTestNodeStateProperty(e, $"[{categoryLabel}] {e.Message}");
}

This approach is more maintainable and scalable: adding a new category in the future only requires updating FailureCategorizer, not GetFailureStateProperty as well.

Automatically categorize test failures to help users quickly understand
what went wrong. Categories include: Assertion, Timeout, NullReference,
Setup (Before hooks), Teardown (After hooks), Infrastructure (IO/network),
and Unknown. The category label is prefixed in the failure explanation
visible in test reports.
…n at top of GetFailureStateProperty

The method was computing a category via FailureCategorizer but then
ignoring it for routing, using separate type-checking logic against the
original (potentially wrapped) exception. This caused two bugs:

1. AggregateException wrapping e.g. TaskCanceledException would be
   correctly categorized as Timeout by the categorizer, but the method's
   own `e is TaskCanceledException` check would fail since `e` was still
   the AggregateException.

2. The assertion check used a different string comparison
   (InvariantCulture) than the categorizer (Ordinal) and only checked
   for "Assertion" rather than also checking "Assert".

Now AggregateException is unwrapped once at the top, the unwrapped
exception is passed to the categorizer, and the FailureCategory enum
drives all routing and is used for constructing state properties.
This was referenced Feb 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: intelligent test failure diagnosis and categorization

1 participant