Add resilience to UI tests for frozen/unresponsive apps #34023

Open
PureWeen wants to merge 33 commits into main from uitest-resilience-frozen-apps

Conversation

@PureWeen
Member

Summary

When an app enters an infinite loop (e.g., layout cycle on iOS), WDA (WebDriverAgent) blocks forever waiting for the main thread to become idle. This causes Appium commands to hang indefinitely, which in turn hangs the entire test run.

This PR adds resilience to prevent test hangs and allows graceful failure with clear error messages.

Changes

HelperExtensions.cs

  • Add RunWithTimeout wrapper (45s hard limit) around all Appium commands
  • Wrap Tap(), Click(), and Wait() query calls with timeout
  • Uses CancellationTokenSource for proper cleanup hygiene
  • Logs warning about orphaned threads when timeout occurs
  • Throws clear TimeoutException when app is unresponsive
  • Uses Task.Run + GetAwaiter().GetResult() to properly unwrap exceptions
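
The bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in HelperExtensions.cs: the class name `TimeoutSketch` and the `DefaultTimeout` field are hypothetical, and `Task.WaitAny` is used here (instead of the `task.Wait` call visible in the review snippet) so that a faulted command is rethrown by `GetAwaiter().GetResult()` as its original exception rather than as an `AggregateException`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutSketch
{
    // Hypothetical default mirroring the 45s hard limit described above.
    public static readonly TimeSpan DefaultTimeout = TimeSpan.FromSeconds(45);

    public static T RunWithTimeout<T>(Func<T> command, TimeSpan? timeout = null)
    {
        TimeSpan limit = timeout ?? DefaultTimeout;
        using var cts = new CancellationTokenSource();

        // Run the blocking call on a pool thread so the caller can stop waiting.
        Task<T> task = Task.Run(command, cts.Token);

        // WaitAny returns -1 on timeout and does not throw if the task faulted.
        if (Task.WaitAny(new Task[] { task }, limit) == -1)
        {
            // The blocked call will not notice this, but signal cancellation anyway.
            cts.Cancel();
            Console.Error.WriteLine(
                "Warning: command timed out; its background thread may remain blocked.");
            throw new TimeoutException(
                $"An Appium command did not complete within {limit.TotalSeconds}s. " +
                "The application may be unresponsive (e.g., due to an infinite layout loop).");
        }

        // Rethrows the command's own exception (unwrapped) or returns its result.
        return task.GetAwaiter().GetResult();
    }
}
```

A caller would wrap a blocking driver call, for example `TimeoutSketch.RunWithTimeout(() => element.Text)`, where `element` stands in for whatever Appium object the test holds.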

AppiumLifecycleActions.cs

  • Add ForceCloseApp command using OS-level termination:
    • iOS: xcrun simctl terminate
    • Android: adb shell am force-stop
    • Mac: osascript quit
  • Wrap CloseApp with a 15s timeout that automatically falls back to ForceCloseApp
  • Bypasses WDA when normal Appium termination hangs
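
As a rough illustration of the OS-level termination path, the sketch below maps a platform name to the commands listed above and runs them with `System.Diagnostics.Process`. Everything here is an assumption made for the example: the `ForceCloseSketch` class, its method names, the `"booted"` simulator alias, and the exact AppleScript text are not taken from AppiumLifecycleActions.cs.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;

public static class ForceCloseSketch
{
    // Returns the executable and argument list for an OS-level termination.
    // appId is the bundle id (iOS/Mac) or package name (Android).
    public static (string FileName, string[] Arguments) BuildCommand(
        string platform, string appId, string deviceId = "booted")
    {
        switch (platform.ToLowerInvariant())
        {
            case "ios":
                return ("xcrun", new[] { "simctl", "terminate", deviceId, appId });
            case "android":
                return ("adb", new[] { "shell", "am", "force-stop", appId });
            case "mac":
                return ("osascript", new[] { "-e", $"tell application id \"{appId}\" to quit" });
            default:
                throw new NotSupportedException($"No force-close command for '{platform}'.");
        }
    }

    // Runs the command and reports whether it exited cleanly within the wait period.
    public static bool TryForceClose(string platform, string appId, int waitMs = 10000)
    {
        var (fileName, arguments) = BuildCommand(platform, appId);
        var startInfo = new ProcessStartInfo(fileName) { UseShellExecute = false };
        foreach (string argument in arguments)
            startInfo.ArgumentList.Add(argument); // avoids manual shell quoting

        try
        {
            using Process? process = Process.Start(startInfo);
            return process != null && process.WaitForExit(waitMs) && process.ExitCode == 0;
        }
        catch (Win32Exception)
        {
            // The tool (xcrun, adb, osascript) is not installed or not on PATH.
            return false;
        }
    }
}
```

Because none of these commands go through the Appium session, they can still succeed when WDA is blocked on the app's main thread.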

UITestBase.cs

  • Catch TimeoutException for unresponsive apps in TearDown
  • Force-terminate and reset session when app freezes
  • Allows subsequent tests to continue instead of hanging forever

Result

Before: Tests hang indefinitely when app freezes, blocking entire CI run

After: Tests fail gracefully after 45s with clear error message:

TimeoutException: An Appium command did not complete within 45s. 
The application may be unresponsive (e.g., due to an infinite layout loop).

Testing

Tested with the Issue32586 test case, which triggers an infinite layout cycle on iOS:

  • Test correctly times out after 45s (previously hung forever)
  • Clear error message indicates app is unresponsive
  • Subsequent tests can continue (session is reset via ForceCloseApp)

Notes

  • The 45s timeout is long enough to avoid false positives on slow CI (normal operations take 10-30s) while still failing reasonably fast when the app is truly frozen
  • Thread leak is acceptable: background Task.Run threads may remain blocked after timeout, but ForceCloseApp kills the process which unblocks the socket
  • Debug logging helps diagnose issues in CI

Copilot AI review requested due to automatic review settings February 12, 2026 17:56
@PureWeen PureWeen added the area-ai-agents Copilot CLI agents, agent skills, AI-assisted development label Feb 12, 2026
@PureWeen
Member Author

/azp run maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

Copilot AI left a comment

Pull request overview

This PR improves the reliability of the Appium-based UI test infrastructure by preventing indefinite hangs when the app under test becomes frozen/unresponsive (e.g., iOS layout cycles that block WDA), and by adding a force-termination fallback path.

Changes:

  • Add a hard timeout wrapper around select Appium UI interactions and polling queries to fail fast instead of hanging indefinitely.
  • Add a new forceCloseApp command that bypasses Appium/WDA by terminating the app via OS-level commands, and use it as a fallback when closeApp hangs/fails.
  • Update NUnit TearDown handling to attempt force-close/reset when unresponsive timeouts are detected.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

File | Description
src/TestUtils/src/UITest.NUnit/UITestBase.cs | Adds TearDown handling intended to force-close/reset when the app becomes unresponsive.
src/TestUtils/src/UITest.Appium/HelperExtensions.cs | Introduces RunWithTimeout and applies it to Tap/Click and to the internal Wait(...) polling query calls.
src/TestUtils/src/UITest.Appium/Actions/AppiumLifecycleActions.cs | Adds forceCloseApp and wraps closeApp with a 15s timeout + fallback to force-close.

Comment on lines +2938 to +2942
if (!task.Wait(timeout.Value))
{
    // Signal cancellation (Appium driver won't respect it, but good hygiene)
    cts.Cancel();


Copilot AI Feb 12, 2026

On timeout, this code throws while the background task continues running and is never observed. If the underlying Appium call later faults, it can surface as an unobserved task exception (noise/flakiness). Consider attaching a continuation in the timeout path to observe/log task.Exception (OnlyOnFaulted) before throwing.
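
One way to read this suggestion is sketched below, assuming a hypothetical helper named `ObserveFaults` that the timeout path would call just before throwing. Reading `task.Exception` inside the continuation is what marks the exception as observed; the `log` callback is a stand-in for whatever logging the test infrastructure uses.

```csharp
using System;
using System.Threading.Tasks;

public static class OrphanedTaskSketch
{
    // Attaches a continuation that runs only if the abandoned task later faults.
    // Accessing Exception there observes it, so it cannot resurface as an
    // unobserved task exception.
    public static Task ObserveFaults(Task abandoned, Action<Exception> log)
    {
        return abandoned.ContinueWith(
            t =>
            {
                if (t.Exception != null)
                    log(t.Exception.GetBaseException());
            },
            TaskContinuationOptions.OnlyOnFaulted);
    }
}
```

If the abandoned task completes without faulting, the returned continuation task ends up canceled rather than run, so callers should not wait on it unconditionally.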

kubaflo previously approved these changes Feb 13, 2026
@PureWeen
Member Author

/azp run maui-pr-uitests, maui-pr-devicetests

@PureWeen PureWeen added this to the .NET 10 SR5 milestone Feb 17, 2026
@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@PureWeen PureWeen moved this from Todo to Approved in MAUI SDK Ongoing Feb 17, 2026
@PureWeen PureWeen force-pushed the uitest-resilience-frozen-apps branch from 60d8eed to 3c5fc28 Compare February 18, 2026 16:53
@PureWeen
Member Author

/azp run maui-pr-uitests, maui-pr-devicetests

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

kubaflo previously approved these changes Feb 18, 2026
@PureWeen
Member Author

/azp run maui-pr-uitests, maui-pr-devicetests

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@PureWeen
Member Author

/azp run maui-pr-uitests, maui-pr-devicetests

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@PureWeen
Member Author

/azp run

@PureWeen
Member Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

Relocate from copilot-instructions.md to uitests.instructions.md since
these commands are only relevant in the UI testing context.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Member Author

/azp run maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

…outs

- Switch from VerticalStackLayout to Grid with star row so RefreshView
  gets remaining screen space. VerticalStackLayout gave ScrollView only
  its content height, preventing Android SwipeRefreshLayout from
  detecting pull-to-refresh gesture.
- Bump Issue1905, Issue3275 catalyst timeouts from 90s to 120s (CI is
  slower than local — 33s local, needs headroom for CI).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Member Author

/azp run maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@PureWeen
Member Author

/azp run maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@PureWeen
Member Author

/azp run maui-pr-uitests
/azp run maui-pr-devicetests

@azure-pipelines

No pipelines are associated with this pull request.

@PureWeen
Member Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@PureWeen
Member Author

/azp run maui-pr-uitests, maui-pr-devicetests

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

PureWeen and others added 2 commits February 22, 2026 21:18
…e RunWithTimeout from Wait()

- Issue1905: Reduce ListView from 1000 to 20 items, reduce refresh delay from 5s to 1s
- Issue3275: Rewrite as self-verifying test with 50 items (was 500), no multi-page navigation
- Issue3001: Reduce nested Grid from 5 levels (1024 labels) to 3 levels (64 labels)
- Wait(): Remove RunWithTimeout wrapping to match main branch behavior — prevents killing
  slow FindElement calls that the mac2 driver needs for accessibility tree walks.
  Frozen-app protection remains on Tap/Click/GetText actions.

All 8 tests pass on iOS (2-40s each). Catalyst local environment is degraded but
the reduced accessibility trees should resolve CI catalyst failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…bility trees

Issue1905: Rewritten as self-verifying ContentPage. Calls BeginRefresh()
programmatically and checks RefreshCommand completes. Removed 1000-item
ListView that caused mac2 driver to choke on accessibility tree walks.

Issue3001: Reduced nested grid depth from 5 levels (1024 labels) to 3
levels (64 labels). Same test logic: tap Start, verify Ready label appears.

Issue3275: Rewritten as self-verifying ContentPage. Creates a ListView with
RecycleElement caching (50 items instead of 500), performs ScrollTo, nulls
BindingContext to verify no NRE. No navigation needed — the bug is triggered
by ScrollTo + null BindingContext, not by page navigation.

All 3 tests pass on iOS in under 10 seconds (previously 120s+ timeouts).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Member Author

/azp run maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Issue1905: Restored TestNavigationPage base class and
On<iOS>().SetPrefersLargeTitles(true) — the specific trigger for the bug.
Content page is pushed onto NavigationPage stack as in the original.

Issue3275: Restored navigation lifecycle (PushModalAsync/PopModalAsync),
ContextActions with command bindings on ViewCells, and BindingContext = null
in OnDisappearing. These are all part of the original NRE trigger path.

Issue3001: Increased maxLevel from 3 back to 4 (256 labels instead of 64)
for better sensitivity to performance regressions while still avoiding the
1024-label accessibility tree that caused catalyst timeouts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Member Author

/azp run maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@PureWeen
Member Author

/azp run maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Issue3275: Restore NavigationPage + PushAsync + TapBackArrow pattern
to preserve the original navigation lifecycle that triggers the NRE.
MainPage.OnAppearing detects successful return as the verification.

Issue16910: Make OnRunTestClicked async with Task.Yield() to let the
XAML TwoWay binding engine propagate values before checking, instead
of synchronous in-handler check that bypasses the binding pipeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Issue16910: Restore original Appium-driven verification pattern
(StartRefreshing/StopRefreshing buttons, grid.Remove/grid.Insert for
labels, Appium WaitForElement for visual tree verification). Only
change from original: ScrollView instead of CollectionView to avoid
catalyst accessibility tree bloat.

Issue3001: Fix wrong comment (was '3 levels' for maxLevel=4, now
correctly says '4 levels: 4^4 = 256 labels'). Tighten timeout from
45s to 15s to preserve performance regression detection.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Member Author

/azp run maui-pr, maui-pr-uitests

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

Labels

area-ai-agents Copilot CLI agents, agent skills, AI-assisted development

Projects

Status: Approved

2 participants