Skip to content

Optimise CleaningThreadLocal orphan cleanup by caching Thread.isAlive() results#842

Open
peter-lawrey wants to merge 1 commit into
developfrom
adv/optimise-isAlive
Open

Optimise CleaningThreadLocal orphan cleanup by caching Thread.isAlive() results#842
peter-lawrey wants to merge 1 commit into
developfrom
adv/optimise-isAlive

Conversation

@peter-lawrey
Copy link
Copy Markdown
Member

@peter-lawrey peter-lawrey commented Mar 16, 2026

Reduce repeated Thread.isAlive() calls during CleaningThreadLocal orphan cleanup by caching liveness per thread for the duration of a cleanup pass.

Problem

doCleanupNonCleaningThreads() iterates over nonCleaningThreadValues, where multiple entries can belong to the same thread. Before this change, Thread.isAlive() was called for every entry, which can turn cleanup into an effective O(n * m) repetition of a non-trivial/native operation when many values are associated with fewer threads.

In practice this showed up as timeout pressure in tests, with stack traces spending significant time in Thread.isAlive() during orphan tracking cleanup.

Change

This change introduces a per-pass IdentityHashMap<Thread, Boolean> to memoise thread liveness:

  • compute isAlive() at most once per Thread during a cleanup pass
  • reuse the cached result for subsequent entries for the same thread
  • preserve the existing cleanup/removal behaviour for dead threads

Using IdentityHashMap keeps the cache aligned with thread identity semantics.

Why

This improves efficiency by:

  • avoiding repeated native/non-trivial isAlive() calls
  • reducing duplicated work when multiple thread-local values map to the same thread
  • lowering the risk of pathological cleanup costs and related test timeouts

Behavioural impact

No intended functional change.
The cleanup logic is unchanged; this is a performance optimisation within a single cleanup pass.

Motivation

This change is motivated by observed timeout failures such as:

  • TestTimedOutException: test timed out after 60 seconds
  • stack traces repeatedly blocked in java.lang.Thread.isAlive(...)
  • hot path: CleaningThreadLocal.doCleanupNonCleaningThreads(...)

Implementation notes

  • cache scope is limited to one cleanup pass
  • thread liveness is evaluated lazily via computeIfAbsent
  • thread keys are compared by identity, not equality

Expected outcome

More predictable orphan-tracking cleanup performance, especially in cases where many tracked values reference the same set of threads.

@peter-lawrey peter-lawrey requested a review from tgd March 16, 2026 10:29
@sonarqubecloud
Copy link
Copy Markdown

@peter-lawrey peter-lawrey changed the title On some systems, many repeated calls to isAlive were causing timeouts Optimise CleaningThreadLocal orphan cleanup by caching Thread.isAlive() results Mar 16, 2026
return true;

// cache isAlive results so it doesn't become O(N * M) on a costly operation.
Map<Thread, Boolean> isThreadAlive = new IdentityHashMap<>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cache need to be hoisted up a level?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants