@mbolaris mbolaris commented Jan 5, 2026

Introduce a per-thread regex Matcher cache in HlsPlaylistParser to significantly reduce object allocation overhead during playlist parsing.

Summary

Cache and reuse regex Matcher objects to prevent OutOfMemoryError (OOM) and Application Not Responding (ANR) issues during HLS playlist parsing.

Problem:
Each Matcher object allocates memory in BOTH Java heap and native heap, creating two critical issues in production:

  1. Native heap exhaustion: Native allocations are substantial and not subject to normal Java GC pressure. When Matcher objects are created faster than they're garbage collected, the native heap can be exhausted even when Java heap has space available, causing OutOfMemoryError in the native allocator.

  2. GC-induced ANRs: Excessive Matcher allocation causes frequent GC cycles. This is particularly severe with multiple concurrent playback sessions on lower-performance devices, where sustained GC pressure from thousands of short-lived Matcher objects causes Application Not Responding (ANR) events.

Both issues are exacerbated by frequent HLS playlist refreshes (every 2-6 seconds), creating continuous allocation pressure.
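
For illustration, the baseline allocation pattern being replaced looks roughly like the sketch below. The class, method, and regex are hypothetical stand-ins, not code taken from HlsPlaylistParser:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch (not the actual Media3 code) of per-call Matcher
// allocation: every parsed tag line creates a fresh Matcher.
final class NaiveParsing {
  private static final Pattern DURATION = Pattern.compile("#EXTINF:([\\d.]+)");

  static double parseDuration(String line) {
    // Each call allocates a new Matcher, which consumes both Java-heap
    // and native-heap memory. With playlist refreshes every 2-6 seconds
    // across multiple streams, this yields millions of short-lived objects.
    Matcher matcher = DURATION.matcher(line);
    if (matcher.lookingAt()) {
      return Double.parseDouble(matcher.group(1));
    }
    return -1;
  }

  private NaiveParsing() {}
}
```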

Solution:

  • Uses ThreadLocal for lock-free per-thread isolation
  • Employs access-ordered LinkedHashMap as LRU cache (max 32 entries)
  • Reuses Matcher objects via reset() instead of creating new instances
  • Eliminates both Java heap AND native heap allocation pressure
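
The bullets above could be sketched roughly as follows. This is a minimal illustration of the described technique, with hypothetical names (MatcherCache, MAX_ENTRIES), not the actual patch:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of a per-thread Matcher cache: a ThreadLocal holds
// an access-ordered LinkedHashMap used as an LRU cache, and cached
// Matchers are rebound to new input via reset().
final class MatcherCache {
  private static final int MAX_ENTRIES = 32;

  // ThreadLocal gives lock-free per-thread isolation; no synchronization
  // is needed because each thread only ever touches its own map.
  private static final ThreadLocal<LinkedHashMap<Pattern, Matcher>> CACHE =
      ThreadLocal.withInitial(
          () ->
              new LinkedHashMap<Pattern, Matcher>(
                  MAX_ENTRIES, 0.75f, /* accessOrder= */ true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Pattern, Matcher> eldest) {
                  // Evict the least-recently-used entry past the cap.
                  return size() > MAX_ENTRIES;
                }
              });

  /** Returns a Matcher for {@code pattern} over {@code input}, reusing cached instances. */
  static Matcher matcher(Pattern pattern, CharSequence input) {
    LinkedHashMap<Pattern, Matcher> cache = CACHE.get();
    Matcher matcher = cache.get(pattern);
    if (matcher == null) {
      matcher = pattern.matcher(input); // first use on this thread: allocate once
      cache.put(pattern, matcher);
    } else {
      matcher.reset(input); // reuse: no new Java-heap or native-heap allocation
    }
    return matcher;
  }

  private MatcherCache() {}
}
```

A call site would replace `pattern.matcher(line)` with `MatcherCache.matcher(pattern, line)`; because loader threads parse one playlist at a time, a single cached Matcher per Pattern per thread is safe.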

Performance impact:

  • Reduces Matcher allocations by >99% in high-demand scenarios
    (e.g., live TV with large caches or multiple concurrent streams can
    generate 20M+ allocations, reduced to ~37K allocations + 19.9M reuses)
  • Eliminates native heap exhaustion risk from Matcher object churn
  • Drastically reduces GC frequency and duration, preventing ANRs
  • Removes dependency on GC timing for native memory reclamation
  • Typical cache occupancy: 6-12 patterns (well under 32 limit)
  • Zero synchronization overhead through thread-local storage

Testing:

  • Validated over 2+ hours with production HLS streams
  • 99.82% reuse rate across 3,692 loader threads
  • No native memory allocation errors observed
  • Significant reduction in GC events during multi-stream playback
  • No functional changes to parsing behavior
  • All existing tests pass

google-cla bot commented Jan 5, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

mbolaris commented Jan 6, 2026

@googlebot I signed the CLA.

@marcbaechinger marcbaechinger self-requested a review January 6, 2026 11:25
@marcbaechinger marcbaechinger self-assigned this Jan 6, 2026