@mbolaris mbolaris commented Jan 5, 2026

Introduce a per-thread regex Matcher cache in HlsPlaylistParser to significantly reduce object allocation overhead during playlist parsing.

Summary

Cache and reuse regex Matcher objects to prevent OutOfMemoryError (OOM) and Application Not Responding (ANR) issues during HLS playlist parsing.

Problem:
Each Matcher object allocates memory in BOTH Java heap and native heap, creating two critical issues in production:

  1. Native heap exhaustion: Native allocations are substantial and not subject to normal Java GC pressure. When Matcher objects are created faster than they're garbage collected, the native heap can be exhausted even when Java heap has space available, causing OutOfMemoryError in the native allocator.

  2. GC-induced ANRs: Excessive Matcher allocation causes frequent GC cycles. This is particularly severe with multiple concurrent playback sessions on lower-performance devices, where sustained GC pressure from thousands of short-lived Matcher objects causes Application Not Responding (ANR) events.

Both issues are exacerbated by frequent HLS playlist refreshes (every 2-6 seconds), creating continuous allocation pressure.
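
For illustration, the baseline allocation pattern being replaced looks roughly like the sketch below. The class, method, and regex are hypothetical stand-ins, not code taken from HlsPlaylistParser:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch (not the actual Media3 code) of per-call Matcher
// allocation: every parsed tag line creates a fresh Matcher.
final class NaiveParsing {
  private static final Pattern DURATION = Pattern.compile("#EXTINF:([\\d.]+)");

  static double parseDuration(String line) {
    // Each call allocates a new Matcher, which consumes both Java-heap
    // and native-heap memory. With playlist refreshes every 2-6 seconds
    // across multiple streams, this yields millions of short-lived objects.
    Matcher matcher = DURATION.matcher(line);
    if (matcher.lookingAt()) {
      return Double.parseDouble(matcher.group(1));
    }
    return -1;
  }

  private NaiveParsing() {}
}
```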

Solution:

  • Uses ThreadLocal for lock-free per-thread isolation
  • Employs access-ordered LinkedHashMap as LRU cache (max 32 entries)
  • Reuses Matcher objects via reset() instead of creating new instances
  • Eliminates both Java heap AND native heap allocation pressure
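
The bullets above could be sketched roughly as follows. This is a minimal illustration of the described technique, with hypothetical names (MatcherCache, MAX_ENTRIES), not the actual patch:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of a per-thread Matcher cache: a ThreadLocal holds
// an access-ordered LinkedHashMap used as an LRU cache, and cached
// Matchers are rebound to new input via reset().
final class MatcherCache {
  private static final int MAX_ENTRIES = 32;

  // ThreadLocal gives lock-free per-thread isolation; no synchronization
  // is needed because each thread only ever touches its own map.
  private static final ThreadLocal<LinkedHashMap<Pattern, Matcher>> CACHE =
      ThreadLocal.withInitial(
          () ->
              new LinkedHashMap<Pattern, Matcher>(
                  MAX_ENTRIES, 0.75f, /* accessOrder= */ true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Pattern, Matcher> eldest) {
                  // Evict the least-recently-used entry past the cap.
                  return size() > MAX_ENTRIES;
                }
              });

  /** Returns a Matcher for {@code pattern} over {@code input}, reusing cached instances. */
  static Matcher matcher(Pattern pattern, CharSequence input) {
    LinkedHashMap<Pattern, Matcher> cache = CACHE.get();
    Matcher matcher = cache.get(pattern);
    if (matcher == null) {
      matcher = pattern.matcher(input); // first use on this thread: allocate once
      cache.put(pattern, matcher);
    } else {
      matcher.reset(input); // reuse: no new Java-heap or native-heap allocation
    }
    return matcher;
  }

  private MatcherCache() {}
}
```

A call site would replace `pattern.matcher(line)` with `MatcherCache.matcher(pattern, line)`; because loader threads parse one playlist at a time, a single cached Matcher per Pattern per thread is safe.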

Performance impact:

  • Reduces Matcher allocations by >99% in high-demand scenarios
    (e.g., live TV with large caches or multiple concurrent streams can
    generate 20M+ allocations, reduced to ~37K allocations + 19.9M reuses)
  • Eliminates native heap exhaustion risk from Matcher object churn
  • Drastically reduces GC frequency and duration, preventing ANRs
  • Removes dependency on GC timing for native memory reclamation
  • Typical cache occupancy: 6-12 patterns (well under 32 limit)
  • Zero synchronization overhead through thread-local storage

Testing:

  • Validated over 2+ hours with production HLS streams
  • 99.82% reuse rate across 3,692 loader threads
  • No native memory allocation errors observed
  • Significant reduction in GC events during multi-stream playback
  • No functional changes to parsing behavior
  • All existing tests pass

google-cla bot commented Jan 5, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

mbolaris commented Jan 6, 2026

@googlebot I signed the CLA.

@marcbaechinger marcbaechinger self-requested a review January 6, 2026 11:25
@marcbaechinger marcbaechinger self-assigned this Jan 6, 2026