Optimize HLS playlist parsing by caching regex Matchers #2986
+48 −6
Introduce a per-thread regex Matcher cache in HlsPlaylistParser to significantly reduce object allocation overhead during playlist parsing.
Summary
Cache and reuse regex Matcher objects to prevent OOM and ANR issues during HLS playlist parsing.
Problem:
Each Matcher object allocates memory on both the Java heap and the native heap, creating two critical issues in production:
Native heap exhaustion: Native allocations are substantial and not subject to normal Java GC pressure. When Matcher objects are created faster than they're garbage collected, the native heap can be exhausted even when Java heap has space available, causing OutOfMemoryError in the native allocator.
GC-induced ANRs: Excessive Matcher allocation causes frequent GC cycles. This is particularly severe with multiple concurrent playback sessions on lower-performance devices, where sustained GC pressure from thousands of short-lived Matcher objects causes Application Not Responding (ANR) events.
Both issues are exacerbated by frequent HLS playlist refreshes (every 2-6 seconds), creating continuous allocation pressure.
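To illustrate the allocation pattern described above, here is a minimal sketch of a typical regex-based parsing hot path (the pattern name and helper are illustrative, not the actual ExoPlayer code): even though the Pattern is compiled once and reused, every call still creates a new Matcher, and on Android each Matcher also carries native ICU regex state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParseExample {
  // The Pattern is a shared constant, but that alone does not avoid
  // per-call allocation: Pattern.matcher() creates a fresh Matcher
  // (and, on Android, associated native regex state) on every call.
  private static final Pattern MEDIA_SEQUENCE =
      Pattern.compile("#EXT-X-MEDIA-SEQUENCE:(\\d+)");

  /** Returns the media sequence number from an HLS tag line, or -1. */
  static long parseMediaSequence(String line) {
    Matcher matcher = MEDIA_SEQUENCE.matcher(line); // new Matcher per line
    if (matcher.matches()) {
      return Long.parseLong(matcher.group(1));
    }
    return -1;
  }
}
```

With playlists refreshed every few seconds and many tag lines per playlist, this per-line allocation is what accumulates into the pressure described above.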
Solution:
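The idea is a per-thread cache of Matcher objects keyed by Pattern: since Matcher is not thread-safe, a ThreadLocal map gives each parsing thread its own instances without synchronization, and Matcher.reset(CharSequence) rebinds a cached Matcher to new input without allocating. A minimal sketch of the approach (class and method names are illustrative, not the actual implementation in this PR):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Per-thread cache of Matcher objects, keyed by Pattern. Each thread keeps
 * its own map, so no synchronization is needed and Matchers (which are not
 * thread-safe) are never shared across threads.
 */
final class MatcherCache {
  private static final ThreadLocal<Map<Pattern, Matcher>> CACHE =
      ThreadLocal.withInitial(HashMap::new);

  private MatcherCache() {}

  /** Returns a cached Matcher for {@code pattern}, reset to {@code input}. */
  static Matcher get(Pattern pattern, CharSequence input) {
    Map<Pattern, Matcher> matchers = CACHE.get();
    Matcher matcher = matchers.get(pattern);
    if (matcher == null) {
      // First use of this pattern on this thread: allocate once, then reuse.
      matcher = pattern.matcher(input);
      matchers.put(pattern, matcher);
    } else {
      // Reuse the existing Matcher; reset() rebinds it to the new input
      // without creating a new Matcher object.
      matcher.reset(input);
    }
    return matcher;
  }
}
```

Call sites then replace `pattern.matcher(line)` with a lookup like `MatcherCache.get(pattern, line)`; since the number of distinct Patterns in the parser is small and fixed, each thread's map stays tiny while the Matcher allocation count drops to roughly one per pattern per thread.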
Performance impact:
In allocation-heavy scenarios (e.g., live TV with large caches or multiple concurrent streams), playlist parsing can generate 20M+ Matcher allocations; with the cache this drops to ~37K allocations plus ~19.9M reuses.
Testing: