What needs to happen?
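The CombinePerKeyPrecombineOperator in the TypeScript SDK uses a suboptimal cache eviction strategy. When the number of cached groups exceeds maxKeys, it flushes entries in insertion order, regardless of how recently each key was used. The code comment describes the flush as random, but the entries evicted are simply the oldest-inserted ones, so a hot key that was inserted early is flushed before a cold key that was inserted later.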
Current behavior (lines 485-488 in operators.ts):
if (this.groups.size > this.maxKeys) {
// Flush a random 10% of the map to make more room.
// TODO: Tune this, or better use LRU or ARC for this cache.
return this.flush(this.maxKeys * 0.9);
}
Proposed change:
Implement LRU (least recently used) cache eviction, similar to what was done for CachingStateProvider in PR #37214. With LRU eviction, frequently accessed keys stay in the cache and the least recently used keys are evicted first.
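A minimal sketch of what LRU eviction could look like, for illustration only. It is not the actual operator code or the approach taken in PR #37214. The class name LruCache and its methods are hypothetical. It relies on the fact that a JavaScript Map iterates in insertion order, so deleting and re-inserting a key on every access keeps the least recently used key at the front. The 0.9 factor mirrors the existing flush target.

```typescript
// Hypothetical sketch: LruCache is an illustrative name, not part of the Beam SDK.
class LruCache<K, V> {
  // Map preserves insertion order, so the first key is the least recently used
  // as long as every access re-inserts the key at the end.
  private readonly entries = new Map<K, V>();

  constructor(private readonly maxKeys: number) {}

  // Returns the cached value and marks the key as most recently used.
  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  // Inserts or updates a key. Returns the evicted entries (least recently
  // used first) so the caller could emit them downstream.
  set(key: K, value: V): Array<[K, V]> {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxKeys) {
      // Same target as the existing code: shrink to 90% of maxKeys.
      return this.evict(Math.floor(this.maxKeys * 0.9));
    }
    return [];
  }

  size(): number {
    return this.entries.size;
  }

  // Removes least recently used entries until at most `target` remain.
  private evict(target: number): Array<[K, V]> {
    const evicted: Array<[K, V]> = [];
    while (this.entries.size > target) {
      const oldest = this.entries.keys().next();
      if (oldest.done) {
        break;
      }
      const key = oldest.value;
      evicted.push([key, this.entries.get(key) as V]);
      this.entries.delete(key);
    }
    return evicted;
  }
}

// Usage: "a" is inserted first but read recently, so "b" is evicted instead.
const demo = new LruCache<string, number>(2);
demo.set("a", 1);
demo.set("b", 2);
demo.get("a");
const flushed = demo.set("c", 3); // evicts down to floor(2 * 0.9) = 1 entry
console.log(flushed); // [["b", 2], ["a", 1]]
```

With insertion-order eviction, "a" would have been flushed first even though it was the most recently read of the two older keys. In the real operator, the evicted entries would need to be emitted as partial combine results rather than dropped.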
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
- Component: Typescript SDK