What needs to happen?
Problem
The CachingStateProvider in sdks/typescript/src/apache_beam/worker/state.ts currently places no bound on its cache size. Entries accumulate for as long as the worker runs, which can lead to memory issues in long-running workers. A TODO comment in the code indicates this needs to be addressed:
```typescript
// TODO: (Perf) Cache eviction.
```
Proposed Solution
Implement LRU (Least Recently Used) cache eviction:
- Add a configurable maxCacheSize parameter (default: 1000 entries)
- Evict the least recently used entry when the cache reaches capacity
- Maintain LRU order by moving accessed items to the end of the Map
Benefits
- Prevents unbounded memory growth in long-running workers
- Improves reliability for production workloads
- Configurable cache size allows tuning based on use case
Implementation Details
- Use JavaScript Map's insertion order to track access order
- When cache is full, remove the first (oldest) entry
- On cache hit, move entry to end (most recently used)
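The three steps above could be sketched roughly as follows. This is a minimal, standalone illustration and not the actual CachingStateProvider code: the LruCache class and its get/set methods are hypothetical names introduced here, and only maxCacheSize and its default of 1000 come from the proposal. It relies on the fact that a JavaScript Map iterates keys in insertion order, so deleting and re-inserting a key moves it to the end.

```typescript
// Hypothetical helper for illustration; not part of the Beam TypeScript SDK.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxCacheSize: number;

  constructor(maxCacheSize: number = 1000) {
    if (!Number.isInteger(maxCacheSize) || maxCacheSize < 1) {
      throw new Error("maxCacheSize must be a positive integer");
    }
    this.maxCacheSize = maxCacheSize;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Cache hit: re-insert so the key moves to the end (most recently used).
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      // Overwriting an existing key also counts as a use.
      this.entries.delete(key);
    } else if (this.entries.size >= this.maxCacheSize) {
      // Cache full: the first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage: with capacity 2, touching "a" makes "b" the eviction candidate.
const cache = new LruCache<string, number>(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");
cache.set("c", 3); // evicts "b"
```

Both get and set stay O(1) with this approach, at the cost of a delete plus re-insert on every hit. If the real cache stores promises or keys entries by a serialized state key, the same ordering trick should still apply, but that would need to be checked against state.ts.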
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner