Open
Opened on Jul 16, 2024
Currently, we make eviction decisions based on access time, and have a "hot set" based on what ends up resident as a result of that atime-driven eviction.
This sometimes produces sub-optimal outcomes:
- A delta that has been covered by an image layer is usually not useful unless some branchpoint can still read it, but we keep these layers on disk anyway until their atime hits a threshold.
- Secondary locations download such deltas even though they're not going to be needed when we fail over a tenant to the secondary location.
- We lack a clear metric for how much data we would like to have on disk for a tenant in order to satisfy read performance goals (i.e. fast reads for the data visible to existing branches): resident size can be either an over-estimate or an under-estimate.
In this epic, we add a concept of "visibility" to layers, where visibility means that we might need this layer to service a getpage request. This does not need to be always accurate because it is a heuristic, but it needs to have some properties we can rely on:
- Once a layer is read, we mark it visible
- When a layer is covered during compaction, we update its visibility immediately to make it a priority target for eviction
- Across a restart, we should recover an accurate view of visibility so that we don't do things like thrashing secondary locations' ideas of visible sets
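As a rough illustration of the first two properties, here is a minimal sketch in Rust. Everything in it is hypothetical: `Visibility`, `LayerState`, `LayerMap` and their methods are names invented for this example, not the pageserver's actual types, and persistence across restarts is not modelled. It also shows how a "visible size" figure could serve as the on-disk target that resident size fails to provide.

```rust
use std::collections::HashMap;

/// Hypothetical visibility hint for a layer (illustrative names only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Visibility {
    /// The layer might be needed to service a getpage request.
    Visible,
    /// The layer was covered during compaction and is not expected to be read.
    Covered,
}

#[derive(Debug)]
struct LayerState {
    visibility: Visibility,
    size_bytes: u64,
}

/// Hypothetical per-timeline bookkeeping, keyed by layer name.
#[derive(Debug, Default)]
struct LayerMap {
    layers: HashMap<String, LayerState>,
}

impl LayerMap {
    /// New layers start out visible in this sketch.
    fn insert(&mut self, name: &str, size_bytes: u64) {
        let state = LayerState { visibility: Visibility::Visible, size_bytes };
        self.layers.insert(name.to_string(), state);
    }

    /// Once a layer is read, mark it visible, even if we believed it was covered.
    fn on_read(&mut self, name: &str) {
        if let Some(layer) = self.layers.get_mut(name) {
            layer.visibility = Visibility::Visible;
        }
    }

    /// Compaction covered this layer: demote it immediately.
    fn on_covered(&mut self, name: &str) {
        if let Some(layer) = self.layers.get_mut(name) {
            layer.visibility = Visibility::Covered;
        }
    }

    /// Eviction candidates, covered layers first, then by name for determinism.
    fn eviction_order(&self) -> Vec<&str> {
        let mut entries: Vec<(&str, Visibility)> = self
            .layers
            .iter()
            .map(|(name, layer)| (name.as_str(), layer.visibility))
            .collect();
        // `false` sorts before `true`, so covered layers come first.
        entries.sort_by_key(|&(name, vis)| (vis == Visibility::Visible, name));
        entries.into_iter().map(|(name, _)| name).collect()
    }

    /// Total size of visible layers: a candidate "how much should be on disk" metric.
    fn visible_size(&self) -> u64 {
        self.layers
            .values()
            .filter(|layer| layer.visibility == Visibility::Visible)
            .map(|layer| layer.size_bytes)
            .sum()
    }
}
```

In this sketch a real eviction policy would presumably still fall back to atime among visible layers; only the "covered first" priority is shown.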
When we implement timeline archival, archived timelines' branchpoints should not contribute to visibility of layers.
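The archival rule could be sketched as follows, under simplifying assumptions: key ranges are ignored, a delta spans the half-open LSN range `[delta_start, delta_end)`, and the covering image layer sits at `image_lsn`. `Branchpoint` and `covered_delta_is_visible` are hypothetical names for illustration, not existing pageserver code.

```rust
/// Hypothetical description of a child timeline's branchpoint.
struct Branchpoint {
    /// LSN at which the child timeline branched off.
    lsn: u64,
    /// Whether the child timeline has been archived.
    archived: bool,
}

/// Decide whether a delta that has been covered by an image layer at
/// `image_lsn` should still count as visible.
///
/// Simplified rule: the delta stays visible only if some non-archived
/// branchpoint reads at an LSN that reaches into the delta (`>= delta_start`)
/// but is too old to use the image (`< image_lsn`).
fn covered_delta_is_visible(
    delta_start: u64,
    image_lsn: u64,
    branchpoints: &[Branchpoint],
) -> bool {
    branchpoints
        .iter()
        .any(|b| !b.archived && b.lsn >= delta_start && b.lsn < image_lsn)
}
```

With a delta starting at LSN 100 and an image at LSN 200, a live branchpoint at 150 keeps the delta visible, while the same branchpoint on an archived timeline does not.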