Remove data spilling that acts on target threshold #8820

Open
@fjetter

We currently have two distinct spill-to-disk mechanisms:

  1. If a task completes while the worker's memory is above target, the result is written to disk immediately, i.e. this is basically a LIFO policy. Note: this functionality is implemented in zict.
  2. If, during a recurring (i.e. PeriodicCallback) check, the worker's memory is above spill, we start writing data to disk based on a simple LRU policy.
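
To make the difference between the two triggers concrete, here is a toy model. It follows the description above, not the actual distributed or zict code; `ToyWorkerData`, its method names, and the threshold defaults are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class ToyWorkerData:
    """Toy stand-in for a worker's data store (not the real implementation)."""

    def __init__(self, memory_limit, target=0.6, spill=0.7):
        self.memory_limit = memory_limit
        self.target = target  # fraction of memory_limit, assumed value
        self.spill = spill    # fraction of memory_limit, assumed value
        self.fast = OrderedDict()  # key -> size, least recently used first
        self.slow = {}             # key -> size, stands in for "on disk"

    def managed(self):
        """Total size of the data currently held in memory."""
        return sum(self.fast.values())

    def on_task_finished(self, key, size):
        """Mechanism 1: checked synchronously when a result is stored."""
        if self.managed() + size > self.target * self.memory_limit:
            # Above target: the newest result goes straight to disk (LIFO-like).
            self.slow[key] = size
        else:
            self.fast[key] = size

    def access(self, key):
        """Mark an in-memory key as recently used."""
        self.fast.move_to_end(key)

    def periodic_check(self, process_memory):
        """Mechanism 2: runs on a timer and evicts least recently used keys."""
        evicted = []
        while self.fast and process_memory > self.spill * self.memory_limit:
            key, size = self.fast.popitem(last=False)
            self.slow[key] = size
            process_memory -= size
            evicted.append(key)
        return evicted


w = ToyWorkerData(memory_limit=100)
w.on_task_finished("a", 30)
w.on_task_finished("b", 20)
w.on_task_finished("c", 20)  # would exceed target, so "c" is spilled at once
w.access("a")                # "b" is now the least recently used key
evicted = w.periodic_check(process_memory=80)
```

In this toy run the target check spills the freshly produced `"c"`, while the periodic check spills the stale `"b"`, so the two mechanisms pick opposite ends of the recency order.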

I believe we should get rid of mechanism 1 entirely, i.e. deprecate the target threshold and disable implicit/automatic zict-based spilling.

First of all, this mechanism is confusing, and I believe very few people actually understand the difference between the two thresholds, let alone know which value to increase, when, and why.

More importantly, though, the two mechanisms follow contradictory eviction policies. While it's non-trivial to tell which policy is best, it is not helpful to have two contradictory policies live at the same time.

I suggest removing target rather than spill for a few reasons:

  • I think LRU is better than LIFO, especially for reducing memory pressure, since we hope to process a finished result again shortly after its task has finished
  • The background coroutine can counteract memory pressure while a task is running
  • The target mechanism adds significantly to code complexity because of how error handling is implemented (the mutable mapping interface is pretty leaky here, e.g. if serialization fails)
  • The background task decouples us to a large extent from zict, which opens up the possibility of much easier async spilling and finer control over what we spill and where (note: there is a ticket somewhere about async disk that already suggests dropping the mutable mapping interface entirely)
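
As a rough illustration of the last two points, a background spiller that does not go through a mutable mapping could look something like the sketch below. This is not a proposed implementation: `write_to_disk`, `spill_until_below`, and the use of `len` as the size function are all hypothetical, and the "disk" is just a dict.

```python
import asyncio
from collections import OrderedDict


async def write_to_disk(key, value, slow):
    """Hypothetical async writer; a real one would serialize and do file I/O."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    slow[key] = value


async def spill_until_below(fast, slow, threshold, sizeof=len):
    """Evict least recently used keys from `fast` until it fits under `threshold`."""
    spilled = []
    while fast and sum(sizeof(v) for v in fast.values()) > threshold:
        key = next(iter(fast))  # oldest entry comes first in the OrderedDict
        try:
            await write_to_disk(key, fast[key], slow)
        except Exception:
            # A failed write is handled right here and the key stays in
            # memory, instead of the error leaking out of __setitem__.
            break
        del fast[key]
        spilled.append(key)
    return spilled


async def main():
    fast = OrderedDict(a=b"x" * 40, b=b"y" * 30, c=b"z" * 20)
    slow = {}
    spilled = await spill_until_below(fast, slow, threshold=60)
    return spilled, list(fast), sorted(slow)


result = asyncio.run(main())
```

Because the eviction loop is an ordinary coroutine, serialization or write errors can be handled at the call site, and the choice of what to spill and where is no longer tied to a mapping's `__setitem__`.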


Labels: enhancement, memory, performance
