Task prefix collision causes weird behaviour in Computations #7787

@crusaderky

Description

Use case:

  1. manually submit a task that ends up in task group 1 and wait for it to finish
  2. manually submit a task that ends up in task group 2 and wait for it to finish
  3. manually submit another task that ends up in task group 1 and wait for it to finish

Expected behaviour

Either of the following behaviours is sensible:

  • You end up with two Computation objects, with the activity from (1) and (3) in the first Computation and the activity from (2) in the second Computation
  • You end up with three Computation objects, with the activity from each task attributed to each Computation (e.g. the initial Computation is not disturbed by task 3)
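To make the two acceptable outcomes concrete, here is a toy model of the attribution, not the scheduler's actual logic. The helper `toy_group` is a hypothetical stand-in for the key-to-task-group mapping (it simply groups a tuple key by its first element), and the two `attribute_*` functions are illustrative names only.

```python
def toy_group(key):
    """Hypothetical simplification of the key -> task group mapping:
    a tuple key is grouped by its first element, a string key by itself."""
    return key[0] if isinstance(key, tuple) else key


def attribute_per_group(keys):
    """Option 1: one computation per task group; a later task in the
    same group rejoins the computation that group already belongs to."""
    computations = {}
    for key in keys:
        computations.setdefault(toy_group(key), []).append(key)
    # dicts preserve insertion order, so computations come out in creation order
    return list(computations.values())


def attribute_per_submission(keys):
    """Option 2: every submission gets its own computation, so earlier
    computations are never touched again."""
    return [[key] for key in keys]


# The three submissions from the use case, in order
keys = [("x-123", 1), "y-456", ("x-123", 2)]
per_group = attribute_per_group(keys)
per_submission = attribute_per_submission(keys)
print(len(per_group), per_group)
print(len(per_submission), per_submission)
```

Under option 1 the model yields two computations, with both `x-123` tasks in the first; under option 2 it yields three, one task each. Neither matches the mixed outcome reported in this issue.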

Actual behaviour

Task 3 creates a third Computation object, and then contributes to the first one.

>>> x1 = client.submit(lambda: 1, key=("x-123", 1))
>>> x1.result()
>>> y = client.submit(lambda: 1, key="y-456")
>>> y.result()
>>> x2 = client.submit(lambda: 1, key=("x-123", 2))
>>> x2.result()
>>> t0 = s.computations[0].start
>>> [(c.start - t0, c.stop - t0 if c.stop > 0 else c.stop) for c in s.computations]
[(0.0, 24.953980445861816),
 (16.54815936088562, 16.555397033691406),
 (24.947876453399658, -1)]
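
The problem can be read straight off these numbers. The sketch below copies the reported `(start, stop)` offsets and flags the two symptoms; `find_anomalies` is a hypothetical helper written for this illustration, and the reading of `-1` as "no stop recorded" is an assumption based on how the snippet above prints a non-positive `c.stop` unshifted.

```python
# (start, stop) offsets copied from the output above.
# Assumption: stop == -1 means the computation recorded no stop time.
reported = [
    (0.0, 24.953980445861816),
    (16.54815936088562, 16.555397033691406),
    (24.947876453399658, -1),
]


def find_anomalies(spans):
    """Flag computations with no recorded stop, and computations that
    are still active when a later computation starts."""
    notes = []
    for i, (start, stop) in enumerate(spans):
        if stop < 0:
            notes.append(f"computation {i} never recorded a stop")
        for j in range(i + 1, len(spans)):
            later_start = spans[j][0]
            if stop > later_start:
                notes.append(
                    f"computation {i} stops after computation {j} starts"
                )
    return notes


anomalies = find_anomalies(reported)
for note in anomalies:
    print(note)
```

The first computation stops after the third one starts, and the third has no stop time, which is consistent with task 3 creating a new Computation while its activity is recorded against the first.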

Impact

This should only affect workflows with manually crafted keys. dask.array, dask.dataframe, etc. should be immune to prefix collisions.
