
Support client partition data reassign #1608

Open
@zuston

Description

Motivation

After reviewing #1445 again (partition data reassign, which is disabled by default on the master branch), I found some bugs and design problems. I will use this issue to track further optimizations.

Subtasks tracking

Design thought

reassign rpc between Spark driver and executor

This part is already covered in the #1445 design doc, so I will not describe it further here.

reassign signal propagation

In the current codebase, the latest reassigned partition -> servers plan is not propagated to tasks that start afterwards.
To solve this problem, I will make the writer always get the latest partition -> servers plan. Once a reassign signal occurs,
the cached shuffleHandleInfo is updated with the plan returned by the reassign rpc.

A task that starts after the reassigning tasks have finished (task2) gets the latest plan, built from the replacement servers plus the normal servers list. This keeps it from writing to the faulty servers again.
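The propagation idea above can be sketched as follows. This is only an illustration, not the actual Uniffle code: the `ShuffleHandleInfo` class here is a hypothetical stand-in for the cached shuffleHandleInfo, and `apply_reassign` / `latest_servers` are assumed names for "update the cache from the reassign rpc response" and "resolve the plan from replacement + normal servers".

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ShuffleHandleInfo:
    """Hypothetical stand-in for the cached partition -> servers plan."""

    # partition id -> servers from the original (normal) assignment
    partition_to_servers: Dict[int, List[str]]
    # faulty server -> replacement server, as returned by a reassign rpc
    replacements: Dict[str, str] = field(default_factory=dict)

    def apply_reassign(self, faulty_server: str, replacement: str) -> None:
        """Record the result of one reassign rpc in the cached plan."""
        self.replacements[faulty_server] = replacement

    def latest_servers(self, partition_id: int) -> List[str]:
        """Return the normal servers with every faulty one swapped out."""
        resolved = []
        for server in self.partition_to_servers[partition_id]:
            seen = set()
            # Follow the chain in case a replacement was itself replaced;
            # `seen` guards against an accidental cycle.
            while server in self.replacements and server not in seen:
                seen.add(server)
                server = self.replacements[server]
            resolved.append(server)
        return resolved
```

Under these assumptions, a later task such as task2 would call `latest_servers` on the updated cache and never see a server that has already been marked faulty.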

reassign multiple servers for one partition

This topic is scoped to the single-replica case.

Different partition types use different strategies for the partition -> multiple servers assignment.
For a huge partition, I hope that once it is recognized as a huge_partition, we request multiple reassigned servers by rpc, and each task picks its own partition server by hashing its taskAttemptId,
which keeps the load balanced across those servers.

For a normal partition, multiple servers only appear when the partition has been reassigned several times because of problems with its servers.
In this case, the task always writes to the last server in the list.
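The two selection strategies can be sketched as a single function. This is a minimal illustration under stated assumptions: `pick_server` and its parameters are hypothetical names, and a plain modulo on taskAttemptId stands in for whatever hash mechanism is finally chosen.

```python
from typing import List


def pick_server(servers: List[str], task_attempt_id: int, is_huge_partition: bool) -> str:
    """Choose the server one task writes a partition to (single replica)."""
    if not servers:
        raise ValueError("partition has no assigned servers")
    if is_huge_partition:
        # Spread tasks over all reassigned servers to balance the load.
        return servers[task_attempt_id % len(servers)]
    # Normal partition: earlier entries are servers that were replaced,
    # so only the most recent assignment is used.
    return servers[-1]
```

For example, with servers `["a", "b", "c"]` and taskAttemptId 7, a huge partition maps to `"b"` (7 mod 3 = 1), while a normal partition always maps to `"c"`.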

