We want to distribute incoming key-value objects across a set of workers/nodes that will process them. This way each object can be stored, accessed, and managed individually.
- The Mapping Problem: Where Does the Data Go?
- The Dynamism Problem: How Do You Handle Change/Failure?
- The Distribution Problem: How Do We Distribute Work?
The initial approach consists of hashing the key of the object, producing a (hopefully unique) idx; this way we can compute idx % no_nodes, and the object will always get routed to the same worker.
That's OK, but if one node fails, the object keys need to be recalculated, since the placement is heavily dependent on the number of nodes.
So this won't work: even when we recalculate the keys, data from object-1 that was previously on node-x will need to move to node-y, since almost every key's placement is affected by the failure of a single node.
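A minimal sketch of that modulo placement, plus a quick count of how many keys change node when one of three nodes fails (the md5 hashing and the key names are just assumptions for illustration):

```python
import hashlib

def place(key: str, no_nodes: int) -> int:
    # Hash the key into a (hopefully unique) idx, then pick a worker with idx % no_nodes.
    idx = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return idx % no_nodes

keys = [f"object-{i}" for i in range(1000)]
before = {k: place(k, 3) for k in keys}   # 3 healthy nodes
after = {k: place(k, 2) for k in keys}    # one node fails

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved}/{len(keys)} keys changed node")   # roughly two thirds of them
```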
The goal is to have a ring with a fixed number of slots; this way we can solve the Dynamism Problem.
When some server/node changes, if we were distributing based on no_nodes, the keys would need to be recalculated, as discussed before.
In this approach the core information stays tied to the ring's slots, so when a server fails, its information (in this case lists) gets remapped to the neighbouring node slot on the ring.
For example:
We have a ring with 10 slots and 3 nodes, say node-1, node-2 and node-3. They get assigned to slots 3, 6 and 9 by hashing their ids.
If node-1 fails, all the tasks from node-1 get shifted to node-3. That partly solves the Dynamism Problem.
The Mapping Problem is also solved by clockwise distribution: the workload of one node runs from that node's idx until the next node's idx in the ring. Say, for example, we have slots 0 through 20 and nodes on slots 0, 10 and 20. If we get a hashing hit on slot 7, it will be handled by the node on slot 0, which owns [0,10[; node-2 will handle [10,20[; and node-3 will handle [20].
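A small sketch of such a ring, following the convention above (a key's slot is served by the node on the closest occupied slot at or before it, wrapping around the ring). The md5 hashing, slot counts and node names are assumptions for illustration, not a fixed design:

```python
import bisect
import hashlib

class HashRing:
    """Ring with a fixed number of slots; each node owns the range from
    its own slot up to (but not including) the next node's slot."""

    def __init__(self, no_slots=10):
        self.no_slots = no_slots
        self.slots = []    # sorted slot indices that have a node on them
        self.nodes = {}    # slot -> node id

    def _slot(self, value):
        # Hash any string (node id or object key) onto one of the fixed slots.
        return int(hashlib.md5(value.encode()).hexdigest(), 16) % self.no_slots

    def add_node(self, node_id, slot=None):
        slot = self._slot(node_id) if slot is None else slot
        bisect.insort(self.slots, slot)
        self.nodes[slot] = node_id

    def remove_node(self, node_id):
        # On failure only the failed node's range moves; every other key stays put.
        for slot, nid in list(self.nodes.items()):
            if nid == node_id:
                self.slots.remove(slot)
                del self.nodes[slot]

    def lookup(self, key):
        # Owner = node on the largest occupied slot <= the key's slot;
        # index -1 wraps around to the last node on the ring.
        i = bisect.bisect_right(self.slots, self._slot(key)) - 1
        return self.nodes[self.slots[i]]

# The 10-slot example: nodes pinned to slots 3, 6 and 9.
ring = HashRing(no_slots=10)
for node, slot in [("node-1", 3), ("node-2", 6), ("node-3", 9)]:
    ring.add_node(node, slot=slot)

owner = ring.lookup("object-1")   # always the same node while the ring is stable
ring.remove_node("node-1")        # slots 3..5 are now served by node-3 (slot 9, wrapping)
```

Only the keys that were on node-1 change owner; keys on node-2 and node-3 are untouched.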
Just by elaborating on the previous section, some straightforward problems come up.
- How do we distribute the servers around the ring gracefully? We can end up with one server carrying most of the load while another has almost no work; in the last example that was the server on slot 20 (the sketch below makes this visible).
- When a server fails, all of that server's work gets transferred to the next one, adding overhead to that single server.
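Reusing the HashRing sketch above with the slot layout from the example (nodes on slots 0, 10 and 20 of a 21-slot ring), the imbalance is easy to measure; the key names are arbitrary:

```python
from collections import Counter

ring = HashRing(no_slots=21)
for node, slot in [("node-1", 0), ("node-2", 10), ("node-3", 20)]:
    ring.add_node(node, slot=slot)

keys = [f"object-{i}" for i in range(1000)]
print(Counter(ring.lookup(k) for k in keys))
# node-1 and node-2 each own ~10 of the 21 slots, while node-3 owns only
# slot 20, so node-3 ends up with almost no work.
```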
This gets better when adding virtual nodes to the ring. They are called virtual since there is no real worker sitting on that slot; they exist only to distribute occupation better within the ring and prevent inefficient workload distribution.
So each physical node gets some virtual nodes assigned to it, say worker1-v1, worker1-v2, ...
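Reusing the same HashRing sketch, virtual nodes can be layered on top simply by registering each physical worker several times under different ids (the naming scheme and the choice of 8 virtual nodes per worker are assumptions):

```python
ring = HashRing(no_slots=1024)
workers = ["worker1", "worker2", "worker3"]
virtual_per_worker = 8   # assumed; more virtual nodes means a smoother spread

for worker in workers:
    for v in range(1, virtual_per_worker + 1):
        # worker1-v1, worker1-v2, ... each land on their own hashed slot.
        ring.add_node(f"{worker}-v{v}")

virtual_owner = ring.lookup("object-1")              # e.g. "worker2-v5"
physical_owner = virtual_owner.rsplit("-v", 1)[0]    # map back to "worker2"
```

When a physical worker fails, all of its virtual nodes are removed, so its load is split between several surviving workers instead of being dumped onto a single neighbour.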
- https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.Partitions.html
- https://jacobantony.substack.com/p/data-partitioning-in-dynamo