From @GoogleCodeExporter on April 22, 2015 19:5
I was thinking about how to automatically distribute Swift arrays/ADLB
containers in such a way that we keep small containers on one server, but split
larger containers once they grow. I think we really want this to be
transparent to ADLB clients (unlike the old distributed containers design).

We could use consistent hashing for this: every time the master server for a
container reaches some size threshold, increase the number of servers the
container is split between. Readers/writers that contact the master for an
entry it no longer holds could be forwarded to the correct server. This could
potentially be a performance boost if clients were able to cache the number of
shards of each container (in a hash table or LRU cache), so they can go
straight to the correct server and avoid contacting the master.
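
A rough sketch of the client-side lookup, to make the idea concrete. It uses jump consistent hash (Lamping and Veach), which is just one consistent hashing scheme I picked for illustration; nothing here is existing ADLB code. The names `shard_for` and `ShardCountCache` are hypothetical, and subscripts are assumed to be strings.

```python
import hashlib
from collections import OrderedDict

MASK64 = (1 << 64) - 1


def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: map a 64-bit key to a bucket in [0, num_buckets)."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be >= 1")
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b


def shard_for(subscript: str, num_shards: int) -> int:
    """Hypothetical helper: which shard of a container holds this subscript."""
    digest = hashlib.blake2b(subscript.encode(), digest_size=8).digest()
    return jump_hash(int.from_bytes(digest, "little"), num_shards)


class ShardCountCache:
    """Hypothetical client-side LRU cache: container id -> last known shard count."""

    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, container_id) -> int:
        if container_id in self.entries:
            self.entries.move_to_end(container_id)  # mark as recently used
            return self.entries[container_id]
        return 1  # unknown container: assume unsplit, i.e. contact the master

    def put(self, container_id, num_shards: int) -> None:
        self.entries[container_id] = num_shards
        self.entries.move_to_end(container_id)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

The useful property is that growing a container from n to n+1 shards only moves entries onto the new shard; everything else stays where it was. A client with a stale shard count therefore either hits the right server or one that can forward the request, and the reply can carry the current count to refresh the cache.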

I think there are some tricky aspects to it. The first is managing the
migration. We don't want to block all server/worker operations while it's in
progress, but we could potentially just block reads/writes to that particular
container while it's happening.
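
To illustrate blocking only the container being migrated, here is a toy single-threaded model. `ShardServer` and its methods are made up for this sketch and are not ADLB's API; it assumes the server can simply queue requests for a migrating container and replay them afterwards.

```python
from collections import deque


class ShardServer:
    """Toy model of one server; names and structure are hypothetical."""

    def __init__(self):
        self.containers = {}     # container id -> {subscript: value}
        self.migrating = set()   # container ids currently being split
        self.deferred = deque()  # stores queued while their container migrates

    def handle_store(self, cid, subscript, value) -> bool:
        """Apply a store, or defer it if that container is migrating."""
        if cid in self.migrating:
            self.deferred.append((cid, subscript, value))
            return False
        self.containers.setdefault(cid, {})[subscript] = value
        return True

    def begin_migration(self, cid) -> None:
        self.migrating.add(cid)

    def end_migration(self, cid, keep) -> dict:
        """Finish the split. keep(subscript) says whether an entry stays here.

        Returns the entries that must be shipped to other servers.
        """
        data = self.containers.get(cid, {})
        moved = {k: v for k, v in data.items() if not keep(k)}
        for k in moved:
            del data[k]
        self.migrating.discard(cid)
        # Replay deferred stores; ones that no longer belong here are shipped too.
        pending, self.deferred = self.deferred, deque()
        for c, s, v in pending:
            if c == cid and not keep(s):
                moved[s] = v
            else:
                self.handle_store(c, s, v)
        return moved
```

Stores to other containers go through untouched during the migration; only requests for the container being split wait in the queue.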

The second is reference counting. I think we probably want to have the master
be responsible for detecting when the refcount goes to zero, but we don't want
all refcount increments/decrements going to the master (that would eliminate
most of the perf/scalability benefits). We need some kind of scheme. Possibly
when the master holds many refcounts it could delegate some of them to each
sub-container. If its own refcounts for a container went to zero, it would
then have to try to gather the remaining refcounts back from the
sub-containers.
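
One possible shape for the delegation scheme, as a sequential toy model. The classes `Shard` and `Master` and the `release` helper are hypothetical, and the sketch ignores the hard part (in-flight messages racing with the gather), so treat it as a statement of the invariant rather than a protocol.

```python
class Shard:
    """Sub-container holding refcounts delegated by the master (hypothetical)."""

    def __init__(self):
        self.local = 0

    def try_decr(self, n: int = 1) -> bool:
        """Decrement locally if enough delegated counts are available."""
        if self.local >= n:
            self.local -= n
            return True
        return False

    def surrender(self) -> int:
        """Hand all delegated counts back to the master."""
        n, self.local = self.local, 0
        return n


class Master:
    """Master copy of the container; the only place that detects zero."""

    def __init__(self, refcount: int, shards):
        self.local = refcount
        self.shards = shards
        self.freed = False

    def delegate(self, per_shard: int) -> None:
        """Move counts to shards, always keeping at least one at the master."""
        for shard in self.shards:
            if self.local - per_shard < 1:
                break
            self.local -= per_shard
            shard.local += per_shard

    def decr(self, n: int = 1) -> None:
        self.local -= n
        if self.local <= 0:
            # Out of local counts: gather whatever the shards still hold.
            self.local += sum(s.surrender() for s in self.shards)
        if self.local < 0:
            raise ValueError("refcount went negative")
        if self.local == 0:
            self.freed = True


def release(shard: Shard, master: Master) -> None:
    """Client-side decrement: try the nearby shard, fall back to the master."""
    if not shard.try_decr():
        master.decr()
```

The invariant is that the true refcount is the master's count plus the sum of the shard counts, and the master never drops to zero while delegated counts are outstanding without first gathering them. Most decrements are absorbed by shards, and only the master ever decides that the container is dead.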
Original issue reported on code.google.com by tim.g.ar...@gmail.com
on 17 Apr 2015 at 5:03
Copied from original issue: j-woz/exm-issues#795