Description
Problem
Let's say we have 3 nodes A, B and C with the following sequence of events:
1. `A` starts and writes a cache entry: `Cache.put(:a, 1)`
2. `B` joins the cluster. It can still see the entry: `Cache.all() #=> [:a]`
3. `C` joins the cluster. All entries are gone (`Cache.all() #=> []`) on all nodes.
From what I understand after reading through the code, the current generational cache design keeps at most 2 generations, e.g. `[new, old]`. Each time a new node starts (or joins the cluster), it always creates a new generation first, before copying the data (as seen in the code below).
nebulex/lib/nebulex/adapters/replicated.ex
Lines 767 to 770 in b20cd77

    with :ok <- maybe_run_on_nodes(adapter_meta, nodes, :new_generation),
         :ok <- copy_entries_from_nodes(adapter_meta, nodes),
         :ok <- maybe_run_on_nodes(adapter_meta, [node()], :new_generation) do
      maybe_run_on_nodes(adapter_meta, nodes, :reset_generation_timer)

This seems to cause the issue I mentioned above. When A first starts, its generational cache becomes

    [a1] # a1 is holding the cache data `:a => 1`

When B joins, A's cache becomes

    [a2, a1] # a2 is the new one, a1 is the old

Now, when C joins, all cached entries are gone:

    [a3, a2] # a1 is gone, including the data `:a => 1` (technically, it's in the `deprecated` table)

Solution/Suggestion?
I've modified the code above to run `copy_entries_from_nodes` first whenever a new node joins the cluster, and it seems to fix the issue. Is this the correct way to fix it?
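
For reference, the sequence described under Problem can be reproduced with a small toy model in plain Elixir. This is only a sketch of my understanding, not Nebulex's implementation: `ToyGenCache` and `ToyCluster` are hypothetical names, each node is modelled as a list of at most two maps (newest first), and `join/2` follows the current order of the `with` chain (new generation on the existing nodes, copy, new generation on the joining node). The `deprecated` table and the generation timer are ignored.

```elixir
defmodule ToyGenCache do
  # Hypothetical toy model: a cache is a list of generation maps,
  # newest first, holding at most two generations.
  def new, do: [%{}]

  # Writes always go to the newest generation.
  def put([newest | older], key, value), do: [Map.put(newest, key, value) | older]

  # Push an empty generation and keep only the two newest ones.
  def new_generation(gens), do: Enum.take([%{} | gens], 2)

  # Keys visible across the generations that are still retained.
  def all(gens), do: gens |> Enum.flat_map(&Map.keys/1) |> Enum.uniq()

  # All entries as a single map; the newest generation wins on conflict.
  def entries(gens), do: gens |> Enum.reverse() |> Enum.reduce(%{}, &Map.merge(&2, &1))
end

defmodule ToyCluster do
  # cluster is a map of node name => generations (assumed shape).
  def join(cluster, name) do
    # 1. :new_generation on every existing node
    rotated = Map.new(cluster, fn {n, gens} -> {n, ToyGenCache.new_generation(gens)} end)

    # 2. copy the entries of the existing nodes into the joining node
    copied =
      rotated
      |> Map.values()
      |> Enum.map(&ToyGenCache.entries/1)
      |> Enum.reduce(%{}, &Map.merge(&2, &1))

    local =
      Enum.reduce(copied, ToyGenCache.new(), fn {k, v}, gens ->
        ToyGenCache.put(gens, k, v)
      end)

    # 3. :new_generation on the joining node
    Map.put(rotated, name, ToyGenCache.new_generation(local))
  end

  # Visible keys per node.
  def keys(cluster), do: Map.new(cluster, fn {n, gens} -> {n, ToyGenCache.all(gens)} end)
end

# A starts and writes :a
cluster = %{a: ToyGenCache.put(ToyGenCache.new(), :a, 1)}

# B joins: :a now sits in the old generation on both nodes
cluster = ToyCluster.join(cluster, :b)
IO.inspect(ToyCluster.keys(cluster))
# prints %{a: [:a], b: [:a]}

# C joins: the second rotation drops the generation holding :a everywhere
cluster = ToyCluster.join(cluster, :c)
IO.inspect(ToyCluster.keys(cluster))
# prints %{a: [], b: [], c: []}
```

In this model the entry disappears because every join rotates the generations on all existing nodes without anything being rewritten into the newest generation, so two consecutive joins are enough to push the only generation holding `:a` out.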