1.5.27 July 25th 2024
Akka.NET v1.5.27 is a significant release that includes new features, mission-critical bug fixes, and some performance improvements.
Major Akka.Cluster.Sharding and Akka.Cluster.Tools.Singleton Bug Fixes
In all prior versions of Akka.NET, there are two high-impact distributed systems bugs:
As we discovered during the course of our painstaking bug investigation, these were, in fact, the same issue:
ClusterSingletonManager is supposed to always belong on the oldest node of a given role type, but an original design error from the time Akka.Cluster.Tools was first introduced to Akka.NET meant that nodes were always sorted in descending order of UpNumber. This is backwards: nodes should always be sorted in ascending order of UpNumber, which means that the oldest possible node is always at the front of the "who is oldest?" list held by the ClusterSingletonManager. This explains why the singleton could appear to move early during deployments and restarts.
ClusterSingletonManager was susceptible to a race condition: if nodes were shut down and restarted with the same address in under 20 seconds (the default "down removal margin" used by the ClusterSingletonManager to tolerate dirty exits), multiple successive fast restarts could leave multiple instances of the singleton alive at the same time for a short period.
Both varieties of this problem resulted in duplicate singletons, which is what led to duplicate shards.
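The sort-order half of the bug can be illustrated with a toy model. This is a minimal Python sketch, not Akka.NET code: the Member class, the node names, and both helper functions are hypothetical, and the only assumption carried over from the notes above is that a lower UpNumber means an older node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    address: str    # hypothetical node name, for illustration only
    up_number: int  # lower value = became Up earlier = older node


def by_age_descending(members):
    # The backwards ordering described above: sorting by UpNumber in
    # descending order puts the *youngest* node at the front of the list.
    return sorted(members, key=lambda m: m.up_number, reverse=True)


def by_age_ascending(members):
    # The intended ordering: ascending UpNumber puts the oldest node first.
    return sorted(members, key=lambda m: m.up_number)


members = [Member("node-b", 2), Member("node-a", 1), Member("node-c", 3)]

wrong_head = by_age_descending(members)[0]
right_head = by_age_ascending(members)[0]

print("descending picks:", wrong_head.address)
print("ascending picks:", right_head.address)
```

With the descending sort, the head of the "who is oldest?" list is the most recently joined node, which is consistent with the singleton appearing to move early whenever a new node comes up during a deployment.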
As a result we've made the following fixes:
AppVersion is no longer considered for singleton placement as it could easily result in split brains.
ClusterSingletonManager - resolves the issue with rapid rolling restarts creating duplicates. We've tested this fix in our test lab across thousands of coordinator restarts and haven't been able to reproduce the issue since (we could easily do it before).
ClusterSingletonManager HandOver - we fixed the member age problem here, which could cause a second singleton to start at inappropriate times.
Akka.Discovery and ClusterClient Discovery Support
In Akka.NET v1.5.27 we've added support for using Akka.Cluster.Tools.ClusterClient alongside Akka.Discovery plugins to automatically discover the initial contacts you need for ClusterClientReceptionist instances in your environment.
You can read the documentation for how this works here: https://getakka.net/articles/clustering/cluster-client.html#contact-auto-discovery-using-akkadiscovery
Related PRs and issues:
Other Bug Fixes and Improvements
ActorMaterializerImpl null LogSource
AlsoTo may not be failing graph when its sink throws exception
lmdb.dir is null or empty, log a warning and set to default
To see the full set of changes in Akka.NET v1.5.27, click here.