Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to handle upcoming remote wire incompatibility #108

Open
mdedetrich opened this issue Jan 17, 2023 · 17 comments
Open

Provide a way to handle upcoming remote wire incompatibility #108

mdedetrich opened this issue Jan 17, 2023 · 17 comments
Assignees

Comments

@mdedetrich
Copy link
Contributor

mdedetrich commented Jan 17, 2023

As discussed in https://github.com/mdedetrich/akka-apache-project/discussions/28, pekko core will need to handle wire incompatibility issues arising from the change in the akka:// address prefix to pekko://. There may also be other wire incompatibility changes, for example if we decide to change the default 2552 port and anything else that we may have missed.

The solution will also need to play with the release strategy that we decide for pekko. Since the previous discussion I have done some thinking about it and at least to me the most amicable solution would be to make these fields configurable with typesafe config with different default values depend on the pekko release branch, i.e.

For pekko 1.0.x we would have

acceptPrefix = ["akka", "pekko"] // The prefix's to use when checking against incoming requests
sendPrefix = "pekko" // The prefix to use when making send requests

And for pekko 1.1.x (on the assumption that we don't want to accept migrating current Akka clusters to this Pekko branch by default)

acceptPrefix = ["pekko"] // The prefix's to use when checking against incoming requests
sendPrefix = "pekko" // The prefix to use when making send requests

(Note that the same strategy can also be applied to the port 2552`.

This means that when doing a rolling upgrade from an existing akka cluster to pekko 1.0.x you would temporarily change sendPrefix to "akka" (so that currently existing Akka clusters don't refuse connections from new pekko nodes) and once the rolling upgrade is complete then you would change sendPrefix to "pekko". After that point you are then free to upgrade to pekko 1.1.x with acceptPrefix as ["pekko"] without any problems and making this configurable with typesafe config means that if users have bespoke update requirements its relatively easy for them to handle it.

This also leaves the door open to first deprecating the acceptPrefix/sendPrefix settings later down the Pekko 1.1.x release cycle and then at some point in the future hardcoding it as pekko.

@nvollmar
Copy link
Contributor

Can we deserialize protobuf messages sent by an Akka node with Pekko? I'm not too familiar with Protobuf, but akka-kryo-serialization defaults to FQCN for message serialization for example.

@mdedetrich
Copy link
Contributor Author

mdedetrich commented Jan 17, 2023

Can we deserialize protobuf messages sent by an Akka node with Pekko? I'm not too familiar with Protobuf, but akka-kryo-serialization defaults to FQCN for message serialization for example.

From my first impression this should work because the only thing that got changed in the #58 PR is the option java_package in the *.proto files which isn't used by the wire protocol but rather in generating the typed sources from the proto definitions.

This obviously needs to be tested to be verified, it seems like the serialization related protobuff messages may need extra handling due to explicitly using FQCN.

@jrudolph
Copy link
Contributor

Can we deserialize protobuf messages sent by an Akka node with Pekko? I'm not too familiar with Protobuf, but akka-kryo-serialization defaults to FQCN for message serialization for example.

Most (all?) of the internal serialization formats use short custom strings as manifests instead of FQCN, so many serializers should just keep working:

https://github.com/apache/incubator-pekko/blob/4ac0f00a477873965ee7d52e16faefb1de91fe3a/remote/src/main/scala/org/apache/pekko/remote/serialization/MiscMessageSerializer.scala#L313-L320

@mdedetrich
Copy link
Contributor Author

Just relaying what was stated in the mailing list at https://lists.apache.org/thread/nr7q9orzffolwo9tj753cvkrohr9875o, it appears that this issue is not as high priority as I originally anticipated so that means we don't have to worry about this for 1.0.x (or even at all)

@codeGuru775
Copy link

I suppose this would be needed by every large scale live application that uses akka and can not afford a downtime. Most of the large gaming firms use akka .. and they can not afford a downtime. so remote wire compatibility looks like a must have

@mdedetrich
Copy link
Contributor Author

mdedetrich commented Sep 1, 2023

I suppose this would be needed by every large scale live application that uses akka and can not afford a downtime. Most of the large gaming firms use akka .. and they can not afford a downtime. so remote wire compatibility looks like a must have

To be clear we are not against this change and it is something that we planned for originally in Pekko 1.0.0, its just that we didn't have the capacity to do such a change especially considering that it needs to have extensive testing. Our priority was to get the Pekko modules out so the community would at least have something to use.

If the issue is really that critical then it makes sense for someone to own up to it and contribute it and we will do our best to support such an endeavour

@mdedetrich
Copy link
Contributor Author

mdedetrich commented Oct 21, 2023

Assigning this to myself as per #146 (comment). Also removing low priority because I don't think this is accurate (I know for a fact that there are companies that need rolling upgrade and this is blocking them from moving to Pekko).

@mdedetrich
Copy link
Contributor Author

I created a draft PR at #765 which contains a skeleton of the desired implementation, still more to do

@pjfanning
Copy link
Contributor

I wrote a demo app that shows the #765 do allow Pekko actors to send messages to Akka remote actor instances.

https://github.com/pjfanning/pekko-akka-compat

@pjfanning
Copy link
Contributor

pjfanning commented Nov 27, 2023

@mdedetrich @kerr I've added a cluster test to https://github.com/pjfanning/pekko-akka-compat (PekkoAkkaAeronCluster)

The simple remote examples work but the clustering code is more complicated.

The 1st issue that PekkoAkkaAeronCluster runs into is that akka.cluster.JoinConfigCompatCheckCluster checks the compatibility. I am no expert on akka/pekko cluster code but it looks like when akka/pekko nodes try to join a cluster they send a message with HOCON data. Our pekko nodes wrap this data with a pekko wrapper while akka nodes use an akka wrapper.

It seems like there could be 2 solutions.

  • We add a further Pekko change so that the message it sends has an akka wrapper instead of a pekko wrapper.
    • we would also need to change the pekko.cluster.JoinConfigCompatCheckCluster to accept akka wrappers
  • We create a new JoinConfigCompatChecker class that can be used in Pekko and we could release one for Akka
    • users would need to set these classes in their configs

The HOCON message seems to contain at least these 2 configs (or their Akka equivalents).

  private val DowningProviderPath = "pekko.cluster.downing-provider-class"
  private val SbrStrategyPath = "pekko.cluster.split-brain-resolver.active-strategy"

I think one place to start is with pekko.cluster.SeedNodeProcess and adjusting the config in the InitJoin message. If the protocol-name config is set to akka then the config wrapper should be changed to akka and any class names that appear in config values should have org.apache.pekko changed to akka. It might be best to have the original config and the adjusted config sent together just in case we are talking to a seed node that is a Pekko node pretending to be an Akka node.

@mdedetrich
Copy link
Contributor Author

Thanks, ill have a look at it in a few days

@pjfanning
Copy link
Contributor

I had a slightly closer look and am becoming more pessimistic about whether we can support mixed akka/pekko clusters.

Look at this line from the SplitBrainResolver.

context.system.eventStream.subscribe(self, classOf[ThisActorSystemQuarantinedEvent])

ThisActorSystemQuarantinedEvent is in package org.apache.pekko.remote.artery. There is also a similar Akka class but the Pekko SplitBrainResolver will not be listening for Akka ThisActorSystemQuarantinedEvent events - only Pekko ones.

I think if we look around, we'll find this subscribing to the eventStream is common in the cluster code with separate Akka and Pekko classes, I think nodes of different Akka/Pekko type will ignore event stream events of the other type.

@raboof
Copy link
Member

raboof commented Nov 28, 2023

ThisActorSystemQuarantinedEvent is in package org.apache.pekko.remote.artery. There is also a similar Akka class but the Pekko SplitBrainResolver will not be listening for Akka ThisActorSystemQuarantinedEvent events - only Pekko ones.

How are these events serialized between nodes? If they are serialized using protobuf, it may be transparent: the protobuf binary doesn't contain class names, only numeric id's - so as long as the object structure and the numeric ids are consistent a serialized Akka ThisActorSystemQuarantinedEvent might be deserialized into a Pekko one and vice-versa.

@pjfanning
Copy link
Contributor

I have a branch where I have the #765 changes and also an extra change to pekko-cluster jar that hacks the config data that is sent by a cluster node when trying to join the cluster (to make it look like akka node data) - https://github.com/pjfanning/incubator-pekko/commits/remote-release

https://github.com/pjfanning/pekko-akka-compat (PekkoAkkaAeronCluster) gets a bit further when the pekko node tries to join an akka cluster. The latest issue now seems to be the 'version' - Pekko 1.0.1 vs Akka 2.6.21. It looks like I might need to add a new config that let's Pekko pretend that it is running a particular Akka version.

@pjfanning
Copy link
Contributor

I added a further change to get the pekko node to include akka.version in its init-join message - a valid akka.version (not a pekko version). This seems to get the pekko cluster member to join the akka cluster.

@pjfanning
Copy link
Contributor

@mdedetrich what do you think we need to do this move this on? With your #765 and the extra changes in https://github.com/pjfanning/incubator-pekko/commits/remote-release, it appears that we have some support. See note about PekkoAkkaAeronCluster above.

@mdedetrich
Copy link
Contributor Author

@pjfanning I haven't really been ontop if this lately (I am focusing on other stuff) but feel free to add commits onto the #765 branch (it should be open for maintainers) to close it off

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants