Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARTEMIS-5037: option to limit mirror propagation #5220

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lavocatt
Copy link

@lavocatt lavocatt commented Sep 6, 2024

Add a new option in the Mirror settings to prevent a broker from
propagating messages.

When working with a topology where 4 nodes are forming a square and
where each node in that square mirrors its two neighbors: a message
leaving a corner can reach the opposite corner of the square by two
different routes. This is causing the message ordering to get broken.

example:

1 <-> 2
^     ^
|     |
v     v
4 <-> 3

A message from 1 will reach 3 by 2 and 4. Message duplication checks
will prevent the message from being duplicated but won't help regarding
the order of the messages. This is because a either the route by 2 or 4
can be faster than the other, so whomever wins the race sets the message
first.

Fixing the example:
Using the new option to not forward messages coming from a link, we
break the possibilities to have two routes to reach the opposite corner.

The above example is updated as followed:

  • 2 never forwards messages coming from 1
  • 1 never forwards messages coming from 2
  • 3 never forwards messages coming from 4
  • 4 never forwards messages coming from 3

Now, when a messages leaves 1:

  • it reaches 2 and stops there
  • it reaches 4
  • it reaches 3 through 4 and stops there

Now, when a messages leaves 2:

  • it reaches 1 and stops there
  • it reaches 3
  • it reaches 4 through 3 and stops there

Now, when a messages leaves 3:

  • it reaches 4 and stops there
  • it reaches 2
  • it reaches 1 through 2 and stops there

Now, when a messages leaves 4:

  • it reaches 3 and stops there
  • it reaches 1
  • it reaches 2 through 1 and stops there

The new test AMQPSquareMirroringTest.java is testing this exact setup.

@lavocatt
Copy link
Author

lavocatt commented Sep 9, 2024

@tabish121 Thanks for the review. FYI this is in a very drafty state ATM. Many refactors might come after that, in any case I'll make sure to take your notes in.

@lavocatt
Copy link
Author

Thanks @gemmellr for the review! There are still things that are not working with the current PR and some refactors in the added lines might happen. In any case I'll make sure to take your comments in.

@lavocatt
Copy link
Author

I've exposed the new no-message-forwarding parameter (with @gemmellr's help) , now I need to test that the parameter works in a test.

@@ -534,6 +549,9 @@ private boolean sendMessage(Message message, DeliveryAnnotations deliveryAnnotat

message.setBrokerProperty(INTERNAL_ID_EXTRA_PROPERTY, internalID);
message.setBrokerProperty(INTERNAL_BROKER_ID_EXTRA_PROPERTY, internalMirrorID);
if (!this.canForwardMessages) {
message.setBrokerProperty(INTERNAL_NO_FORWARD, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered if the RoutingContext could be used for the 'dont [re-]mirror this' handling. It already has functionality for doing that (see isMirrorDisabled() and related).

However it now also occurs to me, we possibly do need to mark the message like this simply because when that message is acknowledged later, we probably want to know then that it was 'no forward' now, so that we dont mirror acknowledgement for it either given we never mirrored the message itself to begin with.

@lavocatt lavocatt marked this pull request as ready for review September 24, 2024 09:00
Comment on lines +59 to +62
public void testSquare() throws Exception {
server_2 = createServer(AMQP_PORT_2, false);
server_3 = createServer(AMQP_PORT_3, false);
server_4 = createServer(AMQP_PORT_4, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also try adding some testing using the ProtonJ2 test peers, rather than all brokers, to validate and exercise the actual broker behaviour at the protocol level. It gives a better accounting of what is really going on and when it changes unexpectedly.

E.g it seems all but certain these brokers are currently forwarding around mirrored acks for noForward messages they didn't originally mirror to begin with, but this test will not notice that one way or the other. Similarly with various other potentially unexpected behaviours that might occur.

Comment on lines 590 to 593
if (isBlockedByNoForward()) {
return;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect a different check will be needed here, inspecting the specific message; this is ultimately called by the queue, which I expect usually wont have a mirror target controller set. It will also want to accommodate not sending acks for no-forward messages [that it didnt send to begin with] even across broker restarts, due to expiry, etc etc.

I expect the test-peer based testing I know you are working on will demonstrate this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented some tests and they seem to work despite the assumption we had that it would not. Maybe something is worth a double check?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your changes last week have actually already stopped it sending some of the stuff I previously thought it would and checked that it actually did.

There are other scenarios that are less clear though. For example, if you consume the message from the middle broker instead of the first, its going to mirror the ack onward, even though it didnt mirror the message originally, which is marked as no-forward.

Add a new option in the Mirror settings to prevent a broker from
propagating messages.

When working with a topology where 4 nodes are forming a square and
where each node in that square mirrors its two neighbors: a message
leaving a corner can reach the opposite corner of the square by two
different routes. This is causing the message ordering to get broken.

example:
1 <-> 2
^     ^
|     |
v     v
4 <-> 3

A message from 1 will reach 3 by 2 and 4. Message duplication checks
will prevent the message from being duplicated but won't help regarding
the order of the messages. This is because a either the route by 2 or 4
can be faster than the other, so whomever wins the race sets the message
first.

Fixing the example:
Using the new option to not forward messages coming from a link, we
break the possibilities to have two routes to reach the opposite corner.

The above example is updated as followed:
* 2 never forwards messages coming from 1
* 1 never forwards messages coming from 2
* 3 never forwards messages coming from 4
* 4 never forwards messages coming from 3

Now, when a messages leaves 1:
* it reaches 2 and stops there
* it reaches 4
* it reaches 3 through 4 and stops there

Now, when a messages leaves 2:
* it reaches 1 and stops there
* it reaches 3
* it reaches 4 through 3 and stops there

Now, when a messages leaves 3:
* it reaches 4 and stops there
* it reaches 2
* it reaches 1 through 2 and stops there

Now, when a messages leaves 4:
* it reaches 3 and stops there
* it reaches 1
* it reaches 2 through 1 and stops there

The new test AMQPSquareMirroringTest.java is testing this exact setup.
@lavocatt
Copy link
Author

lavocatt commented Nov 13, 2024

I have figured out something I don't know how to fix

On this topology:

1 - noForward -> 2 -> 3
^                |
|----------------|

producing and consuming on 1 works, 2 mirrors, 3 is left alone

producing on 1 and consuming on 2 breaks:

  • 3 is left alone, which is good
  • 1 doesn't receive the ack, which is bad

I don't know how to overcome that. Because if I decide to forward the Ack to one when I consume on two, then if the message is consumed on 1 it'll receive a duplicate Ack from 2.

@gemmellr
Copy link
Member

Yeah, if you say noForward, and dont provide a reverse mirror, you are not getting the ack. Thats basically as requested.

there is no way to fix that without making the mirroring some kind of full topology mesh handling system with 'command forwarding between nodes without acting on them'.

I'd consider that setup basically an invalid use of this already-pretty-questionable functionality. Ask for no forwarding -> get no forwarding.

@gtully
Copy link
Contributor

gtully commented Nov 13, 2024

it makes sense to have some basic checks on the mirror, but it is also fine to have restrictions on the topologies that will work with a mirror and others that won't work. The most important thing imho is that it is understandable and deterministic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants