Guaranteeing reply delivery to the same instance in KeyShared subscription without per-instance topics #24616
In the first place, I would avoid request-response over Pulsar, since it can lead to very tight coupling between the applications. For tightly coupled applications, it might be better to use plain HTTP or gRPC interfaces to reduce unnecessary complexity. What is the exact requirement to use request-response over Pulsar? For an RPC use case over Pulsar, there's an example at https://github.com/streamnative/pulsar-recipes/tree/main/rpc.
Why do you want to avoid this? Pulsar supports a large number of topics and subscriptions. Pulsar features such as namespace policies for TTL and automatic deletion of inactive topics could address the possible concerns around maintenance.
What is the reason for avoiding duplication? Are the messages large? Data pipelines usually do a lot of duplication, with intermediate topics where messages are transformed and/or routed. This simplifies solutions in many cases.
Auto-split wouldn't provide the guarantee that you are looking for. You'd need to use sticky with Key_Shared. An alternative to Key_Shared subscriptions could be partitioned topics using exclusive or failover subscriptions (there are answers in #24044 about that).
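As a rough illustration of the sticky approach, ranges can be assigned explicitly so that they never overlap, instead of being derived from a hash. This is only a sketch: `plan_sticky_ranges` is a hypothetical helper (not a Pulsar API), the 65536-slot hash space is an assumption, and each instance is assumed to know its own index and the total instance count.

```python
from __future__ import annotations


def plan_sticky_ranges(num_consumers: int, slots: int = 65536) -> list[tuple[int, int]]:
    """Split the hash space [0, slots - 1] into contiguous, disjoint,
    inclusive ranges, one per consumer.

    The default of 65536 slots is an assumption about the hash range size.
    """
    if not 1 <= num_consumers <= slots:
        raise ValueError("num_consumers must be between 1 and slots")
    # Spread any remainder over the first `extra` consumers.
    base, extra = divmod(slots, num_consumers)
    ranges = []
    start = 0
    for i in range(num_consumers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for index, (lo, hi) in enumerate(plan_sticky_ranges(4)):
        print(f"instance {index}: [{lo}, {hi}]")
```

Each instance would then use its own entry as the sticky range when subscribing. Replies still need a key that hashes into the target instance's range, which this sketch does not address.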
Hello,
I’m trying to implement a request/response pattern between two programs using Pulsar, but I want to avoid:
Idea
Program A may have multiple instances. Each instance has its own key (key_1, key_2, …) and consumes only messages routed to it by message key hash range. Each request carries reply_id = key_1 to indicate where to send the response.
Program B also has multiple instances. It publishes the response with key = key_1.
Problem
If Program B publishes with key = key_1, can we guarantee that the message will be routed back to the same instance of Program A that originally sent the request?
Assumptions:
key_1 maps to the same consumer
Reasoning
The other problem is that we can't know exactly which instance is being restarted. I may be wrong about that, though; I don't know exactly how the instances are restarted.
Edit:
I’ve implemented STICKY consumers with a very narrow range. For example, I calculate:
Then I assign the range as {range, range}, ensuring that messages with the same key always land on the same consumer.
However, as the number of program instances increases, we run into the birthday paradox: the probability of collisions grows rapidly. When that happens, the resulting bugs will be extremely difficult to diagnose.
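To put rough numbers on that collision risk, the standard birthday-problem formula can be applied. The sketch below assumes each instance ends up on one slot chosen uniformly at random out of 65536; both the slot count and the uniform distribution are assumptions, and `collision_probability` is a hypothetical helper, not part of any Pulsar API.

```python
def collision_probability(instances: int, slots: int = 65536) -> float:
    """Probability that at least two instances land on the same slot.

    Assumes every instance picks one slot uniformly at random out of
    `slots`; the default of 65536 is an assumption about the hash range.
    """
    if instances > slots:
        return 1.0  # pigeonhole: more instances than slots
    p_unique = 1.0
    for i in range(instances):
        # The (i + 1)-th instance must avoid the i slots already taken.
        p_unique *= (slots - i) / slots
    return 1.0 - p_unique


if __name__ == "__main__":
    for n in (10, 50, 100, 300, 1000):
        print(f"{n:>5} instances: {collision_probability(n):.4f}")
```

Under these assumptions, the chance of at least one collision is already close to one half at roughly 300 instances.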
TLDR
The question is whether the KeyShared hash-range mapping is stable enough between producer and consumer keys to make this safe, or whether we still need per-instance topics/subscriptions for strict targeting. Or is there a better/suggested way?