Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fixed] sub issue across leaf node connections - bad sub ref count #2124

Merged
merged 1 commit into from
Apr 16, 2021

Conversation

matthiashanel
Copy link
Contributor

@matthiashanel matthiashanel commented Apr 15, 2021

This was caused by not sending subs across leaf node connections in some
cases but sending unsub in all cases. This imbalance caused
subscriptions to go away too soon. (ref count was off)

Signed-off-by: Matthias Hanel mh@synadia.com

The test TestJetStreamClusterInterestRetention failed locally for me, but that happened even without changes in this PR

client.go  (5 usages found)
    closeConnection  (2 usages found) ---- added as part of PR
        4585 srv.updateRouteSubscriptionMap(acc, sub, -1) . 
        4606 srv.updateRouteSubscriptionMap(acc, esub.sub, -(esub.n))
    processSubEx  (1 usage found) ---- does not call with client.kind LEAF
        2437 srv.updateRouteSubscriptionMap(acc, sub, 1)
    processUnsub  (1 usage found) ---- does not call with client.kind LEAF
        2776 srv.updateRouteSubscriptionMap(acc, sub, -1)
    unsubscribe  (1 usage found) ---- does not call with client.kind LEAF
        2709 c.srv.updateRouteSubscriptionMap(nsub.im.acc, nsub, -1)
leafnode.go  (2 usages found)
    processLeafSub  (1 usage found)
        1610 srv.updateRouteSubscriptionMap(acc, sub, 1) ---- check already existed
    processLeafUnsub  (1 usage found)
        1667 srv.updateRouteSubscriptionMap(acc, sub, -1) ---- added as part of PR
server.go  (1 usage found)
    updateRemoteSubscription  (1 usage found)
        3368 s.updateRouteSubscriptionMap(acc, sub, delta) ---- see callers below

client.go  (3 usages found)
    addShadowSub  (1 usage found) ------- It is my understanding from the call that this won't be called for client.kind LEAF
        2562 c.srv.updateRemoteSubscription(im.acc, &nsub, 1)
    deliverMsg  (2 usages found) ------- the below invocations are only done for client.kind == CLIENT || client.kind == SYSTEM
        3015 defer srv.updateRemoteSubscription(client.acc, sub, -1)
        3023 srv.updateRemoteSubscription(client.acc, sub, -1)

This was caused by not sending subs across leaf node connections in some
cases but sending unsub in all cases. This imbalance caused
subscriptions to go away too soon. (ref count was off)

Signed-off-by: Matthias Hanel <mh@synadia.com>
@matthiashanel
Copy link
Contributor Author

I also ran my test with these traces and was not able to see the sub/unsub on raft subjects.

diff --git a/server/leafnode.go b/server/leafnode.go
index 325a7933..12de9af6 100644
--- a/server/leafnode.go
+++ b/server/leafnode.go
@@ -1509,6 +1509,8 @@ func (c *client) processLeafSub(argo []byte) (err error) {
                return nil
        }

+       srv.Warnf("processLeafSub %q", argo)
+
        // Copy so we do not reference a potentially large buffer
        arg := make([]byte, len(argo))
        copy(arg, argo)
@@ -1644,6 +1646,8 @@ func (c *client) processLeafUnsub(arg []byte) error {
        acc := c.acc
        srv := c.srv

+       srv.Warnf("processLeafUnsub %q", arg)
+
        c.mu.Lock()
        if c.isClosed() {
                c.mu.Unlock()
diff --git a/server/route.go b/server/route.go
index cc6c0f19..c9339b62 100644
--- a/server/route.go
+++ b/server/route.go
@@ -954,6 +954,8 @@ func (c *client) processRemoteUnsub(arg []byte) (err error) {
                return nil
        }

+       srv.Warnf("processRemoteUnsub %q", arg)
+
        updateGWs := false
        // We store local subs by account and subject and optionally queue name.
        // RS- will have the arg exactly as the key.
@@ -989,6 +991,8 @@ func (c *client) processRemoteSub(argo []byte, hasOrigin bool) (err error) {
                return nil
        }

+       srv.Warnf("processRemoteSub %q", argo)
+
        // Copy so we do not reference a potentially large buffer
        arg := make([]byte, len(argo))
        copy(arg, argo)

Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@kozlovic kozlovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@matthiashanel matthiashanel merged commit b3e355c into master Apr 16, 2021
@matthiashanel matthiashanel deleted the ln-sub-ref-count branch April 16, 2021 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants