Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix incorrect blockedConsumerOnUnackedMsgs value when maxUnackedMessagesPerConsumer is 1 #23796

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

summeriiii
Copy link
Contributor

Fixes #22657

Motivation

This issue was introduced by #20990. After we call Consumer#addAndGetUnAckedMsgs to change the unackedMessages, we should check and may update blockedConsumerOnUnackedMsgs.

Modifications

  • after change the unackedMessages, call updateBlockedConsumerOnUnackedMsgs to update blockedConsumerOnUnackedMsgs's value

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Dec 31, 2024
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I have one question in a comment.

@lhotari
Copy link
Member

lhotari commented Jan 3, 2025

@summeriiii It looks like ZkSessionExpireTest.testTopicUnloadAfterSessionRebuild fails with this change. I haven't checked why, but that test has failed multiple times.

@summeriiii
Copy link
Contributor Author

/pulsarbot run-failure-checks

@summeriiii summeriiii force-pushed the fix_incorrect_blockedConsumerOnUnackedMsgs branch from 2f2589e to b97df18 Compare January 3, 2025 13:25
@lhotari
Copy link
Member

lhotari commented Jan 3, 2025

@summeriiii I was able to reproduce the failure by running the test in Docker with the scripts available at https://github.com/lhotari/pulsar-contributor-toolbox/blob/master/functions/pulsar-contributor-toolbox-functions.sh

git fetch origin pull/23796/merge
git checkout FETCH_HEAD
ptbx_run_test_in_docker -pl pulsar-broker test -Dtest=ZkSessionExpireTest

It's hard to see the reason why it would be caused by this PR.

Before the failure, I could see this output

2025-01-03T13:43:20,121 - WARN  - [Thread-161:BrokerService] - Namespace bundle for topic (persistent://public/default/tp-e6caf121-14dc-4443-9749-08435c188ca0) not served by this instance:localhost:55835. Please redo the lookup. Request is denied: namespace=public/default
2025-01-03T13:43:20,222 - WARN  - [Thread-162:BrokerService] - Namespace bundle for topic (persistent://public/default/tp-e6caf121-14dc-4443-9749-08435c188ca0) not served by this instance:localhost:55835. Please redo the lookup. Request is denied: namespace=public/default
2025-01-03T13:43:20,323 - WARN  - [Thread-163:BrokerService] - Namespace bundle for topic (persistent://public/default/tp-e6caf121-14dc-4443-9749-08435c188ca0) not served by this instance:localhost:55835. Please redo the lookup. Request is denied: namespace=public/default
2025-01-03T13:43:20,425 - WARN  - [Thread-164:BrokerService] - Namespace bundle for topic (persistent://public/default/tp-e6caf121-14dc-4443-9749-08435c188ca0) not served by this instance:localhost:55835. Please redo the lookup. Request is denied: namespace=public/default

@lhotari
Copy link
Member

lhotari commented Jan 3, 2025

previous attempts https://github.com/apache/pulsar/actions/runs/12556467839 . Downloading logs to see if it's similar as what I see locally.

@lhotari
Copy link
Member

lhotari commented Jan 3, 2025

The test has been flaky for a longer period of time: #23389

@lhotari
Copy link
Member

lhotari commented Jan 3, 2025

The same ZkSessionExpireTest.testTopicUnloadAfterSessionRebuild test fails also in branch-4.0 builds. Let's continue addressing it in #23389 .

@codecov-commenter
Copy link

codecov-commenter commented Jan 4, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.13%. Comparing base (bbc6224) to head (b97df18).
Report is 833 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #23796      +/-   ##
============================================
+ Coverage     73.57%   74.13%   +0.56%     
+ Complexity    32624    31789     -835     
============================================
  Files          1877     1853      -24     
  Lines        139502   143470    +3968     
  Branches      15299    16291     +992     
============================================
+ Hits         102638   106362    +3724     
+ Misses        28908    28729     -179     
- Partials       7956     8379     +423     
Flag Coverage Δ
inttests 26.65% <100.00%> (+2.07%) ⬆️
systests 23.21% <100.00%> (-1.12%) ⬇️
unittests 73.65% <100.00%> (+0.80%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...ava/org/apache/pulsar/broker/service/Consumer.java 85.16% <100.00%> (-1.51%) ⬇️

... and 1023 files with indirect coverage changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] consumers stops receiving new messages due to invalid blockedConsumerOnUnackedMsgs state
3 participants