-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect msgBacklog value #5202
Comments
Seems like a bug which i see different subscription with same mark delete position but different delete count. @shustsud If you are interested in fixing it, welcome! |
@codelipenghui Add comments to the ticket if there is additional information. |
I have this issue as well... one of my subscriptions to a partitioned topic is showing a backlog of 1 message, but that is incorrect -- the backlog is actually 0. Partitioned stats:
and internal stats:
|
We are suffering from same problem here, msgBacklog is 40k+ but real msgBacklog is 0. It is annoying cause it'll trigger alert but there isn't any real backlog. Stats with {
"msgRateIn": 0.0,
"msgThroughputIn": 0.0,
"msgRateOut": 0.0,
"msgThroughputOut": 0.0,
"bytesInCounter": 335230277,
"msgInCounter": 1584209,
"bytesOutCounter": 177108372,
"msgOutCounter": 789829,
"averageMsgSize": 0.0,
"msgChunkPublished": false,
"storageSize": 305110652,
"backlogSize": 296556524,
"offloadedStorageSize": 0,
"lastOffloadLedgerId": 0,
"lastOffloadSuccessTimeStamp": 0,
"lastOffloadFailureTimeStamp": 0,
"publishers": [ ],
"waitingPublishers": 0,
"subscriptions": {
"sub0": {
"msgRateOut": 0.0,
"msgThroughputOut": 0.0,
"bytesOutCounter": 3064627,
"msgOutCounter": 16150,
"msgRateRedeliver": 0.0,
"chunkedMessageRate": 0,
"msgBacklog": 44708,
"backlogSize": 0,
"msgBacklogNoDelayed": 44708,
"blockedSubscriptionOnUnackedMsgs": false,
"msgDelayed": 0,
"unackedMessages": 0,
"type": "Shared",
"msgRateExpired": 0.0,
"totalMsgExpired": 0,
"lastExpireTimestamp": 0,
"lastConsumedFlowTimestamp": 1652265631373,
"lastConsumedTimestamp": 0,
"lastAckedTimestamp": 0,
"lastMarkDeleteAdvancedTimestamp": 1652265631403,
"consumers": [],
"isDurable": true,
"isReplicated": false,
"consumersAfterMarkDeletePosition": {},
"nonContiguousDeletedMessagesRanges": 0,
"nonContiguousDeletedMessagesRangesSerializedSize": 60,
"durable": true,
"replicated": false
},
"sub1": {
"msgRateOut": 0.0,
"msgThroughputOut": 0.0,
"bytesOutCounter": 171894114,
"msgOutCounter": 761896,
"msgRateRedeliver": 0.0,
"chunkedMessageRate": 0,
"msgBacklog": 0,
"backlogSize": 0,
"msgBacklogNoDelayed": 0,
"blockedSubscriptionOnUnackedMsgs": false,
"msgDelayed": 0,
"unackedMessages": 0,
"type": "Shared",
"msgRateExpired": 0.0,
"totalMsgExpired": 0,
"lastExpireTimestamp": 0,
"lastConsumedFlowTimestamp": 1652336505371,
"lastConsumedTimestamp": 1652336498413,
"lastAckedTimestamp": 1652336512153,
"lastMarkDeleteAdvancedTimestamp": 1652336512153,
"consumers": [ ],
"isDurable": true,
"isReplicated": false,
"consumersAfterMarkDeletePosition": {},
"nonContiguousDeletedMessagesRanges": 0,
"nonContiguousDeletedMessagesRangesSerializedSize": 22,
"durable": true,
"replicated": false
}
},
"replication": {},
"deduplicationStatus": "Disabled",
"nonContiguousDeletedMessagesRanges": 0,
"nonContiguousDeletedMessagesRangesSerializedSize": 82,
"compaction": {
"lastCompactionRemovedEventCount": 0,
"lastCompactionSucceedTimestamp": 0,
"lastCompactionFailedTimestamp": 0,
"lastCompactionDurationTimeInMills": 0
}
} internal-stats {
"waitingCursorsCount": 1,
"pendingAddEntriesCount": 0,
"entriesAddedCounter": 1584209,
"numberOfEntries": 1413448,
"totalSize": 305110652,
"currentLedgerEntries": 0,
"currentLedgerSize": 0,
"lastLedgerCreatedTimestamp": "2022-05-12T18:04:59.17+08:00",
"lastLedgerCreationFailureTimestamp": "",
"lastConfirmedEntry": "114272945:43409",
"state": "LedgerOpened",
"ledgers": [
{
"ledgerId": 113999206,
"entries": 50000,
"size": 8831300,
"offloaded": false
},
...
{
"ledgerId": 114327876,
"entries": 0,
"size": 0,
"offloaded": false
}
],
"cursors": {
"sub0": {
"markDeletePosition": "114272945:43409",
"readPosition": "114272945:43410",
"waitingReadOp": false,
"pendingReadOps": 0,
"messagesConsumedCounter": 1539501,
"cursorLedger": 114594694,
"cursorLedgerLastEntry": 0,
"individuallyDeletedMessages": "[]",
"lastLedgerWitchTimestamp": "",
"state": "Open",
"numberOfEntriesSinceFirstNotAckedMessage": 1,
"totalNonContiguousDeletedMessagesRange": 0,
"properties": {}
},
"sub1": {
"markDeletePosition": "114272945:43409",
"readPosition": "114272945:43410",
"waitingReadOp": true,
"pendingReadOps": 0,
"messagesConsumedCounter": 1584421,
"cursorLedger": 114275529,
"cursorLedgerLastEntry": 16661,
"individuallyDeletedMessages": "[]",
"lastLedgerWitchTimestamp": "",
"state": "Open",
"numberOfEntriesSinceFirstNotAckedMessage": 1,
"totalNonContiguousDeletedMessagesRange": 0,
"properties": {}
}
}
} |
Possible fixed in #19275 |
Describe the bug
Despite there are no messages in the backlog, positive value was set to "msgBacklog" in the response of partitioned-stats(API).
I think "msgBacklog" is calculated from "ManagedLedgerImpl.ENTRIES_ADDED_COUNTER_UPDATER" and "ManagedCursorImpl.messagesConsumedCounter".
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java
Line 736 in dc7d01e
Looking at the debug log, [subscriptionName1] appears to have the wrong "ManagedCursorImpl.messagesConsumedCounter" value.
In contrast, [subscriptionName2] looks correct.
Log:
Stats:
Stats-internal:
To Reproduce
We suspect that reset-cursor operations which were executed right before caused the problem.
The log at the time of operations is as follows.
Log:
The partitioned topic has 16 partitions, but only one partition has a problem.
Pulsar version
broker: 2.3.2
The text was updated successfully, but these errors were encountered: