[fix][client] Fix producer synchronous retry handling in failPendingMessages method#29

Closed
sandeep-mst wants to merge 11 commits into master from producer-25201

@sandeep-mst
Collaborator

Fixes apache#25201

Motivation

Fix a re-entrancy bug in ProducerImpl.failPendingMessages. While running on the timer, failPendingMessages invokes sendComplete(ex) on each pending message, and the client's callback can synchronously issue a retry send. The subsequent pendingMessages.clear() then removes the newly enqueued retry operation, so the retry’s CompletableFuture never completes and the client is left waiting indefinitely.

Modifications

Updated failPendingMessages to first drain the pendingMessages queue into a local list and clear the queue before iterating. This prevents re-entrant retries triggered during sendComplete from being inadvertently cleared.
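
The ordering change can be illustrated with a minimal, self-contained sketch. It is a toy model, not the ProducerImpl code: the class DrainBeforeFailSketch and all of its methods are hypothetical names, and a plain ArrayDeque of CompletableFuture objects stands in for the pendingMessages queue. It contrasts clearing after the callbacks run with draining before they run.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

/** Toy model of a producer's pending queue; every name here is hypothetical. */
final class DrainBeforeFailSketch {
    // Stand-in for the pendingMessages queue (assumption: a simple FIFO of futures).
    private final Queue<CompletableFuture<String>> pending = new ArrayDeque<>();

    /** Enqueues a pending send and returns its future. */
    CompletableFuture<String> sendAsync() {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.add(future);
        return future;
    }

    int pendingCount() {
        return pending.size();
    }

    /** Problematic ordering: callbacks run first, then the queue is cleared. */
    void failPendingThenClear(Exception ex) {
        List<CompletableFuture<String>> snapshot = new ArrayList<>(pending);
        for (CompletableFuture<String> f : snapshot) {
            f.completeExceptionally(ex); // may synchronously call sendAsync()
        }
        pending.clear(); // also drops any retry a callback just enqueued
    }

    /** Ordering described above: drain into a local list, clear, then fail. */
    void drainThenFailPending(Exception ex) {
        List<CompletableFuture<String>> drained = new ArrayList<>(pending);
        pending.clear();
        for (CompletableFuture<String> f : drained) {
            f.completeExceptionally(ex); // a retry enqueued here stays queued
        }
    }

    /** Fails one send whose callback retries synchronously; returns the queue size afterwards. */
    static int retrySurvivors(boolean drainFirst) {
        DrainBeforeFailSketch producer = new DrainBeforeFailSketch();
        producer.sendAsync().exceptionally(t -> {
            producer.sendAsync(); // synchronous retry from client code
            return null;
        });
        Exception ex = new IllegalStateException("connection closed");
        if (drainFirst) {
            producer.drainThenFailPending(ex);
        } else {
            producer.failPendingThenClear(ex);
        }
        return producer.pendingCount();
    }

    public static void main(String[] args) {
        System.out.println("clear after callbacks: " + retrySurvivors(false));
        System.out.println("drain before callbacks: " + retrySurvivors(true));
    }
}
```

In this sketch, the retry enqueued by the callback is lost when the queue is cleared after the callbacks (0 entries remain) and kept when the queue is drained first (1 entry remains).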

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

  • org.apache.pulsar.client.impl.ProducerImplTest#testFailPendingMessagesSyncRetry

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@sandeep-mst
Collaborator Author

The upstream PR has been merged.

Development

Successfully merging this pull request may close these issues.

[Bug] Producer synchronous retries can cause retry sendAsync future to never complete
