@thomas-bousquet thomas-bousquet commented Aug 26, 2025

Main Issue: #21759

PIP: #21752

Motivation

managedLedgerForceRecovery is a broker configuration field that can be set dynamically, but it is missing from the broker configuration, so it cannot be set statically and applied when the application/container starts.
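
For illustration, a minimal sketch of the kind of static entry this is about. The file name (`conf/broker.conf`) and the value `false` are assumptions made for the example, not taken from the diff; the comment text is the property's existing description.

```properties
# conf/broker.conf (assumed location of the static broker configuration)

# Whether to allow brokers to forcefully load topics by skipping ledger failures to avoid topic unavailability and
# perform auto repairs of the topics.
# Assumed value for this sketch: false keeps force recovery disabled unless explicitly enabled.
managedLedgerForceRecovery=false
```

With an entry like this, the value would be read at broker startup instead of having to be applied dynamically after the application/container is already running.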

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@thomas-bousquet Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@github-actions github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Aug 26, 2025
@lhotari
Member

lhotari commented Aug 27, 2025

# Whether to allow brokers to forcefully load topics by skipping ledger failures to avoid topic unavailability and 
# perform auto repairs of the topics.

btw. The original description of this property isn't that great. It's missing the detail that the tradeoff of using this setting is that data is lost permanently.

When there are ledger failures, there's some other underlying issue that is causing them. One potential issue that is being addressed is #24665. My assumption is that some ledger failures could be resolved by restarting the BookKeeper cluster or restarting the Pulsar cluster when the failures are caused by invalid in-memory state.
For #24665, there's a workaround: configure the Pulsar cluster's BookKeeper client to bypass the metadata store layer and use ZooKeeper directly, and on the BookKeeper side also bypass the Pulsar metadata store layer and use ZooKeeper directly. There isn't yet a workaround when Oxia is used.
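
A rough sketch of what that workaround might look like in configuration, assuming it maps onto the `bookkeeperMetadataServiceUri` broker setting and the `metadataServiceUri` bookie setting. The host `zk-host:2181` and the `/ledgers` path are placeholders, and whether these exact values match the workaround discussed in #24665 is an assumption, so check that issue before applying anything.

```properties
# conf/broker.conf (assumption: point the broker's BookKeeper client at ZooKeeper directly)
# zk-host:2181 and /ledgers are placeholder values for this sketch.
bookkeeperMetadataServiceUri=zk+hierarchical://zk-host:2181/ledgers

# conf/bookkeeper.conf (assumption: same direct ZooKeeper URI on the bookie side)
metadataServiceUri=zk+hierarchical://zk-host:2181/ledgers
```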

There are also other issues that might cause ledger failures. For example, BookKeeper 4.17.2 (included in Pulsar 4.0.6 release) contained a fix for a data loss issue, apache/bookkeeper#4607.
