This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-5792: Extend autoSkipNonRecoverableData to apply to missing Schemas in Bookkeeper #136

Open
@sijie

Description

Original Issue: apache#5792


Background
Pulsar uses BookKeeper as the storage mechanism for data and cursors. In #1046, @rdhabalia added a flag, autoSkipNonRecoverableData, that lets an admin tell the broker to disregard missing ledgers and skip ahead. This helps in scenarios where a ledger is gone and is unrecoverable (or not worth recovering).

This flag keeps users from getting stuck on NoSuchEntryException/NoLedgerException when the BookKeeper cluster has suffered data loss.

Is your feature request related to a problem? Please describe.
The autoSkipNonRecoverableData flag does not cover the case where a topic has a schema attached (stored in BookKeeper) and the ledger containing that schema goes missing. In this situation, clients receive NoSuchEntryException/NoLedgerException errors. Even if the admin unloads the topic, the problem persists until the admin deletes the schema.

There are two key problems:

If the admin has never seen this issue before, they will probably spend time checking stats-internal to see whether the ledger is a cursor or contains entries. However, the referenced ledger will not appear anywhere in stats-internal.

The autoSkipNonRecoverableData flag ostensibly exists to prevent users from getting stuck with NoSuchEntryException/NoLedgerException when ledgers go missing, but it doesn't apply to missing schemas.

Describe the solution you'd like

Modify BookkeeperSchemaStorage to check autoSkipNonRecoverableData and behave accordingly.

Add output to stats-internal that shows schema information, including the ledgers used for schema storage (if the default BookkeeperSchemaStorage implementation is in use).
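As a rough illustration of the first proposal, the sketch below models a schema store that consults the flag when a schema ledger is missing. It is written in Python purely to show the control flow; it is not Pulsar's Java code. `SchemaStorageSketch`, `SchemaLocator`, and `NoLedgerError` are hypothetical stand-ins, and returning "no schema" when the flag is set is only one possible reading of "behave accordingly".

```python
from dataclasses import dataclass
from typing import Dict, Optional


class NoLedgerError(Exception):
    """Hypothetical stand-in for a 'ledger does not exist' failure."""


@dataclass(frozen=True)
class SchemaLocator:
    """Hypothetical pointer from a schema version to the ledger holding it."""
    version: int
    ledger_id: int


class SchemaStorageSketch:
    """Toy model of a ledger-backed schema store (not the real class)."""

    def __init__(self, ledgers: Dict[int, bytes],
                 auto_skip_non_recoverable_data: bool) -> None:
        self._ledgers = ledgers  # ledger id -> stored schema bytes
        self._auto_skip = auto_skip_non_recoverable_data

    def _read_ledger(self, ledger_id: int) -> bytes:
        # Simulate the storage layer: a deleted ledger raises an error.
        if ledger_id not in self._ledgers:
            raise NoLedgerError(f"ledger {ledger_id} not found")
        return self._ledgers[ledger_id]

    def get(self, locator: SchemaLocator) -> Optional[bytes]:
        try:
            return self._read_ledger(locator.ledger_id)
        except NoLedgerError:
            if self._auto_skip:
                # Assumption: with the flag set, a lost schema is treated
                # as absent instead of failing the client.
                return None
            raise
```

With the flag off, the lookup fails exactly as it does today; with it on, the caller sees a missing schema rather than an exception.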

Describe alternatives you've considered

Rather than modifying BookkeeperSchemaStorage, the broker code could be modified to catch the missing-ledger exception and hide it if this flag is set.

A different boolean flag could also control this behavior, in case an admin wants to skip missing schemas but not missing data (or vice versa).
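The two alternatives can be combined in one small sketch: a broker-side wrapper catches the missing-ledger error around the schema fetch, and an optional second flag overrides the data flag for schemas only. This is Python used to model the logic, not Pulsar code. The names `load_schema_or_skip`, `auto_skip_missing_schemas`, and `NoLedgerError` are hypothetical; no such setting exists in the source.

```python
from typing import Callable, Optional


class NoLedgerError(Exception):
    """Hypothetical stand-in for a 'ledger does not exist' failure."""


def load_schema_or_skip(
    fetch: Callable[[], bytes],
    auto_skip_non_recoverable_data: bool,
    auto_skip_missing_schemas: Optional[bool] = None,
) -> Optional[bytes]:
    """Fetch a schema, hiding a missing-ledger error when skipping is enabled.

    If the hypothetical schema-specific flag is left unset (None), the
    existing data flag decides; otherwise the schema flag takes precedence.
    """
    if auto_skip_missing_schemas is None:
        skip = auto_skip_non_recoverable_data
    else:
        skip = auto_skip_missing_schemas
    try:
        return fetch()
    except NoLedgerError:
        if skip:
            return None  # caller treats the topic as having no schema
        raise
```

Keeping the check outside the storage class leaves the storage implementation unchanged, at the cost of every broker call site needing to go through the wrapper.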
