Add notes when using with HA clustering software (e.g. Pacemaker) #196

@mikecaat

If a user combines pg_rman, PostgreSQL 12 or higher, and the Pacemaker pgsql resource agent,
there is a case where PostgreSQL cannot start properly. The sequence is as follows.

(1) restore with pg_rman
(2) start the PostgreSQL server and let archive recovery complete
(3) stop the PostgreSQL server
(4) start the PostgreSQL server as a standby with the Pacemaker pgsql resource agent
(However, PostgreSQL cannot reach a consistent state and cannot accept connections.)

The reason is that in (4) PostgreSQL treats the old "recovery_target_timeline" value as valid,
although the timeline ID was incremented in (2).

For example,

(1) restore with pg_rman
// e.g. pg_rman restores with "recovery_target_timeline = 4"

(2) start the PostgreSQL server and let recovery complete
// e.g. the timeline ID is incremented to "5"

(3) stop the PostgreSQL server

(4) start the PostgreSQL server as a standby with the Pacemaker pgsql resource agent
// e.g. the timeline ID is "5", but recovery_target_timeline = "4"
// PostgreSQL cannot find the checkpoint WAL record at startup because the target timeline is no longer valid.
// As a result, PostgreSQL cannot reach a consistent state.
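The mismatch in (4) could be checked before starting the standby. The following is a minimal sketch, not part of pg_rman or the resource agent: the helper names `latest_checkpoint_timeline` and `target_is_stale` are hypothetical, it assumes the text passed in is the output of `pg_controldata` for the data directory, and it handles only a decimal numeric `recovery_target_timeline` value.

```python
import re
from typing import Optional


def latest_checkpoint_timeline(controldata_output: str) -> Optional[int]:
    """Extract the latest checkpoint's timeline ID from pg_controldata output.

    Returns None if the expected line is not present.
    """
    match = re.search(
        r"^Latest checkpoint's TimeLineID:\s*(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


def target_is_stale(target: str, current_timeline: int) -> bool:
    """Return True if a numeric target timeline is older than the current one.

    Assumption: non-numeric values such as 'latest' or 'current' are never
    treated as stale by this sketch.
    """
    value = target.strip().strip("'\"")
    if not value.isdigit():
        return False
    return int(value) < current_timeline
```

With the values from the example above, `target_is_stale("4", 5)` returns `True`, which would indicate that the setting should be removed before step (4).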

To avoid the issue, users need to remove the "recovery_target_timeline" setting before executing (4).
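As an illustration of this workaround, here is a minimal sketch that comments out the setting in the text of a configuration file. The function name `comment_out_setting` is hypothetical, and which file actually holds the setting after a pg_rman restore depends on the installation, so the function only transforms text and leaves reading and writing the file to the caller.

```python
import re


def comment_out_setting(conf_text: str, name: str = "recovery_target_timeline") -> str:
    """Comment out every active line that sets `name` in PostgreSQL-style config text.

    Lines that are already commented out, and settings with other names,
    are returned unchanged.
    """
    # Match the parameter name at the start of a line (after optional
    # whitespace); \b prevents matching longer names sharing this prefix.
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\b")
    result = []
    for line in conf_text.splitlines(keepends=True):
        if pattern.match(line):
            result.append("# " + line.lstrip())
        else:
            result.append(line)
    return "".join(result)
```

Commenting the line out rather than deleting it keeps a record of the value pg_rman wrote, which may help when diagnosing a later restore.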

In essence, the issue is caused by the combination of PITR using "recovery_target_timeline" and
the Pacemaker pgsql resource agent, not by pg_rman itself. Still, it would be better to add notes to pg_rman's documentation.

Reported-by: NTT COMWARE Corporation (Tatsuro Yamada)
