
Commit ebfc178

knizhnik authored and tristan957 committed
Add --sync-safekeepers starting standalone walproposer to sync safekeepers (#439).
It is intended to solve the following problems:

a) The chicken-or-the-egg problem: compute postgres needs a data directory with non-rel files, which are downloaded from the pageserver by calling basebackup@LSN. This LSN is not arbitrary: it must include all previously committed transactions, and it is defined through consensus voting, which happens... in walproposer, a part of the compute node.

b) Just determining such an LSN is not enough: we must also actually commit it and make sure there is a safekeeper that knows this LSN is committed, so that WAL before it can be streamed to the pageserver; otherwise basebackup will hang waiting for WAL. Advancing commit_lsn without playing the consensus game is impossible, so a speculative "let's just poll safekeepers, learn the start LSN of the future epoch and run basebackup" won't work.

Currently --sync-safekeepers is considered completed when 1) at least a majority of safekeepers and 2) *all* safekeepers with a live connection to walproposer have switched to the new epoch and advanced commit_lsn, allowing basebackup to proceed. Condition 2) limits availability, but that's because we currently don't have a mechanism defining which safekeeper should stream WAL into the pageserver.
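
The completion rule described above can be sketched in C roughly as follows. This is only an illustration of the two stated conditions, not the code from this commit: the `SafekeeperState` struct, its fields, and the `sync_is_complete` helper are hypothetical names introduced here, and LSNs and epochs are modelled as plain 64-bit integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-safekeeper view; NOT the real walproposer struct. */
typedef struct
{
    bool     connected;   /* live connection to walproposer? */
    uint64_t epoch;       /* epoch the safekeeper has switched to */
    uint64_t commit_lsn;  /* commit_lsn known to this safekeeper */
} SafekeeperState;

/*
 * Returns true when both conditions from the commit message hold:
 *   1) at least a majority of all safekeepers are synced, and
 *   2) every safekeeper with a live connection is synced.
 * "Synced" here means: connected, on new_epoch, and with
 * commit_lsn advanced to at least target_lsn (an assumption of
 * this sketch about what "advance commit_lsn" requires).
 */
static bool
sync_is_complete(const SafekeeperState *sk, size_t n,
                 uint64_t new_epoch, uint64_t target_lsn)
{
    size_t synced = 0;

    for (size_t i = 0; i < n; i++)
    {
        bool ok = sk[i].connected &&
                  sk[i].epoch == new_epoch &&
                  sk[i].commit_lsn >= target_lsn;

        if (sk[i].connected && !ok)
            return false;       /* condition 2 violated */
        if (ok)
            synced++;
    }
    return synced >= n / 2 + 1; /* condition 1: strict majority */
}
```

With three safekeepers, two synced and one disconnected satisfies the rule, while two synced and one connected but lagging does not. That second case is the availability cost of condition 2) mentioned in the message.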
1 parent c942aa3 commit ebfc178

3 files changed: +254 -60 lines changed

src/backend/main/main.c

Lines changed: 3 additions & 0 deletions
@@ -35,6 +35,7 @@
 #include "common/username.h"
 #include "port/atomics.h"
 #include "postmaster/postmaster.h"
+#include "replication/walproposer.h"
 #include "storage/spin.h"
 #include "tcop/tcopprot.h"
 #include "utils/help_config.h"
@@ -209,6 +210,8 @@ main(int argc, char *argv[])
         WalRedoMain(argc, argv,
                     NULL, /* no dbname */
                     strdup(get_user_name_or_exit(progname))); /* does not return */
+    else if (argc > 1 && strcmp(argv[1], "--sync-safekeepers") == 0)
+        WalProposerSync(argc, argv);
     else
         PostmasterMain(argc, argv); /* does not return */
     abort(); /* should not get here */
