Commit 2cf120e

Fix init of WAL page header at startup
If the primary is started at an LSN within the first page of a 16 MB WAL segment, the "long XLOG page header" at the beginning of the segment was not initialized correctly. That has gone unnoticed, because under normal circumstances, nothing looks at the page header. The WAL that is streamed to the safekeepers starts at the new record's LSN, not at the beginning of the page, so that bogus page header didn't propagate elsewhere, and a primary server doesn't normally read the WAL it has written. That is just as well, because the contents of the page would be bogus anyway: it wouldn't contain any of the records before the LSN where the new record is written.

There are two cases, however, in which a primary does read its own WAL:

1. When there are two-phase transactions in prepared state at checkpoint. The checkpointer reads the two-phase state from the XLOG_XACT_PREPARE record, and writes it to a file in pg_twophase/.

2. Logical decoding reads the WAL starting from the replication slot's restart LSN.

This PR fixes the problem with two-phase transactions. For that, it's sufficient to initialize the page header correctly. The checkpointer only needs to read XLOG_XACT_PREPARE records that were generated after the server startup, so it's still OK that older WAL is missing or bogus.

I have not investigated whether we have a problem with logical decoding, however. Let's deal with that separately.
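The condition described in the message can be illustrated with a little arithmetic. The following is only a sketch: the `demo_` names and `DEMO_` constants are hypothetical stand-ins, and the 8192-byte page size is an assumption (the 16 MB segment size is the one quoted above). It shows how to find the start of the page containing a given LSN and how to decide whether that page is the first one in its segment, which is the page that carries the long header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants; real values come from the server build and configuration. */
#define DEMO_BLCKSZ   8192u                    /* assumed WAL page size */
#define DEMO_SEG_SIZE (16u * 1024u * 1024u)    /* 16 MB segment, as in the message */

/* Start of the WAL page that contains lsn. */
static uint64_t
demo_page_begin(uint64_t lsn)
{
    return lsn - (lsn % DEMO_BLCKSZ);
}

/*
 * True when lsn falls within the first page of a segment, i.e. the page
 * that is expected to start with a long page header.
 */
static bool
demo_in_first_page_of_segment(uint64_t lsn)
{
    /* Segments are a whole number of pages, so page starts align with segment starts. */
    assert(DEMO_SEG_SIZE % DEMO_BLCKSZ == 0);
    return demo_page_begin(lsn) % DEMO_SEG_SIZE == 0;
}
```

Under these assumptions, an LSN of 0x3000028 lies in the page starting at 0x3000000, which is also a segment boundary, so this is the affected case. An LSN of 0x3002028 lies in the second page of the same segment and only needs a short header.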
1 parent dadd6fe commit 2cf120e

1 file changed

+19
-7
lines changed

src/backend/access/transam/xlogrecovery.c

Lines changed: 19 additions & 7 deletions
@@ -1691,25 +1691,37 @@ FinishWalRecovery(void)
 	}
 	else
 	{
-		int offs = endOfLog % XLOG_BLCKSZ;
-		char *page = palloc0(offs);
-		XLogRecPtr pageBeginPtr = endOfLog - offs;
-		int lastPageSize = ((pageBeginPtr % wal_segment_size) == 0) ? SizeOfXLogLongPHD : SizeOfXLogShortPHD;
-
-		XLogPageHeader xlogPageHdr = (XLogPageHeader) (page);
+		int offs = endOfLog % XLOG_BLCKSZ;
+		XLogRecPtr pageBeginPtr = endOfLog - offs;
+		bool isLongHeader = (pageBeginPtr % wal_segment_size) == 0;
+		int lastPageSize = isLongHeader ? SizeOfXLogLongPHD : SizeOfXLogShortPHD;
+		char *page = palloc0(offs);
+		XLogPageHeader xlogPageHdr = (XLogPageHeader) page;
 
 		xlogPageHdr->xlp_pageaddr = pageBeginPtr;
 		xlogPageHdr->xlp_magic = XLOG_PAGE_MAGIC;
 		xlogPageHdr->xlp_tli = recoveryTargetTLI;
+		xlogPageHdr->xlp_info = 0;
 		/*
 		 * If we start writing with offset from page beginning, pretend in
 		 * page header there is a record ending where actual data will
 		 * start.
 		 */
 		xlogPageHdr->xlp_rem_len = offs - lastPageSize;
-		xlogPageHdr->xlp_info = (xlogPageHdr->xlp_rem_len > 0) ? XLP_FIRST_IS_CONTRECORD : 0;
+		if (xlogPageHdr->xlp_rem_len > 0)
+			xlogPageHdr->xlp_info |= XLP_FIRST_IS_CONTRECORD;
 		readOff = XLogSegmentOffset(pageBeginPtr, wal_segment_size);
 
+		if (isLongHeader)
+		{
+			XLogLongPageHeader longHdr = (XLogLongPageHeader) page;
+
+			longHdr->xlp_sysid = GetSystemIdentifier();
+			longHdr->xlp_seg_size = wal_segment_size;
+			longHdr->xlp_xlog_blcksz = XLOG_BLCKSZ;
+
+			xlogPageHdr->xlp_info |= XLP_LONG_HEADER;
+		}
 		result->lastPageBeginPtr = pageBeginPtr;
 		result->lastPage = page;
 		elog(LOG, "Continue writing WAL at %X/%X", LSN_FORMAT_ARGS(xlogreader->EndRecPtr));
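To see why the extra fields matter, here is a standalone sketch of the same initialization paired with a simplified check of the kind a WAL reader might apply to the first page of a segment. Everything prefixed `demo_` or `DEMO_` is hypothetical: the structs only mirror the fields touched in the diff, the magic value is a placeholder, and the check is a reduced illustration rather than the server's actual validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLCKSZ              8192u                  /* assumed page size */
#define DEMO_SEG_SIZE            (16u * 1024u * 1024u)  /* 16 MB segment */
#define DEMO_PAGE_MAGIC          0xD000u                /* placeholder, not a real magic */
#define DEMO_FIRST_IS_CONTRECORD 0x0001
#define DEMO_LONG_HEADER         0x0002

/* Simplified stand-ins for the short and long page headers. */
typedef struct DemoPageHeader
{
    uint16_t magic;
    uint16_t info;
    uint32_t tli;
    uint64_t pageaddr;
    uint32_t rem_len;
} DemoPageHeader;

typedef struct DemoLongPageHeader
{
    DemoPageHeader std;
    uint64_t sysid;
    uint32_t seg_size;
    uint32_t blcksz;
} DemoLongPageHeader;

/* Fill in the header of the page containing end_of_log; page must be zeroed. */
static void
demo_init_last_page(DemoLongPageHeader *page, uint64_t end_of_log,
                    uint32_t tli, uint64_t sysid)
{
    uint32_t offs = (uint32_t) (end_of_log % DEMO_BLCKSZ);
    uint64_t page_begin = end_of_log - offs;
    bool     is_long = (page_begin % DEMO_SEG_SIZE) == 0;
    uint32_t hdr_size = (uint32_t) (is_long ? sizeof(DemoLongPageHeader)
                                            : sizeof(DemoPageHeader));

    assert(offs >= hdr_size);   /* the sketch assumes writing resumes after the header */

    page->std.magic = DEMO_PAGE_MAGIC;
    page->std.tli = tli;
    page->std.pageaddr = page_begin;
    page->std.info = 0;
    /* Pretend a record ends exactly where the new data will start. */
    page->std.rem_len = offs - hdr_size;
    if (page->std.rem_len > 0)
        page->std.info |= DEMO_FIRST_IS_CONTRECORD;

    if (is_long)
    {
        page->sysid = sysid;
        page->seg_size = DEMO_SEG_SIZE;
        page->blcksz = DEMO_BLCKSZ;
        page->std.info |= DEMO_LONG_HEADER;
    }
}

/* Simplified reader-side check: a segment's first page must carry a valid long header. */
static bool
demo_page_header_ok(const DemoLongPageHeader *page, uint64_t page_begin,
                    uint64_t sysid)
{
    if (page->std.magic != DEMO_PAGE_MAGIC || page->std.pageaddr != page_begin)
        return false;
    if (page_begin % DEMO_SEG_SIZE != 0)
        return true;            /* not the first page: a short header is enough */
    if (!(page->std.info & DEMO_LONG_HEADER))
        return false;
    return page->sysid == sysid &&
           page->seg_size == DEMO_SEG_SIZE &&
           page->blcksz == DEMO_BLCKSZ;
}
```

In this sketch, a header that sets only the continuation flag on the first page of a segment, as the removed lines did, fails the check, while the header produced after the change passes it.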
