Skip to content

Commit 56dd719

Browse files
authored
Work around pg_replication_slot_advance xmin maintenance bug. (#815)
When the Postgres function pg_replication_slot_advance is called on a slot that used to be active (using streaming replication) and is now inactive (maintained manually) then the xmin of the slot is not maintained anymore. We believe this is a bug in Postgres, and it has been reported here: https://www.postgresql.org/message-id/flat/VI1PR83MB01897A49AE5C54A82F841D8999B09%40VI1PR83MB0189.EURPRD83.prod.outlook.com To avoid situations where Postgres won't VACUUM dead rows and otherwise maintains itself correctly, when a former primary node is available again and joind a pg_auto_failover as a standby, we drop the pre-existing replication slot and create new ones (where this time xmin is NULL).
1 parent 746afb2 commit 56dd719

File tree

5 files changed

+77
-2
lines changed

5 files changed

+77
-2
lines changed

src/bin/pg_autoctl/fsm.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -381,8 +381,8 @@ KeeperFSMTransition KeeperFSM[] = {
381381
* When an old primary gets back online and reaches draining/draining, if a
382382
* failover is on-going then have it join the selection process.
383383
*/
384-
{ DRAINING_STATE, REPORT_LSN_STATE, COMMENT_DRAINING_TO_REPORT_LSN, &fsm_report_lsn },
385-
{ DEMOTED_STATE, REPORT_LSN_STATE, COMMENT_DEMOTED_TO_REPORT_LSN, &fsm_report_lsn },
384+
{ DRAINING_STATE, REPORT_LSN_STATE, COMMENT_DRAINING_TO_REPORT_LSN, &fsm_report_lsn_and_drop_replication_slots },
385+
{ DEMOTED_STATE, REPORT_LSN_STATE, COMMENT_DEMOTED_TO_REPORT_LSN, &fsm_report_lsn_and_drop_replication_slots },
386386

387387
/*
388388
* When adding a new node and there is no primary, but there are existing

src/bin/pg_autoctl/fsm.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -64,6 +64,7 @@ bool fsm_start_maintenance_on_standby(Keeper *keeper);
6464
bool fsm_restart_standby(Keeper *keeper);
6565

6666
bool fsm_report_lsn(Keeper *keeper);
67+
bool fsm_report_lsn_and_drop_replication_slots(Keeper *keeper);
6768
bool fsm_fast_forward(Keeper *keeper);
6869
bool fsm_prepare_cascade(Keeper *keeper);
6970
bool fsm_follow_new_primary(Keeper *keeper);

src/bin/pg_autoctl/fsm_transition.c

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1025,6 +1025,24 @@ fsm_rewind_or_init(Keeper *keeper)
10251025
}
10261026
}
10271027

1028+
/*
1029+
* This node is now demoted: it used to be a primary node, it's not
1030+
* anymore. The replication slots that used to be maintained by the
1031+
* streaming replication protocol are now going to be maintained "manually"
1032+
* by pg_autoctl using pg_replication_slot_advance().
1033+
*
1034+
* There is a problem in pg_replication_slot_advance() in that it only
1035+
* maintains the restart_lsn property of a replication slot, it does not
1036+
* maintain the xmin of it. When re-using the pre-existing replication
1037+
* slots, we want to have a NULL xmin, so we drop the slots, and then
1038+
* create them again.
1039+
*/
1040+
if (!primary_drop_all_replication_slots(postgres))
1041+
{
1042+
/* errors have already been logged */
1043+
return false;
1044+
}
1045+
10281046
return true;
10291047
}
10301048

@@ -1265,6 +1283,29 @@ fsm_report_lsn(Keeper *keeper)
12651283
}
12661284

12671285

1286+
/*
1287+
* fsm_report_lsn_and_drop_replication_slots is used when a former primary node
1288+
* has been demoted and gets back online during the secondary election.
1289+
*
1290+
* As Postgres pg_replication_slot_advance() function does not maintain the
1291+
* xmin property of the slot, we want to create new inactive slots now rather
1292+
* than continue using previously-active (streaming replication) slots.
1293+
*/
1294+
bool
1295+
fsm_report_lsn_and_drop_replication_slots(Keeper *keeper)
1296+
{
1297+
LocalPostgresServer *postgres = &(keeper->postgres);
1298+
1299+
if (!fsm_report_lsn(keeper))
1300+
{
1301+
/* errors have already been reported */
1302+
return false;
1303+
}
1304+
1305+
return primary_drop_all_replication_slots(postgres);
1306+
}
1307+
1308+
12681309
/*
12691310
* When the selected failover candidate does not have the latest received WAL,
12701311
* it fetches them from another standby, the first one with the most LSN

src/bin/pg_autoctl/primary_standby.c

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -511,6 +511,38 @@ primary_drop_replication_slot(LocalPostgresServer *postgres,
511511
}
512512

513513

514+
/*
515+
* primary_drop_all_replication_slots drops all the replication slots found on
516+
* a node.
517+
*
518+
* When a node has been demoted, the replication slots that used to be
519+
* maintained by the streaming replication protocol are now going to be
520+
* maintained "manually" by pg_autoctl using pg_replication_slot_advance().
521+
*
522+
* There is a problem in pg_replication_slot_advance() in that it only
523+
* maintains the restart_lsn property of a replication slot, it does not
524+
* maintain the xmin of it. When re-using the pre-existing replication slots,
525+
* we want to have a NULL xmin, so we drop the slots, and then create them
526+
* again.
527+
*/
528+
bool
529+
primary_drop_all_replication_slots(LocalPostgresServer *postgres)
530+
{
531+
NodeAddressArray otherNodesArray = { 0 };
532+
533+
log_info("Dropping replication slots (to reset their xmin)");
534+
535+
if (!postgres_replication_slot_create_and_drop(postgres, &otherNodesArray))
536+
{
537+
log_error("Failed to drop replication slots on the local Postgres "
538+
"instance, see above for details");
539+
return false;
540+
}
541+
542+
return true;
543+
}
544+
545+
514546
/*
515547
* postgres_replication_slot_create_and_drop drops the replication slots that
516548
* belong to dropped nodes on a primary server, and creates replication slots

src/bin/pg_autoctl/primary_standby.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,7 @@ bool primary_create_replication_slot(LocalPostgresServer *postgres,
7171
bool primary_drop_replication_slot(LocalPostgresServer *postgres,
7272
char *replicationSlotName);
7373
bool primary_drop_replication_slots(LocalPostgresServer *postgres);
74+
bool primary_drop_all_replication_slots(LocalPostgresServer *postgres);
7475
bool primary_set_synchronous_standby_names(LocalPostgresServer *postgres);
7576
bool postgres_replication_slot_create_and_drop(LocalPostgresServer *postgres,
7677
NodeAddressArray *nodeArray);

0 commit comments

Comments
 (0)