clusterer: Fix sync edge-case causing SHM accumulation
This patch fixes a specific sync sequence leading to indefinite
accumulation of BIN packets and SHM exhaustion, as follows:

1) Sync Receiver node times out the sync in sync_check_timer()
    (default timeout is 5s, by no means difficult to achieve)
2) Sync Donor continues to send SYNC packets
    (on Receiver side, cap state is updated: PROGRESSING)
3) ! Sync Donor node loses link-state mid-sync, because the pings share
    the same TCP connection (and are delayed, queued way in the back)
4) Sync Donor fails & drops remaining SYNC + SYNC-END packets, as link
    is down.  Due to this, the handle_sync_end(!is_timeout) procedure
    is never run on the Receiver, so the PROGRESSING flag is never
    removed -> indefinite SHM buffering / memory leak

Thanks to Răzvan Crainea for helping with code & troubleshooting here!

(cherry picked from commit e5b2317)
liviuchircu committed Oct 24, 2024
1 parent 5170a0b commit 0f86eeb
Showing 1 changed file with 12 additions and 4 deletions.
16 changes: 12 additions & 4 deletions modules/clusterer/sync.c
@@ -595,10 +595,18 @@ void handle_sync_packet(bin_packet_t *packet, int packet_type,
 	bin_pop_int(packet, &data_version);
 
 	lock_get(cluster->lock);
-	if (cap->flags & CAP_SYNC_IN_PROGRESS)
-		was_in_progress = 1;
-	/* buffer other types of packets during sync */
-	cap->flags |= CAP_SYNC_IN_PROGRESS;
+
+	/* if the cap's state is already OK (e.g. donor aborted sync mid-way,
+	 * then sync_check_timer() timed out the sync back to CAP_STATE_OK),
+	 * avoid forcing a state where repl packets queue indefinitely! */
+	if (!(cap->flags & CAP_STATE_OK)) {
+		if (cap->flags & CAP_SYNC_IN_PROGRESS)
+			was_in_progress = 1;
+
+		/* buffer other types of packets during sync */
+		cap->flags |= CAP_SYNC_IN_PROGRESS;
+	}
+
 	cap->last_sync_pkt = get_ticks();
 	lock_release(cluster->lock);
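
For readers who want to see the effect of the guard in isolation, here is a minimal standalone sketch that replays the sequence from the commit message against the old and the new flag handling. It is not the module's code: the flag values, `struct cap`, and every function name below are hypothetical, and the timeout model (set `CAP_STATE_OK`, clear `CAP_SYNC_IN_PROGRESS`) is an assumption inferred from the comment in the patch.

```c
/* Hypothetical flag values; the real definitions live in the clusterer module. */
#define CAP_STATE_OK          (1u << 0)
#define CAP_SYNC_IN_PROGRESS  (1u << 1)

/* Hypothetical stand-in for the capability structure. */
struct cap {
    unsigned int flags;
};

/* Pre-patch handling: every SYNC packet forces the in-progress flag. */
int on_sync_packet_old(struct cap *cap)
{
    int was_in_progress = (cap->flags & CAP_SYNC_IN_PROGRESS) != 0;

    cap->flags |= CAP_SYNC_IN_PROGRESS;
    return was_in_progress;
}

/* Patched handling: leave the flags alone once the state is already OK. */
int on_sync_packet_new(struct cap *cap)
{
    int was_in_progress = 0;

    if (!(cap->flags & CAP_STATE_OK)) {
        if (cap->flags & CAP_SYNC_IN_PROGRESS)
            was_in_progress = 1;
        cap->flags |= CAP_SYNC_IN_PROGRESS;
    }
    return was_in_progress;
}

/* ASSUMPTION: the sync timeout marks the capability OK and stops buffering. */
void on_sync_timeout(struct cap *cap)
{
    cap->flags |= CAP_STATE_OK;
    cap->flags &= ~CAP_SYNC_IN_PROGRESS;
}

/* Replays steps 1-4 of the commit message.
 * Returns non-zero if the capability is left buffering with nothing
 * remaining that could clear the flag. */
int replay_sequence(int patched)
{
    struct cap cap = { CAP_SYNC_IN_PROGRESS };  /* a sync is under way */

    on_sync_timeout(&cap);                      /* step 1: receiver times out */

    if (patched)                                /* step 2: late SYNC packet */
        on_sync_packet_new(&cap);
    else
        on_sync_packet_old(&cap);

    /* steps 3-4: donor link drops, SYNC-END never arrives,
     * so no later event clears CAP_SYNC_IN_PROGRESS */
    return (cap.flags & CAP_SYNC_IN_PROGRESS) != 0;
}
```

Under these assumptions, `replay_sequence(0)` leaves the in-progress flag set (the leak described above), while `replay_sequence(1)` does not, because the late SYNC packet no longer re-arms buffering on a capability that is already OK.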
