Skip to content

[Real-time collaboration] Fix race condition in compaction#75835

Draft
maxschmeling wants to merge 3 commits intotrunkfrom
fix/sync-provider-race-condition
Draft

[Real-time collaboration] Fix race condition in compaction#75835
maxschmeling wants to merge 3 commits intotrunkfrom
fix/sync-provider-race-condition

Conversation

@maxschmeling
Copy link
Contributor

@maxschmeling maxschmeling commented Feb 23, 2026

What?

This fixes a race condition with the default sync provider compaction.

Why?

The original approach of deleting all post meta and adding back newer updates resulted in a couple different potential race conditions where messages could get completely lost or incorrectly show no messages by mistake.

How?

This change just switches from a "delete all then re-add" strategy to a "targeted deletion" approach. With post meta we can't delete in one DELETE query, so we have to call delete_post_meta for every entry. This is not efficient, but is much safer.

Testing Instructions

  1. Enable Real-time Collaboration
  2. Test long enough to get compaction (50+ messages)
  3. Verify it's still working correctly

Testing the race condition directly is more difficult.

@maxschmeling maxschmeling added [Type] Bug An existing feature does not function as intended [Feature] Real-time Collaboration Phase 3 of the Gutenberg roadmap around real-time collaboration labels Feb 23, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a race condition in the real-time collaboration sync provider's compaction mechanism. The original "delete-all-then-re-add" approach had a critical race condition window where concurrent operations could lose messages or incorrectly report no messages.

Changes:

  • Switched from deleting all updates and re-adding newer ones to a targeted deletion approach that only removes updates older than the cursor
  • Eliminated the race condition by avoiding the complete deletion of all sync updates

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

* This avoids removing all, and ensures that only the targeted updates are deleted.
*/
if ( ! delete_post_meta( $post_id, self::SYNC_UPDATE_META_KEY, $envelope ) ) {
// stop deleting and return false if an update fails to delete
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment should start with a capital letter to match the style convention used throughout this file. All other inline comments in this file begin with a capital letter.

Suggested change
// stop deleting and return false if an update fails to delete
// Stop deleting and return false if an update fails to delete

Copilot uses AI. Check for mistakes.
foreach ( $all_updates as $envelope ) {
if ( $add_result && $envelope['timestamp'] >= $cursor ) {
$add_result = (bool) add_post_meta( $post_id, self::SYNC_UPDATE_META_KEY, $envelope, false );
if ( isset( $envelope['timestamp'] ) && $envelope['timestamp'] < $cursor ) {
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for isset($envelope['timestamp']) is redundant. The get_all_updates() method called at line 308 already filters out any updates that don't have a timestamp field (see lines 116-121), ensuring all returned envelopes have both 'timestamp' and 'value' keys with timestamp being an integer. This check can be simplified to just $envelope['timestamp'] < $cursor.

Suggested change
if ( isset( $envelope['timestamp'] ) && $envelope['timestamp'] < $cursor ) {
if ( $envelope['timestamp'] < $cursor ) {

Copilot uses AI. Check for mistakes.
* Use delete_post_meta with serialized value to ensure only this specific update is removed.
* This avoids removing all, and ensures that only the targeted updates are deleted.
*/
if ( ! delete_post_meta( $post_id, self::SYNC_UPDATE_META_KEY, $envelope ) ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing $meta_value results in a meta_value query, which is not performant since that column is not indexed. (On some hosts the first n characters are indexed, but these values will likely exceed reasonable values of n.)

Perhaps we do need to collect and pass meta_ids.

https://github.com/WordPress/wordpress-develop/blob/7924b71d7a2392a1ff27e09cdea6d4be809a4a34/src/wp-includes/meta.php#L463

}

return $add_result;
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not failing if delete_post_meta fails

@mindctrl
Copy link

Nice, I found this issue separately and started working on an idea to build on top of @westonruter PR here.

The idea is to use meta_id for ordering, and scope delete requests to the rows lower than the max meta_id was at the time of the fetch. I have a branch in progress here that shows the concept: https://github.com/WordPress/wordpress-develop/compare/trunk...mindctrl:wordpress-develop:try/post-meta-refactor?expand=1.

There's also a race condition in get_updates_after_cursor() where it calls get_all_updates(), which I think my branch also fixes. Curious to hear opinions on the idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Feature] Real-time Collaboration Phase 3 of the Gutenberg roadmap around real-time collaboration [Type] Bug An existing feature does not function as intended

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants