-
Thanks for jumping in on this project. I really appreciate the help here. So first let me clarify (and introduce some jargon) to make sure that all the WP contributors are aligned and understand. When you say "save document", for WordPress that corresponds to the different "entities" that can be saved through the REST API using the editor. Here's a non-exhaustive list of document "types" or entities:
Most of these entities are defined here in the frontend: gutenberg/packages/core-data/src/entities.js (lines 23 to 232 in 3a310c4). Please correct me if I'm wrong, but my understanding is that for each one of these entities, the proposal above requires 1- updating the database to attach a list of Yjs documents to it. Some of the questions that need answering are:
I would love as much feedback and discussion here as possible in order for us to make the right decisions, as it seems that it's going to be an invasive change regardless of the approach. cc @WordPress/gutenberg-core
It seems that the work here involves a lot of backend work. It would be ideal if @dmonad could pair with an experienced WP contributor to help with all the server-side changes. Anyone available to help there?
-
Welcome, @dmonad! Excited to see where this goes.
Undo/redo: It'd be great to improve this regardless of real-time collaboration. We have some long-standing issues about undo management and its connection to block attributes and third-party blocks (#8119).
Revisions: We have a whole roadmap for post revisions that would be great to align with architecturally (#61161).
Making Yjs a first-class citizen in the editor makes a lot of sense. I'm a lot more worried about server modifications. We've gone to great lengths to introduce blocks in WordPress without modifying the preexisting server representation of data. There are backwards and forwards-compatibility reasons for that, as well as ecosystem integration, long-term survival of the content, data sync issues with duplication, db size concerns, etc. There are myriad ways in which WP content is manipulated server side by hooks and filters, sometimes overwriting source data in the process.
Our own explorations have shown that a pure client approach indeed doesn't seem very feasible, even with the server orchestrating WebRTC handshakes, but I'm curious how we can reduce the server responsibility to the minimum necessary to avoid bloating the representations we need to deal with and to ensure the foundation can be widely deployed. It stands to reason that a WebSocket configuration, for servers capable of offering it, would provide the best concurrent experience on top of it.
Also for context, some of the "sync engine" work is traced here: #52593
-
Thank you for working on this one @dmonad! I'm a bit concerned about all the potentially breaking changes that need to happen in order to make this work.
Supporting multiple authors per revision (and therefore per entity) may be a breaking change for WordPress, and we'll need to devise a good plan to approach it. Will it be possible for the entities to continue having 1 author but multiple collaborators? In other words, it may make sense to introduce the concept of collaborator as a separate abstraction from the entity author. Would that make sense in the Yjs world? Also, what happens with the linear revision history once we have multiple collaborators?
I'm also concerned about the backend implementation. The Heartbeat API and post locking were introduced in WordPress 3.6 because many data loss problems were caused by multiple users editing the same post simultaneously. This has historically been a limitation due to the technologies used (PHP and flavors of SQL), which don't support concurrent editing. Autosaving is also reliant on these limitations, FWIW. Considering that we may still need to have a lightweight version of concurrent editing (potentially without a dedicated WebRTC server), I'm concerned about how it will work and what limitations from the user's perspective we might have to impose to account for those pre-existing limitations, and if those won't hinder the overall collaboration experience. Do you have any experience with such limitations before? Is it possible that the most basic Apache shared server with just PHP / MySQL will be limited and unable to use any collaborative features?
My biggest concern is that from this proposal, it feels like Yjs will become the centerpiece, and WordPress will have to bend its database, structure, and flow around Yjs as a single source of truth.
Backward compatibility is a significant limitation: fundamental architectural change in WordPress can be challenging to achieve in a 20+-year-old ecosystem with hundreds of thousands of plugins, themes, custom setups, etc. I'm worried about the number of architectural changes we might have to make to WordPress to support Yjs as a first-class citizen and the high chances of breaking backward compatibility in the meantime. What are your thoughts or prior experience with integrating Yjs into legacy technologies where backward compatibility was required?
-
Hi Kevin, glad to have you on board, welcome! I've been stewing on this post for a while because I'm not sure exactly how best to respond or express what I want. Collaboration is an obvious need, and it's nice to see someone with your background working on the problem.
Thank you! This is the best place to discuss it, though we might also want to consider opening a Trac ticket (at https://core.trac.wordpress.org) at some point to host more discussion with a broader audience. When I read these posts, and I've read many discussions about real-time collaboration in Gutenberg, going back many years, I constantly feel like it would help to differentiate different kinds of collaboration, needs, and timelines. What are the collaborative flows that people managing content in WordPress are interested in?
It's my understanding that Yjs can probably handle all of these situations as long as everyone does everything properly all the time, including knowing that they need to learn about the new Yjs primitives and how to interact with them. Each of these, and probably more, have distinct timelines and UI needs, I think. Any project attempting to comprehensively introduce collaboration will probably benefit by clarifying above all what modes are supported, what flows are going to look like, and what scopes will be cut or limited, as well as how the inevitable failure will be resolved when interrupted. This is where I start, and as with all things WordPress, how can we build best-effort systems when we know that we don't have control over how other code will interact. This is where I've been very tempted by a differential-synchronization model, explored in the already-linked Core Trac issue.
It feels like it might make sense to draw a boundary between real-time collaboration within the editor, conflict management from independent agents, and long-term collaborative work. I know that I'm oversimplifying here, but it does seem like revisions have a role to play, where an editor could "start" from a given revision and develop a diff of its changes against that revision. Upon save, the server can attempt to reconcile the changes. And maybe this would still provide the opportunity for simultaneously-opened editors to find each other and share app state more than they are sharing a distributed document. I would think that whoever hits "save" can be the legitimate author of that change.
LAMP doesn't by default, but I have long encouraged folks to strive for pseudo-realtime updates via HTTP long polling, or even simply via normal HTTP polling. There may be novel ways to harness What I love about communicating over HTTP first is that it kind of forces some helpful designs that are easy to gloss over with sockets. Beyond that, they have valuable properties for long-lived sessions, routing, and backend updates. I'm guessing you're familiar with this, but for the sake of the discussion I wanted to toss it out there. Effectively I'm asking if there's a chance we could employ something like Yjs for the highly-interactive-but-also-rarer collaborative sessions in the editor while harnessing simpler systems for longer-term conflict management. Simplenote is pretty effective with the second approach, leaning on revisions and human intervention when conflicts do arise (for example, the server cannot merge the changes, so the client must decide if it wants to modify its contents or overwrite what's been updated on the server - "lost" content remains in the version history).
This reminds me of something I shared internally (p5j4vm-2J3#comment-4930) in 2019, which is that "the more perfect our conflict-resolution is the worse our user-experience becomes." A system with zero latency and perfect merging forms eventually-consistent documents, but in the context of multiple editors with inherent and intentional conflicts the result might be what neither author prefers. For example, changing While it may not be viable to add a simple version to every document, I wonder how far a base version number plus a content hash of the content currently-known to exist at that version would go. For example, this complicated case:
In most cases the server could update and acknowledge, saying "glad to see you writing; new revision version is 7 at
Now if there's a polling HTTP connection with the server, the server can also announce "Hey connected edit session: someone posted revision 6 with the following diff against revision 5 at
This is all very broad and hand-wavy, and not meant to be a roadmap or plan for collaboration, but I wonder what your thoughts are about such a broad system design. How can we get more out of what we already have without adding more, or adding the least amount possible? And how can we handle the fact that we know we can't control everything? Coincidentally, I just saw the paper for Eg-walker and it looks interesting from some of these angles. Maybe there's more to learn at the algorithmic level.
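To make the "base version number plus content hash" idea a little more concrete, here is a minimal sketch of the server-side check, written in TypeScript only for illustration (the real backend would be PHP). Everything in it is an assumption rather than an existing WordPress API: `RevisionStore`, `SaveRequest`, `contentHash`, and `overwriteOutOfBand` are hypothetical names, and the toy hash stands in for a real digest.

```typescript
// Toy content hash (polynomial rolling hash). Hypothetical stand-in:
// a real system would likely use a cryptographic digest instead.
function contentHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

interface Revision { version: number; content: string; }

// What an editor session sends on save: the revision it started from,
// the hash of the content it believes exists at that revision, and its new content.
interface SaveRequest { baseVersion: number; baseHash: string; content: string; }

type SaveResult =
  | { status: "accepted"; version: number; hash: string }
  | { status: "conflict"; current: Revision };

class RevisionStore {
  private history: Revision[] = [];
  private head: Revision;

  constructor(initialContent: string) {
    this.head = { version: 1, content: initialContent };
  }

  current(): Revision { return { ...this.head }; }

  pastRevisions(): Revision[] { return [...this.history]; }

  // Simulates a plugin, filter, or direct database write that changes the
  // content without creating a new revision.
  overwriteOutOfBand(content: string): void {
    this.head = { version: this.head.version, content };
  }

  save(req: SaveRequest): SaveResult {
    const stale = req.baseVersion !== this.head.version;
    const changedUnderneath = req.baseHash !== contentHash(this.head.content);
    if (stale || changedUnderneath) {
      // The server does not try to merge here; it hands back what it has
      // so the client (or a human) can decide what to do.
      return { status: "conflict", current: this.current() };
    }
    this.history.push(this.head); // nothing is lost: old content stays in history
    this.head = { version: this.head.version + 1, content: req.content };
    return { status: "accepted", version: this.head.version, hash: contentHash(req.content) };
  }
}
```

In this sketch the hash is what catches the case where other code rewrote the content without bumping the version; a version number alone would miss that. What the client does with a `conflict` result (merge, overwrite, ask the user) is deliberately left open.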
-
An update on my progress. TLDR: just watch the last video below.
I talked to several people and I think the consensus is that we can't restrict the POST API. Thanks for all your input!! I'm new to WP & PHP & Gutenberg, so there was a lot to catch up on. If I understand correctly, the argument is that there are existing plugins that manipulate posts after they are submitted via the REST API (using filters).
A short overview of the problems that we need to deal with: We are basically trying to keep different sources of truth in sync: 1) The backend representing posts in a database. 2) The clients representing posts in Yjs. We want real-time collaboration. But PHP can't really do WebSockets right now. Additionally, WP can't speak Yjs, unless we port Yjs to PHP (there are various language ports of Yjs, but they are all based on our C/Rust implementation of Yjs, which we can't use in PHP). Then we have the additional complexity that any plugin could change post content at any point in time without letting other clients know what exactly they changed. Any write to the database is a complete overwrite, not keeping any history, which is a requirement for automatic conflict resolution. Phew…
A simpler statement of the problem:
Since I last posted, I've been investigating how we could make WP collaborative without disrupting the existing APIs. I'm currently working on a prototype that keeps the Yjs document in sync with manipulations of the post content. The Yjs document and some additional data is stored in a
We can't trust that all clients can communicate via WebRTC (or that a WebSocket server is set up). So our "baseline" must work only with REST requests, which all WP instances support. I prepared a "baseline" demo that shows that we can sync using only the HTTP APIs. It even syncs if content in the database is manipulated directly. The demo runs only on HTTP requests, but I set the autosave interval to 3 seconds. This demo will work on all WP installations.
autosave.webm
To show that we get actual conflict resolution when three different peers manipulate content concurrently (two browser windows, one direct db manipulation), I disabled autosave so that we only get conflict resolution when saving a post. Most importantly, this demo shows that collab editing prevents content loss through concurrent save. Instead, you will actually merge the changes. You can still use the revision history to jump to a previous revision.
actual-conflict-resolution.webm
Now, we can add y-webrtc into the mix to get realtime collaboration for users that are able to establish a p2p WebRTC connection (it's a progressive enhancement). For this demo, I set the autosave interval to a reasonable 10 seconds.
collab-using-webrtc.webm
The encoded Yjs document - which we should retain for future sessions - is stored in a comment tag in the post content
While this will definitely add some additional overhead to the stored post size because it now contains the encoded Yjs document (we can get the collab overhead down to roughly the size of the HTML content), I believe that this would be a nice non-disrupting solution for adding collaborative editing to WP. There are several details that I need to work out before making a full proposal. In the meantime, I'm looking for feedback.
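As a rough illustration of storing an encoded document in a comment tag inside the post content, here is a small TypeScript sketch. The marker name `yjs-state` and the helper functions are hypothetical, and the prototype's actual format is not shown in this thread. Hex encoding is used only to keep the example dependency-free; it doubles the byte size, so a real implementation would presumably pick something more compact.

```typescript
// Hypothetical marker format: <!-- yjs-state:HEX -->
// Hex output never contains "--", so it cannot terminate the HTML comment early.
const STATE_MARKER = /\n?<!-- yjs-state:([0-9a-f]*) -->/;

function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

function fromHex(hex: string): Uint8Array {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Returns the post content without the embedded state comment.
function stripState(content: string): string {
  return content.replace(STATE_MARKER, "");
}

// Appends the encoded state, replacing any previously embedded copy.
function embedState(content: string, state: Uint8Array): string {
  return stripState(content) + "\n<!-- yjs-state:" + toHex(state) + " -->";
}

// Returns the embedded state, or null if the content carries none
// (for example, because other code rewrote the post from scratch).
function extractState(content: string): Uint8Array | null {
  const match = STATE_MARKER.exec(content);
  return match ? fromHex(match[1]) : null;
}
```

With Yjs, the bytes passed to `embedState` would typically come from `Y.encodeStateAsUpdate(doc)`, and the bytes returned by `extractState` could be fed to `Y.applyUpdate(doc, bytes)`. Code that does not know about the marker just sees an HTML comment.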
-
Hello Gutenberg community 👋,
In the coming months, I want to help to make the Gutenberg editor more collaborative. Automattic is kind enough to sponsor my work.
A little about me: I'm Kevin, the author of Yjs - a popular framework for building collaborative applications. I'm also the author of several collab extensions for other popular editors like ProseMirror, Quill, and CodeMirror.
I learned from experience that collaborative editing is quite an invasive feature that touches many aspects of the application.
Ideally, I'd like to make Yjs a first-class citizen of the Gutenberg editor, which custom blocks can use to offer their own collaborative experience (e.g., a drawing app using tldraw).
I'm currently meeting with people in this space to learn more about Gutenberg.
I opened this discussion for three reasons:
WebRTC
The current collaborative implementation uses WebRTC (using y-webrtc) to connect peers directly with each other. The idea is that the backend doesn't need to understand "real-time collaborative updates", the clients can simply sync directly with each other.
I advise against this because..
Ideally, the WordPress instance would be responsible for distributing document updates (every keystroke is synced to the backend, like in Google Docs). However, the backend is currently limited to simple REST calls, and it doesn't understand Yjs documents.
I got a lot of pushback when I initially suggested that we should simply put a (nodejs) server in-between WordPress and the client for the collaborative experience. We want to offer collaborative editing to as many users as possible.
Milestone 1 - collaborative editing using the existing backend.
My first milestone is to enhance the existing backend to understand that there can be conflicts that will be resolved eventually.
We could teach the backend that every document is associated with a list of Yjs documents.
When a user "saves" a document (through autosave), it will send the Yjs document alongside the JSON representation to the backend. In case of a conflict, we might end up with multiple "conflicting" Yjs documents on the backend. The client will be able to merge these Yjs documents.
The server will publish changes to the document through the existing autosave functionality. This progressive enhancement will give all users a stable (albeit slow) collaborative experience. We may still put a websocket server in-between WordPress and the client to get a faster, more scalable experience that doesn't require autosave.
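One possible reading of "a list of Yjs documents per entity" is sketched below in TypeScript. This is an assumption about how such a store could behave, not the proposed implementation: `CollabStore`, `StoredDoc`, and the `mergedIds` parameter are hypothetical. The point it tries to show is that the backend can treat the encoded documents as opaque bytes and leave merging to the clients.

```typescript
interface StoredDoc { id: number; update: Uint8Array; }

class CollabStore {
  private nextId = 1;
  private docs = new Map<number, StoredDoc[]>();

  // Returns every encoded document stored for an entity.
  // More than one entry means there are conflicts no client has merged yet.
  load(entityId: number): StoredDoc[] {
    return [...(this.docs.get(entityId) ?? [])];
  }

  // Stores a client's encoded document. `mergedIds` lists the stored documents
  // the client had already merged into `update`; those are replaced.
  // Documents the client never saw are kept alongside the new one.
  save(entityId: number, update: Uint8Array, mergedIds: number[]): StoredDoc {
    const merged = new Set(mergedIds);
    const kept = (this.docs.get(entityId) ?? []).filter((d) => !merged.has(d.id));
    const doc: StoredDoc = { id: this.nextId++, update };
    this.docs.set(entityId, [...kept, doc]);
    return doc;
  }
}
```

If two clients both load the same single document and save concurrently, the store ends up holding two entries. The next client to load them could combine the bytes on its side (with Yjs, for instance via `Y.mergeUpdates`) and save the result with both ids in `mergedIds`, bringing the list back to one.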
Protocol