Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unstable serialization logic for node JSON parsing #2157

Merged
merged 29 commits into from
May 19, 2022

Conversation

trueadm
Copy link
Collaborator

@trueadm trueadm commented May 12, 2022

UPDATE (co-authored by @acywatson):

This is now ready to review. The typings were a bit fiddly and still may not be perfect, but I think we're in a place now where we can review, hopefully merge, and iterate.

To restate the main goal here - we're trying to provide a flexible serialization/deserialization API and in the process, to decouple our internal state representation from the serialized storage format we provide/recommend out-of-the-box. Success means we have given the core team AND maintainers of Node subclasses in userland the flexibility to implement backwards compatible (additive) changes to node serialization "schemas".

We have two slightly different approaches for TypeScript and Flow, but the main challenge is leveraging inheritance to create a good developer experience around serialization, while also creating as much type safety as we can to prevent errors that might cause data loss or loss backwards incompatibility between two versions of a serialized editor state.

Flow

Flow looks like this: try flow.

We use type spreading and inexact types to create flexible type definitions that can be refined in the subclass definition of importJSON (the argument it receives is a union of all serialized subclass types, which can be refined in the function logic). For exportJSON, we unfortunately can't specify a return type more specific than the base class type. Otherwise, the next subclass will not be able to align with the exportJSON signature in the superclass.

This approach is not perfect, but if you play around in the example above, you can see that it provides decent soundness via the importJSON refinement, which is important for trying to prevent mistakes int the adapter logic for various versions. exportJSON is looser, and we'll recommend unit tests to verify the schema for that one (I added some for our core nodes in this PR).

TypeScript

TypeScript looks like this: TypeScript Playground

The approach here is to implement a Spread-like utility type with override functionality for collisions, similar to how it works in flow. Using TypeScript, we can be more specific about the return value of exportJSON, which actually helped me catch a few bugs in my own code in this PR. On the downside, we can't leverage disjoint unions to do refinement with TypeScript (or at least I haven't figured out how) so the version property is actually far less useful. Instead, we need to use "in" type guards to refine the types.

The next steps here are:

  • Extensively test the actual end-to-end json export/parseEditorState process (via the unstable_ editor APIs)
  • Write more unit tests to verify the schemas of the other nodes we expose in other packages.
  • Documentation on how to use these APIs safely and correctly.

Extends from #2147 and #2120.

This PR introduces two APIs on all nodes. exportJSON() and importJSON. Additionally, there are two new unstable prefixed methods that aim to replace their originals in the future, they are unstable_parseEditorState on the editor object and unstable_toJSON on the EditorState object.

Another thing that PR does, it ensure that we using the public API methods on nodes rather than the private values, when importing and exporting JSON. This should help improve consistency and remove the tendency to store data that is more of an implementation detail.

@vercel
Copy link

vercel bot commented May 12, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
lexical ✅ Ready (Inspect) Visit Preview May 19, 2022 at 4:07PM (UTC)
lexical-playground ✅ Ready (Inspect) Visit Preview May 19, 2022 at 4:07PM (UTC)

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 12, 2022
Copy link
Member

@zurfyx zurfyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some thoughts:

  1. You may want to add copy-paste to the todo list above. Ideally we want to add this new format to the list which can provide more fidelity than HTML when possible (namespaces match)
  2. I've noticed that some nodes have version, others not. Added a comment to one of these occurrences
  3. Can we have a test/example that showcases multiple versions? Ideally somewhere in the code base so that TS can attest this works as intended

global.d.ts Outdated
}
}

export type Spread<T1, T2> = {[K in Exclude<keyof T1, keyof T2>]: T1[K]} & T2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are other folks going to use Spread for their custom nodes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is probably my biggest concern as well. It's probably not great DevX to require users to have to use utility types. I wonder if we can make this just an implementation detail on some of the base classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but this is sort of a limitation of TypeScript. Their team made the explicit decision to not natively support spread (or some other) syntax for this use case and they recommend using mapped types instead:

microsoft/TypeScript#13288 (comment)

I think the concern here can be easily addressed in either of two ways:

  1. We can export and alias or abstraction over this utility type like type ExtendedSerializedNode<T1, T2>. I think it's pretty normal for users to import and use a library-specific type from that library (think React.Element or something).

  2. Users can simply copy the utility type. Not ideal, but also not that big of a deal, IMO.

I don't think it's possible to make it an implementation detail in the base classes. The problem is that if you don't allow type field overrides for subclass serialized types, you can't further extend that class because the method signature can't be matched (at least in the experimentation I have done so far).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the above comment, there's been a few more useful util functions which have been added to TypeScript than just Exclude, so I think we could do something like the below instead, using Omit, if the basic aim is to exclude the properties of the superclass if they exist on the sub class:

Omit<T1, keyof T2> & T2

Feel free to merge. We can experiment later.

{
language: string | null | undefined;
type: 'code';
version: 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need version numbers? AFAIK it was proposed by the Flow team but later you pointed out you still had problems with inheritance (and refinement?)

Copy link
Contributor

@acywatson acywatson May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I mostly solved the problems with inheritance, but the trade-off is that the argument passed to importJSON will have to be a union of all the serialized types of the superclasses in order to conform to the method signature on the superclass in Flow. See the "try flow" link in the PR description for an example.

As far as refinement goes, versions are actually the solution rather than a problem. If you look at the "try flow" link in the playground, you can see how they form a disjoint union in Flow (this can't really be done in TS) and can be used to refine the type of the serializedNode argument in importJSON.

Version is less important in TypeScript, since you can't use a disjoint union to refine. Instead, you have to use type guards. More info in the PR description.

packages/lexical-link/src/index.js Outdated Show resolved Hide resolved
exportJSON<SerializedNode>(): SerializedLinkNode<SerializedNode> {
return {
...super.exportJSON(),
type: 'link',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably try to move towards only using type in the future

Great point! The reason why $isXNode works via inheritance right now is mostly due to Flow types (refines when checking prototypes). I'm not sure it's easy to switch now as it would be a major breaking change but we can discuss it offline to see what's the best way forward

Comment on lines 148 to 153
export type SerializedAutoLinkNode = {
...SerializedLinkNode,
type: 'autolink',
...
};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the Flow approach? ...Extended + invariant? Also, no version here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing version is a mistake, but yes the Flow approach can be an invariant, an if condition or whatever other method of refinement you want to use. The point is that you have to differentiate between the two potential argument types in order to access the properties of the subclass type.

Comment on lines +110 to +121
node.setFormat(serializedNode.format);
node.setIndent(serializedNode.indent);
node.setDirection(serializedNode.direction);
return node;
}

exportJSON(): SerializedElementNode {
return {
...super.exportJSON(),
checked: this.getChecked(),
type: 'listitem',
value: this.getValue(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Opinionated] Mixing static + class is confusing:

When I write importJSON I have to think of indent but when I wrote exportJSON I do not. That makes it even harder than just having static methods, because now I have to think what I have as a part of the class and what comes as part of the prototype. And I have to understand everything anyway because the importJSON requires me to understand them all

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what you're saying, but I don't understand why it's a problem or how we might improve on this situation without introducing other trade-offs.

Comment on lines +292 to +295
const parent = this.getParent();
if (parent === null) {
return this.getLatest().__indent;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we have to revisit this in general. Getters should have getLatest by default, otherwise they can lead to subtle mistake. The basic nodes like TextNode have which is the correct behavior imo (opened an issue to revisit this - #2218)

Suggested change
const parent = this.getParent();
if (parent === null) {
return this.getLatest().__indent;
}
const self = this.getLatest();
const parent = self.getParentOrThrow();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dominic did this part - I think the idea was that the parent might not be there in serialization, and we don't want to just throw in that case, we want to return the value of __indent. Not sure your suggestion is going to fix this problem.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. In serialziation, there might not be a parent (if you are serializing a detached child).

return {
...super.exportJSON(),
className: this.getClassName(),
type: 'emoji',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that type must be equal to getType I wonder if there's an abstraction we can build around this -- or maybe hardcoding is fine..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I actually mentioned this to @trueadm last night in chat. I was originally thinking why not just call the static getType method on the class during serialization, so you never have a mismatch. I came up with some reason not to do that, but I can't recall what it was. I concluded that we should use an invariant instead when the exported type doesn't match the class type, but I'm open to doing it either way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. If we move to automating getType, then we'd also need to check if the class has its own exportJSON, and doesn't inherit it from the super class. Either way works.

export type SerializedLexicalNode = {
type: string,
version: number,
...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the spread? I was hoping {...Extended, something else} would do the trick but pretty sure you've played with this extensively

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I tried to make this work multiple times/ways. IIRC, the problem comes in with inheritance. You get method signature collisions when you try to further extend an exact type.

Comment on lines +1023 to +1026
parsedEditorStateFromObject.read(() => {
const root = $getRoot();
expect(root.getTextContent()).toMatch(/Hello world/);
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK as a basic test, but I would've personally tested the nodeMap size and on particular node values. You may also want to test selection is set

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a lot more tests to write here, but I personally think this is why we mark the API unstable. We have more work to do, but I think the important part is getting a working strategy for types in both TS and Flow, which I've tried to achieve in this PR.

Co-authored-by: Gerard Rovira <zurfyx@users.noreply.github.com>
@acywatson
Copy link
Contributor

  1. You may want to add copy-paste to the todo list above. Ideally we want to add this new format to the list which can provide more fidelity than HTML when possible (namespaces match)

Yea, we originally discussed actually removing that serialization format altogether for copy and paste and just relying on HTML. However, I think the use case of copy and pasting into the same editor (e.g., you want to change the order of sections in a document you're composing) is probably common enough to warrant keeping this format.

  1. I've noticed that some nodes have version, others not. Added a comment to one of these occurrences

Fixed.

  1. Can we have a test/example that showcases multiple versions? Ideally somewhere in the code base so that TS can attest this works as intended

Sure - for now I would suggest you look at the TS and Flow examples linked in the PR description.

@acywatson acywatson merged commit 8f9a903 into main May 19, 2022
@acywatson acywatson mentioned this pull request May 19, 2022
6 tasks
@zurfyx zurfyx mentioned this pull request May 23, 2022
@trueadm trueadm deleted the unstable_parseEditorState branch June 9, 2022 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants