Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Generic ImportJSON/ExportJSON #3931

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

GermanJablo
Copy link
Contributor

@GermanJablo GermanJablo commented Feb 17, 2023

This is the second in a series of changes I proposed in #3763 to simplify Lexical, remove code duplication, and improve DX.

In short, thanks to this PR, no node needs to define an importJSON or exportJSON method.

The most important thing (and the first thing I recommend checking) are the changes in LexicalUpdate.ts (for importJSON) and in lexicalNode.ts (for exportJSON).

Pending:

  • The mode, direction and format properties mismatch between the way they are exported and imported making the algorithm more inefficient. I have a PR in mind that would improve this. Either way, the PR as it is is now functional.
  • There are currently 5 unit tests failing because exportJSON seems to work fine, it just changes the order of the keys. My question is, is it okay if I modify those tests to compare objects instead of the stringified editorState?

@vercel
Copy link

vercel bot commented Feb 17, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated
lexical ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 26, 2023 at 6:22PM (UTC)
lexical-playground ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Feb 26, 2023 at 6:22PM (UTC)

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 17, 2023
Comment on lines 312 to 338
const serializedNode2 = JSON.parse(JSON.stringify(serializedNode));
delete serializedNode2.children;
delete serializedNode2.version;
delete serializedNode2.mode;
delete serializedNode2.direction;
delete serializedNode2.format;
let node = new registeredNode.klass();
// @ts-expect-error
node.setFormat(serializedNode.format);
if ($isTextNode(node)) {
// @ts-expect-error
node.setMode(serializedNode.mode);
}

const node = nodeClass.importJSON(serializedNode);
if (
$isElementNode(node) &&
type !== 'tablecell' &&
type !== 'tablerow' &&
type !== 'table'
) {
// @ts-expect-error
node.setDirection(serializedNode.direction);
}
const prefix = '__';
const withUnderscore = Object.fromEntries(
Object.entries(serializedNode2).map(([k, v]) => [`${prefix}${k}`, v]),
);
node = Object.assign(node, withUnderscore);
Copy link
Contributor Author

@GermanJablo GermanJablo Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most important part which replaces importJSON.

As a side note, one of the ideas I have for simplifying setters and getters is to use a generic proxy that accesses getLatest() and getWritable(), so probably the underscore __ thing may not be necessary anymore. here or anywhere.

@acywatson
Copy link
Contributor

I’m not sure this makes sense. The purpose of this API is to allow custom nodes to control deserialization behavior, which allows for runtime backwards compatibility for Lexical states in storage. With this change, you’ve taken that flexibility and control away from developers and instead created a place in the core that will have to be continuously expanded and special-cased as new core node types are added.

Comment on lines 627 to 652
exportJSON() {
let serializedNode = JSON.parse(JSON.stringify(this.getLatest()));
delete serializedNode.__first;
delete serializedNode.__last;
delete serializedNode.__size;
delete serializedNode.__parent;
delete serializedNode.__next;
delete serializedNode.__prev;
delete serializedNode.__cachedText;
delete serializedNode.__key;
delete serializedNode.__dir;
serializedNode = Object.fromEntries(
Object.entries(serializedNode).map(([k, v]) => [k.slice(2), v]),
);
if ($isElementNode(this)) {
serializedNode = {children: [], ...serializedNode};
serializedNode.direction = this.getDirection();
serializedNode.format = this.getFormatType();
serializedNode.indent = this.getIndent();
}
if ($isTextNode(this)) {
serializedNode.mode = this.getMode();
}
serializedNode.type = this.getType();
serializedNode.version = 1;
return serializedNode;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the other important part, which replaces exportJSON()

@GermanJablo
Copy link
Contributor Author

GermanJablo commented Feb 17, 2023

I've updated the PR to add exportJSON. As you can see, the new exportJSON method of LexicalNode is generic, but a node could override it by inheritance if someone wanted to.

I can do the same with importJSON.

What do you think @acywatson?

@acywatson
Copy link
Contributor

I've updated the PR to add exportJSON. As you can see, the new exportJSON method of LexicalNode is generic, but a node could override it by inheritance if someone wanted to.

I can do the same with importJSON.

What do you think @acywatson?

My recommendation would be for you to review this PR: #2157

The reason serialization works like this is not because we never thought about just serializing all the properties on the node - that was basically how it was done before. The idea was to create a type-safe abstraction between the serialized types and the internal node structures to insulate against changes to those internal structure breaking old stored Lexical states.

I appreciate the effort, but I don't think we have a lot of appetite to revisit this right now.

@GermanJablo
Copy link
Contributor Author

I have found in the last commit a way to autogenerate the types of the serialized nodes automatically. As you can see in the video, it works with inheritance.

https://www.loom.com/share/4601d7d7411c4eafbfd47d2281bade52

Maybe that solves the problem?

I appreciate the effort, but I don't think we have a lot of appetite to revisit this right now.

I believe that what I propose is viable and has enormous advantages. But if you don't find value in these changes, please feel free to close the PR. I'd rather use a fork than be a nuisance :)

Comment on lines -158 to -159
// eslint-disable-next-line @typescript-eslint/no-explicit-any
[x: string]: any;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conflicts with the solution I found for serialized node types.
I don't understand why it was done in the first instance. It doesn't seem like a good practice since it gives a false illusion of type-safety, when in fact it breaks it at the beginning of the class hierarchy.

@acywatson
Copy link
Contributor

I believe that what I propose is viable and has enormous advantages. But if you don't find value in these changes, please feel free to close the PR. I'd rather use a fork than be a nuisance :)

@egonbolton You're not being a nuisance, but I'm also perfectly happy for you to use a fork - it is open source, after all :)

Let me spend some time with your work and see if I can get comfortable with the functionality in the scenarios I care about. Specifically, I need to understand what happens when someone decides to change a node to add a new property in. How does deserialization work with old states that don't have that property? What about when a new state (containing the added property) is deserialized by an old editor? How does the type system behave when the user makes changes to the public serialization interface of a node so that they're warned about the two scenarios above and can react accordingly?

I'm not saying that your proposal doesn't have advantages (they might even be enormous ones!), but it is potentially very consequential for all of our internal systems, which means I need to spend my time verifying it, which impacts my other priorities. That's the only issue here - I'm happy to consider the idea on it's merits and we're better just for having seen the idea, whether we merge it now or not.

I have some free time this weekend, so I will try to check out the branch and verify some of these scenarios and see where I get.

delete serializedNode2.format;
node.setFormat(serializedNode.format);
if (type !== 'tablecell' && type !== 'tablerow' && type !== 'table') {
node.setDirection(serializedNode.direction);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to override these nodes' setDirection to be noop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean to export the direction in those nodes right? I agree.
If that's okay with you, I'll have to modify the unit tests to include that property just like I had to do with colSpan

packages/lexical-website/docs/concepts/serialization.md Outdated Show resolved Hide resolved
'LexicalNode: Node %s does not implement .importJSON().',
this.name,
exportJSON() {
let serializedNode = JSON.parse(JSON.stringify(this.getLatest()));
Copy link
Contributor

@fantactuka fantactuka Feb 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to run it on a doc with 5k nodes, got ~35ms vs ~6ms on main branch, it's not extremely slow, but I can imagine someone to have editor.toJSON within onChange handler. Tried to replace it with looping through prefixed keys and it works better (~9ms) but it leaves more responsibility for custom nodes owners, who need to ensure their props are serializable or provide own import/export overrides. We can try dev mode invariants that will warn about non-serializable values, but won't affect production users

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent improvement, thank you very much!

Tried to replace it with looping through prefixed keys and it works better (~9ms) but it leaves more responsibility for custom nodes owners, who need to ensure their props are serializable or provide own import/export overrides.

If I'm not misunderstanding, that's true for both my implementation and the variation you've made. The purpose of this PR in principle is that it is only necessary to define serialization on non-serializable nodes.

Actually, I believe that automatic serialization could always be achieved by using something like devalue. However, I'm not sure what the implications of this are on performance or memory. Taking caption for example, I don't know if it would be good to serialize all editor properties like pendingEditorState and such. Maybe we can open a separate thread to discuss this separately?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I believe that automatic serialization could always be achieved by using something like devalue. However, I'm not sure what the implications of this are on performance or memory.

  1. Automatic serialization is not necessarily what we want - more like flexible serialization with sane defaults.

  2. Lexical core has no dependencies on any other library - we don't want to change that. Control over performance is one of the main reasons that was done.

Taking caption for example, I don't know if it would be good to serialize all editor properties like pendingEditorState and such.

I don't think it would be good. In the best case it's unnecessary bloat in editor states, in the worst case it encourages depending on implementation details that might change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I thought @acywatson

@fantactuka, did you get it to work with your snippet?

even setting direction, indent and version there are some tests that break when I implement it.

Copy link
Contributor

@fantactuka fantactuka Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fantactuka, did you get it to work with your snippet?

No, I didn't get to the point to run testes over it, can take a look closer tomorrow evening

Lexical core has no dependencies on any other library - we don't want to change that.

Also adds 2.9kb to core

We could limit auto-serialization to primitives (numbers, strings, bools, nulls) and nested editors (by just calling .toJSON on it). Everything else should require custom import/export. It should be possible to lint or dev-mode-warn (class with no import/exportJSON and non-primitive props starting with "__"). I might be wrong, but I can only recall MarkNode and its sub-classes that use arrays/maps as serializable props

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. stale-pr PRs that are closed due to staleness. Please reopen the PR if there are new updates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants