Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework lists #364

Merged
merged 1 commit into from
Oct 14, 2018
Merged

Rework lists #364

merged 1 commit into from
Oct 14, 2018

Conversation

wooorm
Copy link
Member

@wooorm wooorm commented Oct 6, 2018

Hi folks! 👋

After some thinking about lists, such as syntax-tree/mdast-util-to-hast#23 and #349, and the changes in syntax-tree/mdast#24, I reworked the lists as they are currently handled by remark-parse and remark-stringify. Below you can see the changes, but the most importent one is that I propose to remove loose on list and on listItem, instead replacing them by spread.

This PR additionally closes GH-350, GH-332, and GH-358.

/cc Could you review this @Hamms, as you spend some time thinking about this as well, and @ikatyang, as this will affect prettier?

Spread

So, previously we had loose. Any list item that was followed by a blank line between in and its next sibling, or if it had a blank line between any of its children, would result in that list item to receive loose: true, and otherwise loose: false.
A list would have loose: true if any of its children had loose: true, and loose: false otherwise.

This was based on the loose definition of CommonMark, which means it is meant for HTML (namely, in a loose list, all <p> tags are output, whereas in a tight list they are not).

This was okay, but wasn’t really useful for either markdown (remark-stringify) or html (mdast-util-to-hast), as both need different mechanisms (remark needs to know where to place blank lines, in the list, and in the items; mdast-util-to-hast needs to know whether to output <p> tags).

In this new solution, we break from the commonmark definition, by introducing a new term:

  • Lists have spread: true if there is a blank line between any of its children (not in them), in which case remark-stringify will join all children with blank lines between them
  • Items have spread: true if there is a blank line between any of its children (not in them), in which case remark-stringify will join all children with blank lines between them

This decouples the two, which makes for a simpler AST, but it comes with the downside that mdast-util-to-hast, when stringifying an item, needs to calculate whether its parent list or any of its siblings has spread: true, to see whether it needs to generate <p> tags.

This is good though, because the tree stable. Say we injected a new item into a list, that would previously create an unstable tree (stringifying it and parsing it again, would result in a different tree).

Changes

See c600489.

remark-parse:

* Remove `loose` fields from `list`, `listItem`
* Add support for empty list-items in any non-pedantic mode
* Add `spread` to `list`, to signal that its items should have blank
  lines between them, set to true if any blank line exists between
  items
* Add `spread` to `listItem`, to signal that its children should have
  blank lines between them
* Fix bug where two lists, if they are both ordered or both not
  ordered, was seen in commonmark as one list

remark-stringify:

* Add support for missing `ordered` field
* Add support for missing `start` field
* Add support for `list.spread` (default: false)
* Add support for `listItem.spread` (default: true)
* Fix bug where two lists, if they are both ordered or both not
  ordered, and in commonmark, were not properly stringified,
  resulting in a new parse seeing them as one list
* Fix bug where a list followed by an indented code block, in
  commonmark, was not correctly stringified, resulting in a new
  parse seeing the code as content of the list
* Remove trailing white-space on empty list-items
* Add support for stringifying list-items without parent
@ikatyang
Copy link
Contributor

ikatyang commented Oct 9, 2018

This looks great! 👍

But I think we still need the old loose field (maybe added by another plugin) to identify if two ASTs are the same from the perspective of CommonMark.

@wooorm
Copy link
Member Author

wooorm commented Oct 9, 2018

@ikatyang Of course! I think we can do it as some tiny utility!

@Hamms
Copy link
Contributor

Hamms commented Oct 10, 2018

I love the idea of decoupling the two, and it makes sense that that would be necessary to best serve both sets of interests.

It does seem - if I am correctly understanding this change - that this isn't really breaking from the commonmark definition in any significant way, just changing the time at which the definition is applied. We still apply the same rules to determine whether items or lists are loose or tight, and when rendering to HTML a list's tightness will still be determined by either its own tightness or that of its children, just now that is being determined when we consume the AST rather than when we generate it. Is that correct?

If so, it's not clear to me that it's best to rename the field from loose to spread; can you explain a bit more what your thinking is there?

@wooorm
Copy link
Member Author

wooorm commented Oct 11, 2018

@Hamms Thanks for reviewing!

I love the idea of decoupling the two, and it makes sense that that would be necessary to best serve both sets of interests.

Perfect!

If so, it's not clear to me that it's best to rename the field from loose to spread; can you explain a bit more what your thinking is there?

I meant a break away from commonmark terminology, and I think if these properties do not really do what CommonMark calls “loose”, then it makes more sense to have a different word for it!

@Hamms
Copy link
Contributor

Hamms commented Oct 11, 2018

if these properties do not really do what CommonMark calls “loose”, then it makes more sense to have a different word for it!

Agreed! I'm just having a hard time convincing myself one way or the other whether or not this is still doing what commonmark calls "loose" 🙃

A list is loose if any of its constituent list items are separated by blank lines, or if any of its constituent list items directly contain two block-level elements with a blank line between them.

In the current approach say that a list node is treated as being loose if it or any of its list items is labeled with spread: true. If we stick with the label loose over spread, then we say that a list node is treated as being loose if it or any of its list items is labeled with loose: true.

I think the thing I'm not clear on is whether we want a property on a node to imply that the node is that property, or that it's labeled with that property. I can see it being confusing if a node that is labeled with loose: false can be treated the same way a a node labeled with loose: true, depending on its children. But I can also see it being confusing that the property spread is used to determine a behavior known as loose.

Of course, either way I think this rework is great! I'm interested in the terminology change because the distinction seems subtle, but that also means it will likely be fine either way.

@wooorm
Copy link
Member Author

wooorm commented Oct 13, 2018

Agreed! I'm just having a hard time convincing myself one way or the other whether or not this is still doing what commonmark calls "loose" 🙃

It’s doing different things but it could be mapped to the same looseness definition!

In the current approach say that a list node is treated as being loose if it or any of its list items is labeled with spread: true. If we stick with the label loose over spread, then we say that a list node is treated as being loose if it or any of its list items is labeled with loose: true.

Currently loose on a list is a combination of its own looseness and that of its children, potentially creating that inconsistent tree. With this change, it or it’s children could have spread, and its looseness could be calculated from that, always being consistent.

@wooorm wooorm merged commit aab3c3e into master Oct 14, 2018
@wooorm wooorm deleted the list-fields branch October 14, 2018 15:22
@wooorm wooorm changed the title Rework lists (request for comments) Rework lists Oct 14, 2018
wooorm added a commit to syntax-tree/mdast that referenced this pull request Oct 21, 2018
wooorm added a commit to syntax-tree/mdast-util-to-hast that referenced this pull request Nov 11, 2018
Related to syntax-tree/mdast#4.
Related to remarkjs/remark#349.
Related to remarkjs/remark#350.
Related to remarkjs/remark#364.

Closes GH-23.

Co-authored-by: Titus Wormer <tituswormer@gmail.com>
wooorm added a commit to syntax-tree/mdast-util-toc that referenced this pull request Nov 10, 2019
@hessj1
Copy link

hessj1 commented Dec 15, 2020

@ikatyang Of course! I think we can do it as some tiny utility!

Is there a utility to allow for loose I am having issue after upgrading from v4 to 5@latest with my lists not being parsed into list items. like it used to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

[CommonMark] false positive for list.loose caused by trailing newline with whitespaces
4 participants