13.0.0 (micromark)

@wooorm released this 14 Oct 09:13
ea4fd81

This is a giant change for remark. It replaces the 5+ year old internals with a new low-level parser: micromark. The old internals have served billions of users well over the years, but markdown has changed over that time. micromark comes with 100% CommonMark (and GFM as an extension) compliance, and (WIP) docs on parsing rules for how to tokenize markdown with a state machine. micromark, and micromark in remark, is a good base for the future.

Migration

  • Update all the remark* packages you are using in package.json
  • Did you use GFM (tables and so on, which used to be the default)? Add remark-gfm
  • Try running it all. If it fails, you are probably using a syntax extension which hasn’t been updated yet. Issues are already open on the trackers of those plugins
  • Compare the previous output to the new output: anything wrong? See the changelog below, which might cover it. Otherwise, it’s probably because remark is now CommonMark (or GFM) compliant.
    Please check against those projects whether the new behavior is correct (with the CommonMark Dingus, or by creating a gist for GFM).
    Does remark not match those reference parsers? Please open an issue here!

Changes

remark-parse

remark-parse now defers its work to micromark and mdast-util-from-markdown. micromark is a new, small, complete, and CommonMark compliant low-level markdown parser. mdast-util-from-markdown turns micromark’s tokens into the syntax tree that was used before and still is: mdast. Extensions to remark-parse work differently: they’re a two-part act, with one part extending the syntax (micromark) and one part extending the tree (mdast). See for example micromark-extension-footnote and mdast-util-footnote.

  • change: commonmark is no longer an option — it’s the default
  • move: gfm is no longer an option — moved to remark-gfm
  • remove: pedantic is no longer an option — this legacy and buggy flavor of markdown is no longer widely used
  • remove: blocks is no longer an option — it’s no longer suggested to change the internal list of HTML “block” tag names

remark-stringify

remark-stringify now defers its work to mdast-util-to-markdown. It’s a new and better serializer with powerful features to ensure serialized markdown represents the syntax tree (mdast), no matter what plugins do. Extensions to it work differently: see for example mdast-util-footnote.

options
  • change: commonmark is no longer an option, it’s the default
  • change: emphasis now defaults to *
  • change: bullet now defaults to *
  • move: gfm is no longer an option — moved to remark-gfm
  • move: tableCellPadding — moved to remark-gfm
  • move: tablePipeAlign — moved to remark-gfm
  • move: stringLength — moved to remark-gfm
  • remove: pedantic is no longer an option — this legacy and buggy flavor of markdown is no longer widely used
  • remove: entities is no longer an option — with CommonMark there is almost never a need to use character references, as character escapes are preferred
  • new: quote — you can now prefer single quotes (') over double quotes (") in titles

Changes to output / the tree

All of these are for CommonMark compatibility. They’re all fixes. Most of them are inconsequential to most folks.

  • notable: references (as in, links [text][id] and images ![alt][id]) are no longer present as such in the syntax tree if they don’t have a corresponding definition ([id]: example.com). The reason for this is that CommonMark requires [text *emphasis start][undefined] emphasis end* to be emphasis.
  • notable: it is no longer possible to use two blank lines to separate two lists, or a list and indented code. CommonMark prohibits it. As a workaround, use an empty comment (<!---->) to end a list
  • inconsequential: whitespace at the start and end of lines in paragraphs is now ignored
  • inconsequential: autolinks such as <mailto:foobarbaz> are now correctly parsed, and the scheme is part of the tree
  • inconsequential: indented code can now follow a block quote without a blank line
  • inconsequential: trailing indented blank lines after indented code are no longer part of that code
  • inconsequential: character references and escapes are no longer present as separate text nodes
  • inconsequential: character references which HTML allows but CommonMark doesn’t, such as &copy without the semicolon, are no longer recognized
  • inconsequential: the indent field is no longer available on position
  • fix: multiline setext headings
  • fix: lazy lists
  • fix: attention (emphasis, strong)
  • fix: tabs
  • fix: empty alt on images is now present as an empty string
  • …plus a ton of other minor previous differences from CommonMark

Thanks

Thanks to Salesforce, Gatsby, Vercel, and Netlify, and our other backers for sponsoring the work on micromark!
To support our continued work, back us on OpenCollective!