Description
This is a follow-up on a request from @roryabraham. I have investigated the options to integrate Markdown parsing a little further, and here are my findings:
- Markdown parsing is hard. While XML and HTML markup are not context-sensitive, Markdown is. So implementing my own tokenizer is a no-go for a production-ready deliverable (although I might try the challenge as a hobby at some point).
- Changing the structure of the DOM would bring too many breaking changes, so using remarkjs is a no-go.
- The solution would be a middle ground: use an open-source, well-tested Markdown tokenizer and plug it into the htmlparser2 Tokenizer class to emit a DOM.
Choosing a Tokenizer
I've forked this benchmark to add micromark (via remark-html), which I found very well structured and solid. Below are my findings (Intel i7-8809G, 32 GB of RAM, Node.js 14.16.0).
[Benchmark charts: Average Ops per second, Minmax parse time, Average Throughput]
Conclusion
markdown-it is the clear winner, since React Native has no official WebAssembly support, which rules out WebAssembly-based parsers. Other advantages:
- Great ecosystem with many plugins and GFM support including emojis;
- Safe by default;
- Great maintenance metrics (5 open issues).
Implementation Plan
Get inspiration from MarkdownIt.Renderer: consume a token tree from MarkdownIt.parse and invoke the corresponding htmlparser2 callbacks while walking the tree.
I'll also need some help to assess which features you want to enable for Expensify.cash.
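The walk described in this plan could look roughly like the sketch below. It is only an illustration under stated assumptions: `MdToken` and `DomCallbacks` are local stand-ins that mirror a few fields of a markdown-it token (`type`, `tag`, `nesting`, `content`, `attrs`, `children`) and three htmlparser2-style handler callbacks (`onopentag`, `ontext`, `onclosetag`). The names `emitTokens`, `toAttribs` and `tok` are hypothetical, and the token stream is written by hand rather than produced by MarkdownIt.parse.

```typescript
// Stand-in for the markdown-it token fields this sketch relies on (assumption).
interface MdToken {
  type: string;
  tag: string;
  nesting: 1 | 0 | -1; // 1 = opening, 0 = self-contained, -1 = closing
  content: string;
  attrs: [string, string][] | null;
  children: MdToken[] | null;
}

// Stand-in for the subset of htmlparser2-style handler callbacks used here (assumption).
interface DomCallbacks {
  onopentag(name: string, attribs: Record<string, string>): void;
  ontext(data: string): void;
  onclosetag(name: string): void;
}

// Convert the [name, value] pairs of a token into an attribute map.
function toAttribs(attrs: [string, string][] | null): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of attrs ?? []) out[name] = value;
  return out;
}

// Walk the token stream and translate each token into handler callbacks.
function emitTokens(tokens: MdToken[], handler: DomCallbacks): void {
  for (const token of tokens) {
    if (token.type === "inline") {
      // Inline containers carry their content as child tokens.
      emitTokens(token.children ?? [], handler);
    } else if (token.type === "text") {
      handler.ontext(token.content);
    } else if (token.nesting === 1) {
      handler.onopentag(token.tag, toAttribs(token.attrs));
    } else if (token.nesting === -1) {
      handler.onclosetag(token.tag);
    } else if (token.tag !== "") {
      // Self-contained token: open, optional text content, close.
      handler.onopentag(token.tag, toAttribs(token.attrs));
      if (token.content !== "") handler.ontext(token.content);
      handler.onclosetag(token.tag);
    }
  }
}

// Hypothetical helper to build tokens by hand for the demo.
function tok(
  type: string,
  tag: string,
  nesting: 1 | 0 | -1,
  content = "",
  children: MdToken[] | null = null
): MdToken {
  return { type, tag, nesting, content, attrs: null, children };
}

// Hand-written token stream approximating "Hello *world*".
const sample: MdToken[] = [
  tok("paragraph_open", "p", 1),
  tok("inline", "", 0, "", [
    tok("text", "", 0, "Hello "),
    tok("em_open", "em", 1),
    tok("text", "", 0, "world"),
    tok("em_close", "em", -1),
  ]),
  tok("paragraph_close", "p", -1),
];

// Record the callbacks in order so the walk can be inspected.
const events: string[] = [];
emitTokens(sample, {
  onopentag: (name) => events.push(`open:${name}`),
  ontext: (data) => events.push(`text:${data}`),
  onclosetag: (name) => events.push(`close:${name}`),
});
```

A real implementation would also have to deal with hidden tokens, fenced code, raw HTML and similar cases, which this sketch ignores.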
Package Design
I need to think of a new package design since I don't want @native-html/core to depend directly on markdown-it.
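One possible arrangement, sketched here purely as an assumption and not as a settled design, is for the core package to know only a small frontend interface, with the markdown-it adapter living in a separate package that implements it. All names below (`MarkupFrontend`, `DomCallbacks`, `collectEvents`, `plainTextFrontend`) are hypothetical, and the plain-text frontend merely stands in for a real Markdown adapter.

```typescript
// Callbacks the core would expose to any frontend (hypothetical shape).
interface DomCallbacks {
  onopentag(name: string, attribs: Record<string, string>): void;
  ontext(data: string): void;
  onclosetag(name: string): void;
}

// The only contract the core depends on; it never imports a Markdown library.
interface MarkupFrontend {
  parse(source: string, handler: DomCallbacks): void;
}

// "Core" side: drives any frontend and records what it emits.
function collectEvents(frontend: MarkupFrontend, source: string): string[] {
  const events: string[] = [];
  frontend.parse(source, {
    onopentag: (name) => events.push(`open:${name}`),
    ontext: (data) => events.push(`text:${data}`),
    onclosetag: (name) => events.push(`close:${name}`),
  });
  return events;
}

// "Adapter" side: a toy frontend that wraps each non-empty line in a paragraph.
// A Markdown adapter package would implement the same interface.
const plainTextFrontend: MarkupFrontend = {
  parse(source, handler) {
    for (const line of source.split(/\n+/)) {
      if (line === "") continue;
      handler.onopentag("p", {});
      handler.ontext(line);
      handler.onclosetag("p");
    }
  },
};

const demoEvents = collectEvents(plainTextFrontend, "a\n\nb");
```

With this split, only the adapter package would list markdown-it as a dependency.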
Testing Strategy
The parser will be tested against the official commonmark-spec repository.
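A spec-driven test loop could be sketched as follows. This assumes the examples are available as objects with `markdown`, `html`, `example` and `section` fields; that shape, the `runSpec` helper, the toy renderer and the two sample entries (which are made up, not copied from the spec) are assumptions for illustration only.

```typescript
// Assumed shape of one spec example.
interface SpecExample {
  markdown: string;
  html: string;
  example: number;
  section: string;
}

// Return every example whose rendered output differs from the expected HTML.
function runSpec(
  examples: SpecExample[],
  render: (markdown: string) => string
): SpecExample[] {
  return examples.filter((ex) => render(ex.markdown) !== ex.html);
}

// Toy renderer: treats the whole input as a single paragraph.
const toyRender = (markdown: string): string => `<p>${markdown.trim()}</p>\n`;

// Illustrative entries only; numbering and content are not taken from the spec.
const sampleSpec: SpecExample[] = [
  { markdown: "hello\n", html: "<p>hello</p>\n", example: 1, section: "Paragraphs" },
  { markdown: "# hi\n", html: "<h1>hi</h1>\n", example: 2, section: "ATX headings" },
];

const failures = runSpec(sampleSpec, toyRender);
for (const f of failures) {
  console.log(`example ${f.example} (${f.section}) failed`);
}
```

Since the parser emits a DOM rather than an HTML string, the `render` function passed in would need to serialize that DOM before comparing.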