Simplified markdown parser implemented in Rust. Converts markdown into HTML.
This implementation can handle these markdown features:
- Headers
- Thematic breaks
- Lists:
  - Unordered lists
  - Ordered lists
- Code blocks
- Paragraphs
- Links
- Bold and italic text styles
For the features above to work, some limitations apply:
- Text indentation is not supported. Any line that starts with a space or a tab will be converted to a `<p>` element;
- Nested lists are not supported;
- Nesting of elements in general is not well supported.
The project uses the pest library for parsing. pest builds a DAG from the markdown string, and this DAG is later used to construct the HTML string. To build the DAG, pest uses the grammar provided in `markdown.pest`.
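The second half of that pipeline, turning an already-built tree into an HTML string, can be sketched without pest. This is only an illustration: the `Block` and `Inline` types, the `render` functions, and the HTML escaping are assumptions made for the example, not the project's actual types or output format.

```rust
// Hypothetical node types for this sketch; the real project walks pest's parse tree.
#[derive(Debug)]
enum Inline {
    Text(String),
    Bold(Vec<Inline>),
    Italic(Vec<Inline>),
    Link { text: String, href: String },
}

#[derive(Debug)]
enum Block {
    Header { level: u8, content: Vec<Inline> },
    ThematicBreak,
    UnorderedList(Vec<Vec<Inline>>),
    OrderedList(Vec<Vec<Inline>>),
    CodeBlock(String),
    Paragraph(Vec<Inline>),
}

// Escape characters that are special in HTML (an assumption of this sketch).
fn escape(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

// Render a sequence of inline nodes, recursing into bold and italic spans.
fn render_inlines(items: &[Inline]) -> String {
    items
        .iter()
        .map(|item| match item {
            Inline::Text(t) => escape(t),
            Inline::Bold(inner) => format!("<strong>{}</strong>", render_inlines(inner)),
            Inline::Italic(inner) => format!("<em>{}</em>", render_inlines(inner)),
            Inline::Link { text, href } => {
                format!("<a href=\"{}\">{}</a>", escape(href), escape(text))
            }
        })
        .collect()
}

// Render list items inside the given list tag ("ul" or "ol").
fn render_list(tag: &str, items: &[Vec<Inline>]) -> String {
    let body: String = items
        .iter()
        .map(|item| format!("<li>{}</li>", render_inlines(item)))
        .collect();
    format!("<{0}>{1}</{0}>", tag, body)
}

// Render a single block node.
fn render_block(block: &Block) -> String {
    match block {
        Block::Header { level, content } => {
            format!("<h{0}>{1}</h{0}>", level, render_inlines(content))
        }
        Block::ThematicBreak => "<hr>".to_string(),
        Block::UnorderedList(items) => render_list("ul", items),
        Block::OrderedList(items) => render_list("ol", items),
        Block::CodeBlock(code) => format!("<pre><code>{}</code></pre>", escape(code)),
        Block::Paragraph(content) => format!("<p>{}</p>", render_inlines(content)),
    }
}

// Render the whole document, one block per output line.
fn render(blocks: &[Block]) -> String {
    blocks.iter().map(render_block).collect::<Vec<_>>().join("\n")
}
```

Keeping rendering separate from parsing means the HTML output can be tested on hand-built trees, independently of the grammar.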
The parsing logic has two main steps:
- Parsing all input into block elements.
- Parsing inline elements of each block.
In the grammar file, the block rule is responsible for the first step, while line and line_pars are responsible for the second step.
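The two steps can be illustrated with a small hand-written sketch. It does not use pest or the project's grammar: `RawBlock`, `parse_blocks`, `parse_inline`, and `to_html` are hypothetical names, only headers, thematic breaks (written exactly as `---`), paragraphs, and bold text are handled, and the exact output format is an assumption.

```rust
// Hypothetical block kinds for this sketch (a small subset of the real grammar).
#[derive(Debug, PartialEq)]
enum RawBlock {
    Header(usize, String),
    ThematicBreak,
    Paragraph(String),
}

// Push the pending paragraph lines, if any, as one paragraph block.
fn flush(para: &mut Vec<&str>, blocks: &mut Vec<RawBlock>) {
    if !para.is_empty() {
        blocks.push(RawBlock::Paragraph(para.join(" ")));
        para.clear();
    }
}

// Step 1: split the whole input into block elements, leaving inline markup raw.
fn parse_blocks(input: &str) -> Vec<RawBlock> {
    let mut blocks = Vec::new();
    let mut para: Vec<&str> = Vec::new();
    for line in input.lines() {
        let hashes = line.chars().take_while(|&c| c == '#').count();
        if line.trim().is_empty() {
            // A blank line ends the current paragraph.
            flush(&mut para, &mut blocks);
        } else if (1..=6).contains(&hashes) && line[hashes..].starts_with(' ') {
            flush(&mut para, &mut blocks);
            blocks.push(RawBlock::Header(hashes, line[hashes + 1..].trim().to_string()));
        } else if line == "---" {
            flush(&mut para, &mut blocks);
            blocks.push(RawBlock::ThematicBreak);
        } else {
            // Everything else, including lines that start with a space or a tab,
            // falls through to a paragraph.
            para.push(line);
        }
    }
    flush(&mut para, &mut blocks);
    blocks
}

// Step 2: parse inline elements inside one block's text (only **bold** here).
fn parse_inline(text: &str) -> String {
    let mut out = String::new();
    let mut rest = text;
    while let Some(start) = rest.find("**") {
        let after = &rest[start + 2..];
        match after.find("**") {
            Some(end) => {
                out.push_str(&rest[..start]);
                out.push_str("<strong>");
                out.push_str(&after[..end]);
                out.push_str("</strong>");
                rest = &after[end + 2..];
            }
            // An unmatched opener is kept as literal text.
            None => break,
        }
    }
    out.push_str(rest);
    out
}

// Run both steps and join the rendered blocks with newlines.
fn to_html(input: &str) -> String {
    parse_blocks(input)
        .into_iter()
        .map(|block| match block {
            RawBlock::Header(n, t) => format!("<h{0}>{1}</h{0}>", n, parse_inline(&t)),
            RawBlock::ThematicBreak => "<hr>".to_string(),
            RawBlock::Paragraph(t) => format!("<p>{}</p>", parse_inline(&t)),
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because step 1 never looks inside a line, an indented line such as ` # x` is not recognized as a header and ends up as a paragraph, which mirrors the indentation limitation listed earlier.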
The grammar defines the following rules:
- space: whitespace characters
- newline: newline characters
- blankline: one or more empty lines (only spaces allowed)
- markdown: whole file structure
- block: block element
- special_block: all blocks except paragraph
- header: header element
- header_hashtags: used to select header variant
- thematic_break: thematic break element
- list: combined rule for both list variants
- unordered_list: unordered list element
- ordered_list: ordered list element
- code_block: code block element
- paragraph: paragraph element
- line: captures all elements to the end of line
- line_pars: same as line, but also parses it into inline elements
- inline: inline element
- link: link element
- link_text: captures link text
- link_href: captures link href
- bold: bold element
- italic: italic element
- text: works like line, but stops when other inline elements can be captured
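How the inline rules might interact can be sketched in plain Rust: at each position the link, bold, and italic alternatives are tried first, and plain text is consumed only until one of them matches. This is a hand-written approximation, not the pest grammar itself; the `Inline` type, the function names, and the `[text](href)`, `**bold**`, `*italic*` delimiters are assumptions made for the example.

```rust
// Hypothetical inline node type for this sketch.
#[derive(Debug, PartialEq)]
enum Inline {
    Link { text: String, href: String },
    Bold(String),
    Italic(String),
    Text(String),
}

// Match `delim ... delim` at the start of `s`; returns (content, bytes consumed).
fn delimited<'a>(s: &'a str, delim: &str) -> Option<(&'a str, usize)> {
    let body = s.strip_prefix(delim)?;
    let end = body.find(delim)?;
    if end == 0 {
        return None; // an empty span is not treated as an element
    }
    Some((&body[..end], end + 2 * delim.len()))
}

// Match `[text](href)` at the start of `s`; returns (node, bytes consumed).
fn link(s: &str) -> Option<(Inline, usize)> {
    let body = s.strip_prefix('[')?;
    let close = body.find("](")?;
    let href_len = body[close + 2..].find(')')?;
    let node = Inline::Link {
        text: body[..close].to_string(),
        href: body[close + 2..close + 2 + href_len].to_string(),
    };
    // '[' + text + "](" + href + ')'
    Some((node, 1 + close + 2 + href_len + 1))
}

// Try every non-text inline element, in priority order.
fn special(s: &str) -> Option<(Inline, usize)> {
    link(s)
        .or_else(|| delimited(s, "**").map(|(t, n)| (Inline::Bold(t.to_string()), n)))
        .or_else(|| delimited(s, "*").map(|(t, n)| (Inline::Italic(t.to_string()), n)))
}

// Split a line into inline elements; text runs end where another element starts.
fn parse_inline(line: &str) -> Vec<Inline> {
    let mut out = Vec::new();
    let mut text = String::new();
    let mut i = 0;
    while i < line.len() {
        if let Some((node, used)) = special(&line[i..]) {
            if !text.is_empty() {
                out.push(Inline::Text(std::mem::take(&mut text)));
            }
            out.push(node);
            i += used;
        } else {
            // No element starts here: consume one character as plain text.
            let ch = line[i..].chars().next().unwrap();
            text.push(ch);
            i += ch.len_utf8();
        }
    }
    if !text.is_empty() {
        out.push(Inline::Text(text));
    }
    out
}
```

Unclosed markers such as a lone `**` simply fall through to the text branch, so malformed input degrades to plain text instead of failing.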