AmandaFI/Markdown-Parser

Parser and interpreter that read Markdown and convert it to HTML

Application Overview

This project consists of a parser and an interpreter that work together to read Markdown and transform it into HTML.

Parser

The parser was implemented in TypeScript using the parser combinator technique, which consists of combining simpler parsers to form more complex ones. Put simply, a parser is a function that receives a string as input and returns structured data if the string matches a certain condition; otherwise it returns an error. All parsers were implemented from scratch (no external parser libraries were used), and a collection of tests was written to verify their behavior. The tests use the Deno built-in test runner and are located in the parsersTests.test.ts file.
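To illustrate the technique, here is a minimal sketch of a few combinators. All names in it (ParseResult, Parser, char, andThen, orElse, many, map, hashes) are hypothetical and chosen for this example; the repository's actual types and helpers may differ.

```typescript
// Hypothetical names throughout: an illustration of the technique, not the repository's code.

// A parse result is either a success (value plus unconsumed input) or a failure (error message).
type ParseResult<T> =
  | { ok: true; value: T; rest: string }
  | { ok: false; error: string };

// A parser is a function from an input string to a parse result.
type Parser<T> = (input: string) => ParseResult<T>;

// Primitive parser: matches an exact string at the start of the input.
const char = (expected: string): Parser<string> => (input) =>
  input.startsWith(expected)
    ? { ok: true, value: expected, rest: input.slice(expected.length) }
    : { ok: false, error: `expected '${expected}'` };

// Combinator: runs two parsers in sequence and pairs their values.
const andThen = <A, B>(first: Parser<A>, second: Parser<B>): Parser<[A, B]> => (input) => {
  const a = first(input);
  if (!a.ok) return a;
  const b = second(a.rest);
  if (!b.ok) return b;
  return { ok: true, value: [a.value, b.value], rest: b.rest };
};

// Combinator: tries the first parser and falls back to the second on failure.
const orElse = <T>(first: Parser<T>, second: Parser<T>): Parser<T> => (input) => {
  const result = first(input);
  return result.ok ? result : second(input);
};

// Combinator: applies a parser zero or more times and collects the values.
const many = <T>(parser: Parser<T>): Parser<T[]> => (input) => {
  const values: T[] = [];
  let rest = input;
  for (;;) {
    const result = parser(rest);
    if (!result.ok || result.rest === rest) break; // stop on failure or when no input is consumed
    values.push(result.value);
    rest = result.rest;
  }
  return { ok: true, value: values, rest };
};

// Combinator: transforms the value of a successful parse.
const map = <A, B>(parser: Parser<A>, fn: (value: A) => B): Parser<B> => (input) => {
  const result = parser(input);
  return result.ok ? { ok: true, value: fn(result.value), rest: result.rest } : result;
};

// Usage: count the leading hashes of a line by combining `many`, `char` and `map`.
const hashes = map(many(char("#")), (matched) => matched.length);
console.log(hashes("### Title")); // { ok: true, value: 3, rest: " Title" }
```

Each combinator returns a new parser, so larger parsers such as one for a complete heading can be assembled from these small pieces without any of them knowing about the others.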

The Markdown parser accepts Markdown text as input and outputs an object similar to an Abstract Syntax Tree (AST) containing the parsed elements. The structure of the generated object is detailed in the Parser Output section.

Interpreter

The interpreter, also implemented in TypeScript, accepts the object generated by the parser as input and outputs HTML. As explained in the Parser Output section, each element in the input object has a defined type. Using TypeScript's discriminated unions, the interpreter can easily identify which action to perform in order to transform each element into an HTML element.
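The following sketch shows how a discriminated union can drive that dispatch. The node shapes (MdNode and its fields) and the function names (render, renderAll) are assumptions made for this example and are not taken from the repository; HTML escaping is left out to keep it short.

```typescript
// Hypothetical AST node types; the project's real structure may differ.
// The shared `type` field is the discriminant of the union.
type MdNode =
  | { type: "Text"; content: string }
  | { type: "Italic"; children: MdNode[] }
  | { type: "Bold"; children: MdNode[] }
  | { type: "Heading"; level: number; children: MdNode[] }
  | { type: "Link"; text: string; url: string }
  | { type: "Paragraph"; children: MdNode[] };

// Renders a list of nodes and concatenates the resulting HTML.
const renderAll = (nodes: MdNode[]): string => nodes.map(render).join("");

// Renders one node. Inside each case, TypeScript narrows `node` to the
// matching variant, so only that variant's fields are accessible.
function render(node: MdNode): string {
  switch (node.type) {
    case "Text":
      return node.content;
    case "Italic":
      return `<em>${renderAll(node.children)}</em>`;
    case "Bold":
      return `<strong>${renderAll(node.children)}</strong>`;
    case "Heading":
      return `<h${node.level}>${renderAll(node.children)}</h${node.level}>`;
    case "Link":
      return `<a href="${node.url}">${node.text}</a>`;
    case "Paragraph":
      return `<p>${renderAll(node.children)}</p>`;
    default: {
      // Exhaustiveness check: compilation fails here if a variant is not handled above.
      const unreachable: never = node;
      throw new Error(`Unhandled node: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Usage with a small hand-written tree.
const html = render({
  type: "Heading",
  level: 2,
  children: [
    { type: "Text", content: "Hi " },
    { type: "Bold", children: [{ type: "Text", content: "there" }] },
  ],
});
console.log(html); // <h2>Hi <strong>there</strong></h2>
```

The `never` assignment in the default branch is a common way to make the compiler flag any element type that is added to the union but not yet handled by the interpreter.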

Parser Output

Markdown Syntax

Markdown has many features and supports several ways of writing the same element. To keep the scope of this project manageable, the main features/elements of the language were chosen, and for each element a single syntax was implemented.

The supported elements and their respective syntax are listed below:

  • Text: Collection of characters. Example:

    This is text.

  • Italic Text: Text delimited by one asterisk on each side. Example:

    *This will generate italic text.*

  • Bold Text: Text delimited by two asterisks on each side. Three asterisks on each side generate bold italic text. Examples:

    **This will generate bold text.**

    ***This will generate bold italic text.***

  • Line break: Sequence of two spaces followed by a \n.

  • Literal Special Characters: Some characters have a special meaning in Markdown because they mark specific elements. To use one of these special characters literally, precede it with the escape character '/'. Example:

    This /* will write an asterisk.

  • Paragraph: Sequence of Text elements ending with a line break. Example:

    This is a paragraph. This is another paragraph.

  • Heading: One to six hashes followed by one or more spaces and then Text. Example:

    ### This is a heading.

  • Unordered List: Each unordered list item has the following structure: dash + space + Text + Line break. Example:

    /- This is an unordered list item.

    Note: A forward slash was added at the beginning so that the raw unordered list item syntax stays visible instead of being converted immediately by the Markdown interpreter. In real use, the forward slash MUST NOT be present in that position.

  • Ordered List: Each ordered list item has the following structure: number + dot + space + Text + Line break. Example:

    /1. This is an ordered list item.

    Note: The observation made for the Unordered List item also applies here.

  • Link: The visible text has to be enclosed in square brackets and the URL, enclosed in parentheses, has to immediately follow the text. Example:

    [This is a link](url)

  • Image: Starts with an exclamation point followed by the visible text enclosed in square brackets and the image path, enclosed in parentheses, immediately following the text. Example:

    ![This is an image](path)

  • Blockquote: Starts with a greater than sign followed by at least one space. This element can contain Text, Paragraphs, Headings, Ordered and Unordered Lists. Example:

    > This is a blockquote.
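As an illustration of how two of the rules above translate into code, the sketch below checks a heading line and a link using plain string scanning. It is not the repository's combinator-based implementation, and the names parseHeadingLine, parseLink, Heading and Link are hypothetical.

```typescript
// Hypothetical helpers for illustration only; the project builds these from parser combinators.

type Heading = { type: "Heading"; level: number; text: string };
type Link = { type: "Link"; text: string; url: string };

// Heading rule: one to six hashes, then one or more spaces, then text.
// Returns null when the line does not follow that rule.
function parseHeadingLine(line: string): Heading | null {
  let level = 0;
  while (level < line.length && line[level] === "#") level++;
  if (level < 1 || level > 6) return null;

  let start = level;
  while (start < line.length && line[start] === " ") start++;
  if (start === level) return null; // at least one space is required after the hashes

  const text = line.slice(start);
  if (text.length === 0) return null; // the heading must end with text
  return { type: "Heading", level, text };
}

// Link rule: visible text in square brackets, immediately followed by the URL in parentheses.
// Returns the link and the unconsumed input, or null when the input does not start with a link.
function parseLink(input: string): { link: Link; rest: string } | null {
  if (!input.startsWith("[")) return null;
  const closeBracket = input.indexOf("]");
  if (closeBracket === -1 || input[closeBracket + 1] !== "(") return null;
  const closeParen = input.indexOf(")", closeBracket + 2);
  if (closeParen === -1) return null;
  return {
    link: {
      type: "Link",
      text: input.slice(1, closeBracket),
      url: input.slice(closeBracket + 2, closeParen),
    },
    rest: input.slice(closeParen + 1),
  };
}

console.log(parseHeadingLine("### This is a heading."));
// { type: "Heading", level: 3, text: "This is a heading." }
console.log(parseLink("[This is a link](url)"));
// { link: { type: "Link", text: "This is a link", url: "url" }, rest: "" }
```

This sketch ignores the '/' escape character and nested inline elements, both of which a complete parser for the syntax above would need to handle.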

Run the code

To run this project the following prerequisite is necessary:

  • Deno installed

If your system does not meet this prerequisite, check the official Deno installation documentation to download the right version for your operating system.

Once you have Deno installed, clone this repository:

$ git clone https://github.com/AmandaFI/Markdown-Parser.git
$ cd Markdown-Parser

As an example, the content.txt file contains a version of this README. Running the code as indicated below creates an HTML version in the result.html file. You can change the content of content.txt to produce other documents following the Markdown syntax described in the ... section.

Open a terminal inside the Markdown-Parser folder and run the code:

$ deno run --allow-read --allow-write interpreter.ts

Run the parser tests:

$ deno test parsersTests.test.ts

TODO
Implement code
Create content.txt
Horizontal rule

  • Describe the test file and the types
  • Screenshot of the AST format
