Thanks for your interest in contributing to this project! Below you'll find a general explanation of the project and how to run it locally.
To get more familiar with tree-sitter itself and writing tree-sitter grammars, you may want to read https://tree-sitter.github.io/tree-sitter/creating-parsers.
Most tree-sitter grammars are written using a single `grammar.js` file with a declarative syntax.
But reStructuredText isn't a programming language with a well-defined specification: it has a lot of edge cases, and the same text can have a different meaning depending on the context it appears in or its indentation level.
Tree-sitter is flexible enough to let us write some rules in C (an external scanner), and for the reasons above our grammar makes heavy use of this feature.
Tree-sitter is an LR(k) parser, so we can't backtrack; our external scanner must share some logic while recognizing some nodes. For example, if we find a `*` character, we first check whether it's a list element, then an emphasis node, then a strong node, and so on. Most of the time, when something isn't a recognizable node it is interpreted as plain text.

The external scanner also allows us to keep some state between each parse of a node; this is currently used to keep track of the indentation levels.
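To make that dispatch style concrete, here is a minimal, hypothetical sketch using tree-sitter's external scanner API. It is not the project's actual code: the token names and the checks are made up for the example, and the real logic in `src/tree_sitter_rst/` does far more work.

```c
#include <tree_sitter/parser.h>
#include <stdbool.h>

// Hypothetical token names, for illustration only.
enum TokenType { BULLET, EMPHASIS };

bool tree_sitter_rst_external_scanner_scan(void *payload, TSLexer *lexer,
                                           const bool *valid_symbols) {
  (void)payload;
  if (lexer->lookahead == '*') {
    // Consume the '*' and inspect what follows.
    lexer->advance(lexer, false);

    // A '*' followed by a space could be a list bullet.
    if (valid_symbols[BULLET] && lexer->lookahead == ' ') {
      lexer->result_symbol = BULLET;
      return true;
    }

    // Otherwise try the inline-markup interpretations (emphasis, strong, ...).
    // The real scanner does far more checking than this.
    if (valid_symbols[EMPHASIS] && lexer->lookahead != ' ') {
      lexer->result_symbol = EMPHASIS;
      return true;
    }
  }
  // Returning false means "no external token here"; the text is then
  // handled by the grammar's regular rules (usually as plain text).
  return false;
}
```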
Most of the files in the repository are auto-generated by tree-sitter; they are needed for the grammar to be compiled easily on the user's computer, so they are committed to the repository.
Some of the files that aren't auto-generated are:
`grammar.js`
: it defines all the nodes that our grammar has and their structure.

`src/scanner.c`
: the entry point to our custom scanner. To make it easier to maintain, the code that isn't auto-generated is inside the `src/tree_sitter_rst/` directory.

`src/tree_sitter_rst/scanner.c`
: it contains the functions used to create/serialize/de-serialize our custom scanner, and it also has the main entry point to our custom scanner: `rst_scanner_scan` (AKA, the big collection of `if`s).

`src/tree_sitter_rst/tokens.h`
: it defines all the tokens that our external scanner recognizes; they are the same ones declared in the `externals` attribute of our `grammar.js` file (see the sketch after this list).

`src/tree_sitter_rst/chars.c`
: some utility functions to recognize characters, like numbers, bullets, letters, etc.

`src/tree_sitter_rst/parser.c`
: here are all the functions that match the current text being parsed to a valid `token`.

`test/corpus/`
: tests for our grammar, so we are sure nothing breaks when changing stuff. You can read about the syntax at https://tree-sitter.github.io/tree-sitter/creating-parsers#command-test.

`test/examples/`
: these are the files that docutils uses to run their tests. We parse them without checking the resulting CST; we only care whether our parser errors in the process.

`docs/`
: this directory is deployed to GitHub Pages at https://stsewd.dev/tree-sitter-rst/.
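One detail worth keeping in mind when editing `tokens.h` and `grammar.js` together: tree-sitter identifies external tokens by their position, so the enum in `tokens.h` has to stay in the same order as the `externals` array in `grammar.js`. Here is a hypothetical sketch of the idea (the names below are made up; see the real file for the actual list):

```c
// Hypothetical sketch of the idea behind src/tree_sitter_rst/tokens.h.
// Tree-sitter identifies external tokens by index, so this enum must be
// kept in the same order as the `externals` array in grammar.js.
typedef enum {
  T_NEWLINE,    // externals[0] in grammar.js
  T_INDENT,     // externals[1]
  T_DEDENT,     // externals[2]
  T_BULLET,     // ...
  T_EMPHASIS,
} TokenType;
```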
Requirements:
- Node
- A C compiler (clang is preferred)
- Docker (only if you want to see your changes in the browser)
Install the project's dependencies with:

```
npm install
```
To build the grammar:
```
npm run build
```
To run the tests:
```
npm run test
```
Note: if you changed the grammar, you need to re-build it for tests to use the new grammar.
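For reference, the files in `test/corpus/` use tree-sitter's corpus test format: a test name framed by lines of `=`, the input text, a `---` separator, and the expected syntax tree as an S-expression. A minimal, made-up example (the node names are only illustrative; check the existing files for the real ones):

```
==================
A simple paragraph
==================

Hello world.

---

(document
  (paragraph))
```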
Test the grammar by parsing a file:
```
npm run parse -- test.rst
```
Test the grammar in your browser:

```
npm run web
```
Note: if you changed the grammar, you need to rebuild it and run `npm run wasm` (requires Docker).
Sometimes you may find it useful to compare against the output of docutils for a given RST document, since the reStructuredText specification doesn't cover or explain all edge cases.
```
pip install docutils
rst2html5.py test.rst out.html
xdg-open out.html
```