Note
This library does not comply with any specification such as CommonMark or GitHub Flavored Markdown. If you're looking for that, please use goldmark instead.
A simple, zero-dependency library that tokenizes raw markdown and offers serialization to JSON. Its approach is very much inspired by Rob Pike's talk Lexical Scanning in Go. Expect bugs here and there, as well as some very messy code; this is a quick project I did over a few weekends!
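For context, the core idea from that talk is a lexer whose states are functions: each state consumes some input, emits tokens, and returns the next state, so the control flow lives in the type system rather than a big switch. A minimal sketch of that pattern (this illustrates the general technique, not this library's actual internals; all names here are hypothetical):

```go
package main

import "fmt"

// stateFn is a state that does some lexing work and returns the
// next state; returning nil ends the run loop.
type stateFn func(*lex) stateFn

type lex struct {
	input string
	pos   int
}

func lexText(l *lex) stateFn {
	for l.pos < len(l.input) {
		if l.input[l.pos] == '#' {
			return lexHeading // hand off on a markdown marker
		}
		l.pos++
	}
	return nil
}

func lexHeading(l *lex) stateFn {
	start := l.pos
	for l.pos < len(l.input) && l.input[l.pos] == '#' {
		l.pos++ // consume the run of '#'
	}
	fmt.Printf("heading marker, level %d\n", l.pos-start)
	return lexText // back to scanning plain text
}

func main() {
	l := &lex{input: "## Hello"}
	for state := stateFn(lexText); state != nil; {
		state = state(l)
	}
}
```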
Demo video: gomd.mov. The demo involves naive file watching, so it appears much slower than the library is in practice.
go-markdown can process large markdown files (2000+ lines) in under 3.5ms.
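If you want to measure this on your own files, a hypothetical `go test` benchmark using only the calls shown in the usage example below (`large.md` is a placeholder for any 2000+ line fixture):

```go
package md_test

import (
	"os"
	"testing"

	"github.com/jonlinkens/go-markdown/lexer"
	"github.com/jonlinkens/go-markdown/parser"
)

func BenchmarkPipeline(b *testing.B) {
	raw, err := os.ReadFile("large.md") // placeholder fixture
	if err != nil {
		b.Fatal(err)
	}
	input := string(raw)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Lex concurrently, collect tokens, then parse to an AST.
		l := lexer.NewLexer(input)
		go l.Run()
		var tokens []lexer.Token
		for tok := range l.GetTokens() {
			tokens = append(tokens, tok)
		}
		parser.NewParser(tokens).Parse()
	}
}
```

Run it with `go test -bench=. -benchmem` to get per-iteration timings.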
The library is split into several packages:
- `lexer`: handles tokenization of raw markdown text
- `parser`: converts the flat token stream into a hierarchical AST
- `serializer`: provides JSON serialization for both tokens and AST nodes
- `common`: shared functionality like type mapping between packages
Each package is designed to be used independently, so you can use just the lexer if you only need tokens, or the full pipeline for AST generation.
For example, running the full pipeline:

```go
package main

import (
	"fmt"
	"log"

	"github.com/jonlinkens/go-markdown/lexer"
	"github.com/jonlinkens/go-markdown/parser"
	"github.com/jonlinkens/go-markdown/serializer"
)

func main() {
	input := "# Hello\nThis is **bold** text"

	// The lexer runs concurrently and emits tokens over a channel.
	l := lexer.NewLexer(input)
	go l.Run()

	// Drain the channel into a slice for the parser.
	var tokens []lexer.Token
	for token := range l.GetTokens() {
		tokens = append(tokens, token)
	}

	// Build the AST from the flat token stream.
	p := parser.NewParser(tokens)
	doc := p.Parse()

	// Serialize the AST to JSON.
	json, err := serializer.ToJSON(doc)
	if err != nil {
		log.Fatalf("Error serializing AST: %v", err)
	}
	fmt.Println(json)
}
```
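Since the stages are decoupled, you can also stop after lexing if you only need the token stream. A minimal sketch reusing only the calls from the example above (same imports assumed):

```go
l := lexer.NewLexer("# Tokens only")
go l.Run()
for token := range l.GetTokens() {
	fmt.Printf("%+v\n", token) // inspect each token as it arrives
}
```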
All markdown elements are represented as both tokens and AST nodes. Supported elements can be found in:
- Tokens: `lexer/token.go`
- AST nodes: `parser/node.go`