This project was mostly AI-generated and is experimental.
An experimental tree-sitter grammar for Dataform SQLX files.
- Parses Dataform block structure: config, js, pre_operations, post_operations
- Handles ${ } interpolations within SQL
- Syntax highlighting for BigQuery SQL keywords, types, and functions
- Optional JavaScript injection for js blocks (requires tree-sitter-javascript)
Clone the repository and build:
git clone https://github.com/renzepost/tree-sitter-dataform.git
cd tree-sitter-dataform
npm install
npx tree-sitter generate
tree-sitter highlight example.sqlx
For full JavaScript syntax highlighting inside js { } blocks, install tree-sitter-javascript in one of your parser directories:
cd ~/tree-sitter-config # or another directory in your tree-sitter config
git clone https://github.com/tree-sitter/tree-sitter-javascript.git
Without tree-sitter-javascript, JS blocks will still parse correctly but won't have syntax highlighting.
config {
type: "table",
name: "my_table"
}
js {
const columns = ["a", "b", "c"];
}
pre_operations {
DECLARE x INT64 DEFAULT 1;
}
SELECT * FROM ${ref("source")}
post_operations {
GRANT SELECT ON TABLE ${self()} TO "group:analysts@example.com"
}
All Dataform interpolations are supported:
- ${ref("dataset", "table")}
- ${resolve("dataset", "table")}
- ${self()}
- ${name()}
- ${when(incremental(), "...", "...")}
- ${dataform.projectConfig.vars.myVar}
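To make the shape of these interpolations concrete, here is a minimal JavaScript sketch of how a scanner might locate ${ } spans in SQL text. It is not the grammar's actual implementation: the function name findInterpolations is hypothetical, and the sketch assumes that counting braces is enough, so a brace inside a JavaScript string literal would confuse it.

```javascript
// Hypothetical helper for illustration; not part of tree-sitter-dataform.
// Returns the ${ ... } spans found in `sql`, counting braces so that a
// nested { } inside the expression does not end the span early.
function findInterpolations(sql) {
  const spans = [];
  let i = 0;
  while (i < sql.length) {
    if (sql[i] === "$" && sql[i + 1] === "{") {
      const start = i;
      let depth = 1;
      let j = i + 2;
      // Walk forward until the opening brace is balanced.
      while (j < sql.length && depth > 0) {
        if (sql[j] === "{") depth++;
        else if (sql[j] === "}") depth--;
        j++;
      }
      // Unterminated interpolation: stop scanning and keep earlier spans.
      if (depth !== 0) break;
      spans.push({
        start,                                   // offset of "$"
        end: j,                                  // offset just past the closing "}"
        expr: sql.slice(start + 2, j - 1).trim() // text between "${" and "}"
      });
      i = j;
    } else {
      i++;
    }
  }
  return spans;
}

// Usage: one span whose expr is ref("source").
const example = findInterpolations('SELECT * FROM ${ref("source")}');
```

Brace counting keeps the sketch short, but a real scanner would also need to skip string literals and comments inside the expression.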
- SQL is highlighted, not parsed: SQL keywords, types, and functions are highlighted using regex patterns rather than a full SQL grammar. This means syntax errors in SQL won't be detected by the parser.
- Fix unmatched braces in js block breaking parsing
- Inject tree-sitter-sql-bigquery for SQL statements instead of regex-based keyword highlighting
  - Would provide full SQL parsing and syntax highlighting
  - Challenge: handling ${...} interpolations gracefully
  - Alternative: incorporate the full SQL grammar with interpolation support
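The "SQL is highlighted, not parsed" limitation can be illustrated with a small JavaScript sketch of pattern-based keyword tagging. This is only an approximation of the idea, not the grammar's own rules: the keyword list and the highlightKeywords function are hypothetical. It shows that malformed SQL is tagged just as readily as valid SQL, because nothing checks how the keywords fit together.

```javascript
// Hypothetical keyword pattern for illustration; the real grammar's list differs.
const KEYWORDS = /\b(SELECT|FROM|WHERE|GROUP|BY|ORDER|DECLARE|DEFAULT|GRANT|ON|TABLE|TO)\b/gi;

// Tags every keyword match; it never inspects statement structure.
function highlightKeywords(sql) {
  const tokens = [];
  for (const m of sql.matchAll(KEYWORDS)) {
    tokens.push({ text: m[0], index: m.index, kind: "keyword" });
  }
  return tokens;
}

// A valid statement and a malformed one are both tagged without complaint.
const valid = highlightKeywords("SELECT a FROM t");
const broken = highlightKeywords("SELECT FROM FROM WHERE");
```

Both calls return keyword tokens and neither reports a problem, which is why a full SQL grammar would be needed to detect syntax errors.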
MIT