tree-sitter-dataform

This project was mostly AI-generated and is experimental.

An experimental tree-sitter grammar for Dataform SQLX files.

Features

  • Parses Dataform block structure: config, js, pre_operations, post_operations
  • Handles ${ } interpolations within SQL
  • Syntax highlighting for BigQuery SQL keywords, types, and functions
  • Optional JavaScript injection for js blocks (requires tree-sitter-javascript)

Installation

Clone the repository and build:

git clone https://github.com/renzepost/tree-sitter-dataform.git
cd tree-sitter-dataform
npm install
npx tree-sitter generate

Usage

Syntax Highlighting

tree-sitter highlight example.sqlx

JavaScript Block Highlighting

For full JavaScript syntax highlighting inside js { } blocks, install tree-sitter-javascript in one of your parser directories:

cd ~/tree-sitter-config  # or another directory listed under parser-directories in your tree-sitter config
git clone https://github.com/tree-sitter/tree-sitter-javascript.git

Without tree-sitter-javascript, js blocks are still parsed, but their contents are not syntax highlighted.

Supported Syntax

Dataform Blocks

config {
  type: "table",
  name: "my_table"
}

js {
  const columns = ["a", "b", "c"];
}

pre_operations {
  DECLARE x INT64 DEFAULT 1;
}

SELECT * FROM ${ref("source")}

post_operations {
  GRANT SELECT ON TABLE ${self()} TO "group:analysts@example.com"
}

Interpolations

All Dataform interpolations are supported:

  • ${ref("dataset", "table")}
  • ${resolve("dataset", "table")}
  • ${self()}
  • ${name()}
  • ${when(incremental(), "...", "...")}
  • ${dataform.projectConfig.vars.myVar}
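To illustrate what recognizing an interpolation involves, here is a minimal sketch that scans a SQLX string for ${ } spans by tracking brace depth and skipping braces inside quoted strings. It is not taken from this grammar's source: findInterpolations is a hypothetical helper name, and the grammar itself may tokenize interpolations differently.

```javascript
// Hypothetical helper, not part of this grammar's source.
// Returns the trimmed text inside each ${ ... } interpolation.
function findInterpolations(source) {
  const results = [];
  let i = 0;
  while (i < source.length) {
    if (source[i] === "$" && source[i + 1] === "{") {
      const start = i + 2;
      let depth = 1;    // number of currently open braces
      let quote = null; // active string delimiter, if any
      let j = start;
      while (j < source.length && depth > 0) {
        const ch = source[j];
        if (quote) {
          if (ch === "\\") j++;                // skip the escaped character
          else if (ch === quote) quote = null; // end of string
        } else if (ch === '"' || ch === "'" || ch === "`") {
          quote = ch;
        } else if (ch === "{") {
          depth++;
        } else if (ch === "}") {
          depth--;
        }
        j++;
      }
      if (depth !== 0) break; // unterminated interpolation: stop scanning
      results.push(source.slice(start, j - 1).trim());
      i = j;
      continue;
    }
    i++;
  }
  return results;
}

console.log(findInterpolations('SELECT * FROM ${ref("source")}'));
```

The sketch treats backtick strings like ordinary strings, so a template literal that itself contains a nested backtick string would be mishandled.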

Known Limitations

  • SQL is highlighted, not parsed: SQL keywords, types, and functions are matched with regex patterns rather than a full SQL grammar, so the parser does not detect SQL syntax errors.
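The following sketch shows why pattern-based highlighting cannot catch syntax errors. The keyword list and the findKeywords helper are illustrative assumptions, not the patterns this grammar actually uses.

```javascript
// Illustrative keyword list; the grammar's real patterns may differ.
const KEYWORDS = /\b(SELECT|FROM|WHERE|GROUP|BY|ORDER|DECLARE|DEFAULT|GRANT|ON|TABLE|TO)\b/gi;

// Hypothetical helper: lists each keyword match and its offset.
function findKeywords(sql) {
  return Array.from(sql.matchAll(KEYWORDS), (m) => ({
    keyword: m[0].toUpperCase(),
    index: m.index,
  }));
}

// A valid statement and a scrambled one are both "highlighted" without complaint,
// because each keyword is matched on its own, with no notion of statement structure.
const valid = findKeywords("SELECT a FROM t");
const invalid = findKeywords("FROM SELECT a t"); // not valid SQL

console.log(valid.map((k) => k.keyword));
console.log(invalid.map((k) => k.keyword));
```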

To-do

  • Fix unmatched braces in a js block breaking parsing
  • Inject tree-sitter-sql-bigquery for SQL statements instead of regex-based keyword highlighting
    • Would provide full SQL parsing and syntax highlighting
    • Challenge: handling ${...} interpolations gracefully
    • Alternative: incorporate the full SQL grammar with interpolation support
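One possible way to approach the interpolation challenge, shown purely as a sketch and not as this project's plan, is to mask each ${...} span with placeholder characters of the same length before handing the text to a SQL parser, so that line and column positions stay aligned with the original file. The maskInterpolations name is hypothetical, and the simple regex assumes interpolations contain no nested or quoted closing braces.

```javascript
// Hypothetical helper: replaces every character of each ${...} span with "_",
// keeping newlines so that row and column positions are preserved.
// Assumption: interpolations contain no "}" before their closing brace.
function maskInterpolations(sqlx) {
  return sqlx.replace(/\$\{[^}]*\}/g, (match) => match.replace(/[^\n]/g, "_"));
}

const original = 'SELECT * FROM ${ref("source")} WHERE id > 0';
const masked = maskInterpolations(original);

console.log(masked);
console.log(masked.length === original.length);
```

A run of underscores is used because it can stand in for an identifier in most SQL positions, but an interpolation that expands to a whole clause would still need separate handling.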

License

MIT
