Skip to content

feat(vet): Add output from EXPLAIN ... for queries to the CEL program environment #2489

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jul 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
95 changes: 88 additions & 7 deletions docs/howto/vet.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,8 @@ If an expression evaluates to `true`, `sqlc vet` will report an error using the

## Defining lint rules

Each lint rule's CEL expression has access to variables from your sqlc configuration and queries,
defined in the following struct.
Each lint rule's CEL expression has access to information from your sqlc configuration and queries
via variables defined in the following proto messages.

```proto
message Config
Expand Down Expand Up @@ -41,12 +41,17 @@ message Parameter
}
```

This struct will likely expand in the future to include more query information.
We may also add information returned from a running database, such as the result from
`EXPLAIN ...`.
In addition to this basic information, when you have a PostgreSQL or MySQL
[database connection configured](../reference/config.html#database)
each CEL expression has access to the output from running `EXPLAIN ...` on your query
via the `postgresql.explain` and `mysql.explain` variables.
This output is quite complex and depends on the structure of your query but sqlc attempts
to parse and provide as much information as it can. See
[Rules using `EXPLAIN ...` output](#rules-using-explain-output) for more information.

While these examples are simplistic, they give you a flavor of the types of
rules you can write.
Here are a few example rules just using the basic configuration and query information available
to the CEL expression environment. While these examples are simplistic, they give you a flavor
of the types of rules you can write.

```yaml
version: 2
Expand Down Expand Up @@ -82,6 +87,82 @@ rules:
query.cmd == "exec"
```

### Rules using `EXPLAIN ...` output

The CEL expression environment has two variables containing `EXPLAIN ...` output,
`postgresql.explain` and `mysql.explain`. `sqlc` only populates the variable associated with
your configured database engine, and only when you have a
[database connection configured](../reference/config.html#database).

For the `postgresql` engine, `sqlc` runs

```sql
EXPLAIN (ANALYZE false, VERBOSE, COSTS, SETTINGS, BUFFERS, FORMAT JSON) ...
```

where `"..."` is your query string, and parses the output into a `PostgreSQLExplain` proto message.

For the `mysql` engine, `sqlc` runs

```sql
EXPLAIN FORMAT=JSON ...
```

where `"..."` is your query string, and parses the output into a `MySQLExplain` proto message.

These proto message definitions are too long to include here, but you can find them in the `protos`
directory within the `sqlc` source tree.

The output from `EXPLAIN ...` depends on the structure of your query so it's a bit difficult
to offer generic examples. Refer to the
[PostgreSQL documentation](https://www.postgresql.org/docs/current/using-explain.html) and
[MySQL documentation](https://dev.mysql.com/doc/refman/en/explain-output.html) for more
information.

```yaml
...
rules:
- name: postgresql-query-too-costly
message: "Query cost estimate is too high"
rule: "postgresql.explain.plan.total_cost > 1.0"
- name: postgresql-no-seq-scan
message: "Query plan results in a sequential scan"
rule: "postgresql.explain.plan.node_type == 'Seq Scan'"
- name: mysql-query-too-costly
message: "Query cost estimate is too high"
rule: "has(mysql.explain.query_block.cost_info) && double(mysql.explain.query_block.cost_info.query_cost) > 2.0"
- name: mysql-must-use-primary-key
message: "Query plan doesn't use primary key"
rule: "has(mysql.explain.query_block.table.key) && mysql.explain.query_block.table.key != 'PRIMARY'"
```

When building rules that depend on `EXPLAIN ...` output, it may be helpful to see the actual JSON
returned from the database. `sqlc` will print it When you set the environment variable
`SQLCDEBUG=dumpexplain=1`. Use this environment variable together with a dummy rule to see
`EXPLAIN ...` output for all of your queries.

```yaml
version: 2
sql:
- schema: "query.sql"
queries: "query.sql"
engine: "postgresql"
gen:
go:
package: "db"
out: "db"
rules:
- debug
rules:
- name: debug
message: "Debug"
rule: has(postgresql.explain)
```

Please note that `sqlc` does not manage or migrate your database. Use your
migration tool of choice to create the necessary database tables and objects
before running `sqlc vet` with rules that depend on `EXPLAIN ...` output.

## Built-in rules

### sqlc/db-prepare
Expand Down
7 changes: 7 additions & 0 deletions docs/reference/environment-variables.md
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,13 @@ return an error.

`SQLCDEBUG=processplugins=0`

### dumpexplain

The `dumpexplain` command prints the JSON-formatted result from running
`EXPLAIN ...` on a query when a `sqlc vet` rule evaluation requires its output.

`SQLCDEBUG=dumpexplain=1`

## SQLCTMPDIR

If specified, use the given directory as the base for temporary folders. Only
Expand Down
11 changes: 10 additions & 1 deletion examples/authors/sqlc.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ sql:
uri: postgresql://${PG_USER}:${PG_PASSWORD}@${PG_HOST}:${PG_PORT}/authors
rules:
- sqlc/db-prepare
- postgresql-query-too-costly
gen:
go:
package: authors
Expand All @@ -18,6 +19,7 @@ sql:
uri: root:${MYSQL_ROOT_PASSWORD}@tcp(${MYSQL_HOST}:${MYSQL_PORT})/authors?multiStatements=true&parseTime=true
rules:
- sqlc/db-prepare
# - mysql-query-too-costly
gen:
go:
package: authors
Expand All @@ -32,4 +34,11 @@ sql:
gen:
go:
package: authors
out: sqlite
out: sqlite
rules:
- name: postgresql-query-too-costly
message: "Too costly"
rule: "postgresql.explain.plan.total_cost > 300.0"
- name: mysql-query-too-costly
message: "Too costly"
rule: "has(mysql.explain.query_block.cost_info) && double(mysql.explain.query_block.cost_info.query_cost) > 2.0"
Loading