BE-306: HashQL: PostgreSQL translation #8526
indietyp wants to merge 7 commits into bm/be-457-hashql-mir-execution-pipeline-extensions-for-postgres
feat: checkpoint (II) feat: checkpoint (III) feat: snapshot vec feat: add dedicated filter feat: checkpoint feat: filter implementation feat: filter implementation (mostly) done chore: environment capture note chore: always postgres bigint feat: target clone feat: simplify lookup feat: move storage up feat: eval entity path chore: checkpoint chore: checkpoint chore: find entrypoint feat: eval context feat: eval cleanup chore: cleanup feat: track index feat: wire up filter feat: add error reporting chore: checkpoint feat: add traverse, and first postgres compiler outline feat: traverse bitmap feat: move traversal out feat: projections feat: projections fix: clippy feat: subquery projection for lateral feat: checkpoint feat: test plan feat: checkpoint feat: checkpoint – failing tests ;-; feat: checkpoint – failing tests ;-; feat: checkpoint — passing tests fix: import fix: entity type feat: checkpoint feat: attribute a cost to terminator placement switches fix: import feat: checkpoint feat: checkpoint chore: lint
PR Summary
High Risk
Overview: Adds new evaluation infrastructure ( Also extends the Postgres query AST with a
Written by Cursor Bugbot for commit 1024ba5.
🤖 Augment PR Summary
Summary: Adds a PostgreSQL compilation backend for HashQL by lowering MIR execution islands into SQL
Key changes:
Technical notes: Compiled filter islands are materialized with
Codecov Report
❌ Patch coverage is
@@ Coverage Diff @@
## bm/be-457-hashql-mir-execution-pipeline-extensions-for-postgres #8526 +/- ##
===================================================================================================
- Coverage 72.02% 63.94% -8.09%
===================================================================================================
Files 785 1195 +410
Lines 71140 130697 +59557
Branches 3868 5005 +1137
===================================================================================================
+ Hits 51242 83570 +32328
- Misses 19392 46260 +26868
- Partials 506 867 +361
@@ -0,0 +1,7 @@
---
source: libs/@local/hashql/eval/src/postgres/filter/tests.rs
expression: report.to_string()
Could we please change this expression to the actual query? Then it's far easier to review the tests.
Benchmark results
|
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2002 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1001 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 3314 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 1526 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 2078 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 1033 | Flame Graph |
policy_resolution_medium
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 102 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 51 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 269 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 107 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 133 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 63 | Flame Graph |
policy_resolution_none
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 8 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 3 | Flame Graph |
policy_resolution_small
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 52 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 25 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 94 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 26 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 66 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 29 | Flame Graph |
read_scaling_complete
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id;one_depth | 1 entities | Flame Graph | |
| entity_by_id;one_depth | 10 entities | Flame Graph | |
| entity_by_id;one_depth | 25 entities | Flame Graph | |
| entity_by_id;one_depth | 5 entities | Flame Graph | |
| entity_by_id;one_depth | 50 entities | Flame Graph | |
| entity_by_id;two_depth | 1 entities | Flame Graph | |
| entity_by_id;two_depth | 10 entities | Flame Graph | |
| entity_by_id;two_depth | 25 entities | Flame Graph | |
| entity_by_id;two_depth | 5 entities | Flame Graph | |
| entity_by_id;two_depth | 50 entities | Flame Graph | |
| entity_by_id;zero_depth | 1 entities | Flame Graph | |
| entity_by_id;zero_depth | 10 entities | Flame Graph | |
| entity_by_id;zero_depth | 25 entities | Flame Graph | |
| entity_by_id;zero_depth | 5 entities | Flame Graph | |
| entity_by_id;zero_depth | 50 entities | Flame Graph |
read_scaling_linkless
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | 1 entities | Flame Graph | |
| entity_by_id | 10 entities | Flame Graph | |
| entity_by_id | 100 entities | Flame Graph | |
| entity_by_id | 1000 entities | Flame Graph | |
| entity_by_id | 10000 entities | Flame Graph |
representative_read_entity
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1 | Flame Graph | |
representative_read_entity_type
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| get_entity_type_by_id | Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba | Flame Graph | |
representative_read_multiple_entities
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_property | traversal_paths=0 | 0 | |
| entity_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=0 | 0 | |
| link_by_source_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true |
scenarios
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| full_test | query-limited | Flame Graph | |
| full_test | query-unlimited | Flame Graph | |
| linked_queries | query-limited | Flame Graph | |
| linked_queries | query-unlimited | Flame Graph |
Cursor Bugbot has reviewed your changes and found 4 potential issues.
Bugbot Autofix prepared fixes for all 4 issues found in the latest run.
- ✅ Fixed: Null filters are treated as matches
- Return continuations now normalize filter expressions to reject SQL NULL so only true boolean predicate results are treated as matches.
- ✅ Fixed: Switch discriminant forced to 32-bit int
- SwitchInt comparison now widens to numeric when case values exceed 32-bit range, preventing truncation/overflow miscomparisons for wide switch targets.
- ✅ Fixed: Island exits can panic on captured environment
- Island-exit live-out serialization now skips Local::ENV so captured environment access no longer tries to read a non-materialized local slot.
- ✅ Fixed: Non-entry Postgres islands miss parameter binding
- compile_body now seeds non-start island entry parameters from incoming external edge targets before compiling the entry block.
Or push these changes by commenting:
@cursor push 98e8893a92
Preview (98e8893a92)
diff --git a/libs/@local/hashql/eval/src/postgres/filter/mod.rs b/libs/@local/hashql/eval/src/postgres/filter/mod.rs
--- a/libs/@local/hashql/eval/src/postgres/filter/mod.rs
+++ b/libs/@local/hashql/eval/src/postgres/filter/mod.rs
@@ -95,8 +95,24 @@
// (filter, block, locals, values)
let row = match continuation {
Continuation::Return { filter } => {
+ // Normalize SQL three-valued logic to two-valued filter semantics:
+ // `NULL` from predicate evaluation should behave like `FALSE`, not
+ // like continuation passthrough.
+ let filter = filter.grouped().cast(PostgresType::Boolean);
+ let filter_is_not_false = Self::Unary(UnaryExpression {
+ op: UnaryOperator::IsNotFalse,
+ expr: Box::new(filter.clone()),
+ });
+ let filter_is_not_null = Self::Unary(UnaryExpression {
+ op: UnaryOperator::Not,
+ expr: Box::new(Self::Unary(UnaryExpression {
+ op: UnaryOperator::IsNull,
+ expr: Box::new(filter),
+ })),
+ });
+
vec![
- filter.grouped().cast(PostgresType::Boolean),
+ Self::all(vec![filter_is_not_false, filter_is_not_null]),
null.clone(),
null.clone(),
null,
@@ -177,11 +193,18 @@
debug_assert_eq!(branch_results.len(), targets.values().len());
- // SwitchInt compares the discriminant against integer values. If the
- // discriminant is a boolean expression (e.g. `IS NOT NULL`), PostgreSQL
- // rejects `boolean = integer`. Casting to `::int` is safe for all types
- // and a no-op when the discriminant is already integral.
- let discriminant = Box::new(discriminant.grouped().cast(PostgresType::Int));
+ // Preserve the existing boolean-switch behavior (`::int`) for 0/1 cases,
+ // but avoid 32-bit narrowing for wider SwitchInt values.
+ let cast = if targets
+ .values()
+ .iter()
+ .all(|&value| i32::try_from(value).is_ok())
+ {
+ PostgresType::Int
+ } else {
+ PostgresType::Numeric
+ };
+ let discriminant = Box::new(discriminant.grouped().cast(cast.clone()));
let mut discriminant = Some(discriminant);
let mut conditions = Vec::with_capacity(targets.values().len());
@@ -197,7 +220,7 @@
let when = Expression::Binary(BinaryExpression {
op: BinaryOperator::Equal,
left: discriminant,
- right: Box::new(Expression::Constant(query::Constant::U128(value))),
+ right: Box::new(Expression::Constant(query::Constant::U128(value)).cast(cast.clone())),
});
conditions.push((when, then));
@@ -605,6 +628,48 @@
unreachable!("The postgres island always has an entry block (BasicBlockId::START)")
}
+ fn find_external_entry_target(
+ &self,
+ island: &IslandNode,
+ entry_block: BasicBlockId,
+ ) -> Option<(BasicBlockId, Target<'heap>)> {
+ let mut incoming = None;
+
+ for predecessor in self.body.basic_blocks.predecessors(entry_block) {
+ if island.contains(predecessor) {
+ continue;
+ }
+
+ let terminator = &self.body.basic_blocks[predecessor].terminator.kind;
+
+ if let TerminatorKind::GraphRead(read) = terminator
+ && read.target == entry_block
+ {
+ let target = Target::block(entry_block);
+
+ if let Some((_, existing)) = incoming {
+ debug_assert_eq!(existing, target);
+ } else {
+ incoming = Some((predecessor, target));
+ }
+ }
+
+ for &target in terminator.successor_targets() {
+ if target.block != entry_block {
+ continue;
+ }
+
+ if let Some((_, existing)) = incoming {
+ debug_assert_eq!(existing, target);
+ } else {
+ incoming = Some((predecessor, target));
+ }
+ }
+ }
+
+ incoming
+ }
+
fn compile_island_exit(
&mut self,
db: &mut DatabaseContext<'heap, A>,
@@ -631,6 +696,12 @@
}
for local in live_out {
+ // The environment local is immutable and available independently from
+ // the local expression map, so it does not need to be serialized here.
+ if local == Local::ENV {
+ continue;
+ }
+
let value = self
.locals
.lookup(local)
@@ -739,8 +810,17 @@
{
debug_assert_eq!(island.target(), TargetId::Postgres);
+ let entry_block = self.find_entry_block(island);
+
+ // Non-start islands may receive block arguments from predecessors outside
+ // the island. Seed entry parameters once before starting compilation.
+ if let Some((from, target)) = self.find_external_entry_target(island, entry_block) {
+ let span = self.body.basic_blocks[from].terminator.span;
+ self.assign_params(db, span, &target);
+ }
+
let mut stack = Vec::new_in(self.scratch.clone());
- stack.push(Frame::Compile(self.find_entry_block(island)));
+ stack.push(Frame::Compile(entry_block));
let mut results = Vec::new_in(self.scratch.clone());
Expression::Unary(query::UnaryExpression {
    op: query::UnaryOperator::IsNotFalse,
    expr: Box::new(field_access(alias, ContinuationColumn::Filter)),
})
Null filters are treated as matches
High Severity
filter_condition uses IS NOT FALSE, so any NULL filter value passes the row. Continuation::Return writes the compiled predicate directly into filter, and SQL predicates like comparisons on missing JSON paths can evaluate to NULL. That makes unknown predicate results behave like success, changing filter semantics and returning extra rows.
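The difference between the two predicates can be sketched in plain Rust by modelling a SQL boolean as `Option<bool>`, with `None` standing in for `NULL`. This is only an illustration of three-valued logic, not code from the PR; the function names are hypothetical. `normalized` mirrors the shape of the autofix preview (`IS NOT FALSE` combined with `NOT (... IS NULL)`), which is logically the same as `IS TRUE`.

```rust
/// SQL boolean under three-valued logic; `None` models SQL NULL (unknown).
type SqlBool = Option<bool>;

/// `x IS NOT FALSE`: holds for TRUE and for NULL.
fn is_not_false(x: SqlBool) -> bool {
    x != Some(false)
}

/// `x IS NULL`.
fn is_null(x: SqlBool) -> bool {
    x.is_none()
}

/// `x IS NOT FALSE AND NOT (x IS NULL)`: holds only for TRUE.
/// (Hypothetical helper mirroring the normalization in the autofix preview.)
fn normalized(x: SqlBool) -> bool {
    is_not_false(x) && !is_null(x)
}
```

With this model, a predicate that evaluates to `NULL` (for example a comparison on a missing JSON path) passes `is_not_false` but is rejected by `normalized`, while `TRUE` and `FALSE` behave identically under both.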
// discriminant is a boolean expression (e.g. `IS NOT NULL`), PostgreSQL
// rejects `boolean = integer`. Casting to `::int` is safe for all types
// and a no-op when the discriminant is already integral.
let discriminant = Box::new(discriminant.grouped().cast(PostgresType::Int));
Switch discriminant forced to 32-bit int
Medium Severity
finish_switch_int always casts the discriminant to PostgresType::Int before comparison. SwitchTargets stores branch values as u128, so non-32-bit cases (large unsigned values or signed values encoded outside 32-bit range) can be miscompared or fail at runtime, selecting the wrong branch.
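A minimal sketch of the width check used in the autofix preview: keep the narrow integer cast when every case value fits in 32 bits, and widen otherwise. `CastType` is a hypothetical stand-in for the PR's `PostgresType`, and `discriminant_cast` is an illustrative name rather than a function from the codebase.

```rust
/// Hypothetical stand-in for the two `PostgresType` variants involved.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CastType {
    Int,
    Numeric,
}

/// Pick the cast for a switch discriminant from its case values.
/// `i32::try_from` fails for any `u128` above `i32::MAX`, so a single
/// wide case is enough to select the wider type.
fn discriminant_cast(case_values: &[u128]) -> CastType {
    if case_values.iter().all(|&value| i32::try_from(value).is_ok()) {
        CastType::Int
    } else {
        CastType::Numeric
    }
}
```

Boolean-style switches with cases `0` and `1` keep the `::int` cast, so the existing behavior for `IS NOT NULL`-style discriminants is unchanged under this scheme.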
.locals
.lookup(local)
.unwrap_or_else(|| unreachable!("use before def"))
.clone();
Island exits can panic on captured environment
High Severity
compile_island_exit serializes every live_out local by reading self.locals. TraversalLivenessAnalysis marks Local::ENV as live, but Local::ENV is never stored in self.locals and is handled specially in compile_place_env. When an island exits to interpreter code that still uses captures, this hits unreachable!("use before def").
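The failure mode and the proposed guard can be sketched with a simplified model. Everything here is an assumption made for illustration: `Local` is a plain newtype, `Local::ENV` is given an arbitrary index, the locals map is a `HashMap` of SQL expression strings, and the output is a pair of parallel vectors loosely modelled on the parallel arrays described for island exits.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a MIR local.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Local(u32);

impl Local {
    /// Hypothetical index for the captured environment; the real value may differ.
    const ENV: Local = Local(0);
}

/// Serialize live-out locals into parallel id/value vectors.
/// The environment local is skipped because it is never stored in `locals`.
fn serialize_live_out(
    live_out: &[Local],
    locals: &HashMap<Local, String>,
) -> (Vec<u32>, Vec<String>) {
    let mut ids = Vec::new();
    let mut values = Vec::new();

    for &local in live_out {
        if local == Local::ENV {
            continue;
        }

        let value = locals
            .get(&local)
            .unwrap_or_else(|| unreachable!("use before def"))
            .clone();

        ids.push(local.0);
        values.push(value);
    }

    (ids, values)
}
```

Without the `Local::ENV` check, a live-out set containing the environment local would reach the `unreachable!` branch, which is the panic described above.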
let mut stack = Vec::new_in(self.scratch.clone());
stack.push(Frame::Compile(self.find_entry_block(island)));
Non-entry Postgres islands miss parameter binding
Medium Severity
compile_body starts each island by pushing Frame::Compile(entry_block) without binding incoming block arguments. For Postgres islands whose entry has predecessors outside the island, entry block params are required but never initialized, so first use can hit use before def or compile incorrect SQL from missing locals.
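One way to picture the missing step is a small model of a control flow graph in which an island's entry block takes parameters and a predecessor outside the island supplies the arguments. The types below (`Block`, `Edge`, string-valued arguments) and the function name are hypothetical simplifications, not the PR's `Target` or `assign_params` API; the sketch only shows the idea of seeding entry parameters from an external incoming edge before compiling the entry block.

```rust
use std::collections::{HashMap, HashSet};

type BlockId = usize;

/// A jump edge: the target block and the expressions passed as block arguments.
struct Edge {
    to: BlockId,
    args: Vec<String>,
}

/// A basic block with its parameter locals and outgoing edges.
struct Block {
    params: Vec<u32>,
    successors: Vec<Edge>,
}

/// Bind the entry block's parameters from edges that originate outside the island.
/// The first external edge wins; later ones are assumed to agree with it.
fn seed_entry_params(
    blocks: &[Block],
    island: &HashSet<BlockId>,
    entry: BlockId,
) -> HashMap<u32, String> {
    let mut locals = HashMap::new();

    for (id, block) in blocks.iter().enumerate() {
        // Edges from inside the island are handled during normal compilation.
        if island.contains(&id) {
            continue;
        }

        for edge in block.successors.iter().filter(|edge| edge.to == entry) {
            for (&param, arg) in blocks[entry].params.iter().zip(&edge.args) {
                locals.entry(param).or_insert_with(|| arg.clone());
            }
        }
    }

    locals
}
```

If no external predecessor targets the entry block (the start-island case), the map stays empty and compilation proceeds as before.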




🌟 What is the purpose of this PR?
Implements the postgres compilation backend for HashQL. Takes the MIR control flow graph (after execution analysis has assigned basic blocks to backends and partitioned them into islands) and compiles the Postgres-assigned islands into SQL `SELECT` statements.

🔍 What does this change?

Postgres compiler (`eval/src/postgres/mod.rs`):
Top-level entry point. Compiles a `GraphRead` body island-by-island into a `PreparedQuery` (a `SelectStatement` + deduplicated `Parameters` list). Each Postgres island becomes a `CROSS JOIN LATERAL` subquery returning a `continuation` composite value. The continuation carries `filter` (keep/reject/passthrough), `block` (next basic block), and `locals`/`values` (live-out data for the interpreter to resume from).

Filter compiler (`eval/src/postgres/filter/`):
Walks the MIR basic blocks within an island and compiles each statement into SQL expressions. Uses an explicit frame stack (not recursion) to handle `SwitchInt` terminators: each branch becomes a `CASE WHEN` arm, with the discriminant cast to `::int` to avoid boolean/integer type mismatches in PostgreSQL. Out-of-island branches produce continuation values that encode which block to resume and what locals to carry.

Projections (`eval/src/postgres/projections.rs`):
Maps `EntityPath` variants to SQL column references or JSONB extraction expressions. Tracks which table joins are needed and only requests them when a path is actually referenced. Handles the split between "column-backed" paths (entity_uuid, web_id, etc.) and "JSONB-backed" paths (properties, type IDs).

Parameters (`eval/src/postgres/parameters.rs`):
Builds the `$1, $2, ...` parameter list for the prepared statement. Deduplicates by identity. Each `Parameter` variant represents a different source: `Input` (user-provided values), `Symbol`/`Primitive`/`Int` (query literals), `Env` (closure captures), `TemporalAxis` (execution context). The `CompiledQuery` return type exposes which indices correspond to which sources so the interpreter can bind them.

Continuation (`eval/src/postgres/continuation.rs`):
Builds the `ROW(filter, block, locals, values)::continuation` composite values that encode island exit state. Handles the three exit cases: passthrough (NULL continuation), filter-only (just a boolean), and full exit (block + live-out locals serialized as parallel int[]/jsonb[] arrays).

Traverse (`eval/src/postgres/traverse.rs`):
Compiles graph traversal requirements into SQL joins. Reads the island's `provides` set to determine which entity paths need table joins, then requests them from the database context layer.

Error infrastructure (`eval/src/postgres/error.rs`):
Diagnostic types for compilation errors (unsupported operations, type mismatches, missing paths) with span-accurate source locations.

Context (`eval/src/context.rs`):
`DatabaseContext` trait and implementation that the compiler uses to request table aliases, register joins, and access the schema. Bridges between the HashQL type system and the graph-store query builder.

Compiletest suite (`compiletest/src/suite/eval_postgres.rs`):
New compiletest suite that runs the full pipeline (parse, type-check, lower to MIR, run execution analysis, compile to SQL) and compares the output against blessed `.stdout` files. Also emits `.aux.mir` secondary outputs showing the MIR after execution analysis for debugging.

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?
This PR:

📜 Does this require a change to the docs?
The changes in this PR:

🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:

- `OFFSET 0` on lateral subqueries is a workaround for PostgreSQL inlining composites; see `continuation.rs` doc comments for details.

🛡 What tests cover this?
- (`filter/tests.rs`, ~1000 lines) using insta snapshots covering: straight-line blocks, branching CFGs, diamond merges, island exits, projections, property access, parameter deduplication, lateral subquery generation
- `eval/tests/ui/postgres/` covering end-to-end compilation: comparison operators, entity field access, input parameters, let bindings, if-expressions, nested branching, environment captures, list/dict/struct/tuple construction, multiple filters, mixed-source filters

❓ How to test this?