Skip to content

Conversation

@gatesn
Copy link
Contributor

@gatesn gatesn commented Nov 5, 2025

Experiment with new style of vtable.

Overview:

  • No more vtable macro.
  • "Encoding" struct is now combined with the vtable struct (we had an empty vtable struct before for no reason).
  • Instead of Arc, we have an Expression struct with vtable, opaque instance data, and children.
    • We may want this struct to be Arc'd also?
  • Because private data can be anything, simple expressions do not need a separate instance struct (e.g. Binary simply stores Operator enum as private data)
  • We retain benefits of vtables in that we can run pre/post logic around typed impl functions. We can also expose a smaller API surface on Expression than we do on VTable.

gatesn added 30 commits November 3, 2025 14:51
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
@gatesn gatesn added changelog/break A breaking API change experiment labels Nov 5, 2025
@codspeed-hq
Copy link

codspeed-hq bot commented Nov 5, 2025

CodSpeed Performance Report

Merging #5191 will degrade performances by 14.62%

Comparing ngates/expr-tree (a5a563e) with develop (45ff246)

Summary

❌ 4 regressions
✅ 1321 untouched
⏩ 148 skipped1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
pack_return_dtype[1000] 459.2 µs 534.2 µs -14.04%
pack_return_dtype[100] 50.9 µs 59.6 µs -14.62%
pack_return_dtype[2000] 916 µs 1,053.6 µs -13.06%
pack_return_dtype[500] 231.8 µs 269.8 µs -14.09%

Footnotes

  1. 148 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.


/// Hash and PartialEq are implemented based on the ptr of the value function, such that the
/// internal value doesn't impact the hash of an expression tree.
struct Rhs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we choose a better name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was here before, agree it's a bit weird but don't really want to change more than I need

Comment on lines +102 to +104
pub fn data(&self) -> &Arc<dyn Any + Send + Sync> {
&self.data
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what we have need calling metadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of...? Metadata sort of implied that it only held onto things that needed to be serialized. Instance data (as in, fat pointer of vtable + data), feels like it's more generic / familiar to the vtable structure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you expect to be in data?

Comment on lines 160 to 179
/// such as `x < (y < z)` or `x LIKE "needle%"`.
pub fn stat_falsification(&self, catalog: &mut dyn StatsCatalog) -> Option<Expression> {
self.vtable.as_dyn().stat_falsification(self, catalog)
}

/// An expression for the upper non-null bound of this expression, if available.
///
/// This function returns None if there is no upper bound or it is difficult to compute.
///
/// The returned expression evaluates to null if the maximum value is unknown. In that case, you
/// _must not_ assume the array is empty _nor_ may you assume the array only contains non-null
/// values.
pub fn max(&self, catalog: &mut dyn StatsCatalog) -> Option<Expression> {
self.vtable.as_dyn().max(self, catalog)
}

/// An expression for the lower non-null bound of this expression, if available.
///
/// See [AnalysisExpr::max] for important details.
pub fn min(&self, catalog: &mut dyn StatsCatalog) -> Option<Expression> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why add this to the expr, its will be a huge trait if we keep adding

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything that needs to be recursively called within a vtable implementation needs to be exposed on the public Expression API.

Agree these names are a bit weird / non-descriptive. Pulling them into a separate trait I don't think fixes that. Maybe stat_max() / stat_min() ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its more about having a trait for analysis stuff that is not on the expr itself

@codecov
Copy link

codecov bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 81.70664% with 328 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.96%. Comparing base (b157d2c) to head (a5a563e).
⚠️ Report is 5 commits behind head on develop.

Files with missing lines Patch % Lines
vortex-expr/src/exprs/dynamic.rs 1.80% 109 Missing ⚠️
vortex-expr/src/exprs/like.rs 54.76% 38 Missing ⚠️
vortex-expr/src/vtable.rs 81.44% 36 Missing ⚠️
vortex-expr/src/exprs/cast.rs 62.12% 25 Missing ⚠️
vortex-expr/src/expression.rs 86.66% 14 Missing ⚠️
vortex-expr/src/exprs/select.rs 89.24% 10 Missing ⚠️
vortex-expr/src/exprs/list_contains.rs 83.92% 9 Missing ⚠️
vortex-layout/src/layouts/row_idx/expr.rs 50.00% 9 Missing ⚠️
vortex-expr/src/exprs/is_null.rs 79.41% 7 Missing ⚠️
vortex-expr/src/exprs/pack.rs 93.75% 7 Missing ⚠️
... and 22 more

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@gatesn gatesn enabled auto-merge (squash) November 5, 2025 12:59
@gatesn gatesn merged commit 0962924 into develop Nov 5, 2025
44 of 45 checks passed
@gatesn gatesn deleted the ngates/expr-tree branch November 5, 2025 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/break A breaking API change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants