perf(mutate): skip dereferencing cols preserved in .mutate #11458

JonAnCla · 2025-07-14T16:26:52Z

Description of changes

Table.select() now includes a "private" kwarg _exprs_that_do_not_need_dereferencing which is an optional dict of values that do not need "dereferencing" via Table.bind - this should be used for values that are already known to be "dereferenced" to reduce cost of that step

_exprs_that_do_not_need_dereferencing is now passed by Table.mutate and contains columns on self, so that if a wide table is mutated, existing columns are not dereferenced, and only new ones are instead.

With this change, time needed to add a column to a wide table is reduced by about 50% for a 100 column table.

100 mutates on 100 column table before:

         14967009 function calls (14945093 primitive calls) in 7.366 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.028    0.028    7.366    7.366 2926429498.py:4(build_expr)
      200    0.014    0.000    7.129    0.036 relations.py:1866(mutate)
      200    0.025    0.000    5.164    0.026 relations.py:1961(select)
      400    0.050    0.000    4.276    0.011 relations.py:270(bind)
218900/216800    0.174    0.000    3.532    0.000 bases.py:52(__call__)
65600/65000    0.116    0.000    2.781    0.000 grounds.py:116(__create__)
      400    0.173    0.000    2.217    0.006 rewrites.py:63(from_targets)
    42700    0.194    0.000    2.079    0.000 graph.py:479(replace)
      600    0.042    0.000    1.908    0.003 relations.py:69(fields)
    21000    0.081    0.000    1.884    0.000 rewrites.py:248(rewrite_project_input)
65600/65300    0.201    0.000    1.815    0.000 annotations.py:459(validate)
    21700    0.035    0.000    1.653    0.000 rewrites.py:135(dereference)
    64800    0.062    0.000    1.219    0.000 graph.py:552(from_bfs)
    64800    0.316    0.000    1.098    0.000 graph.py:752(bfs_while)
67400/66900    0.330    0.000    0.842    0.000 grounds.py:193(__init__)
235100/234800    0.099    0.000    0.738    0.000 patterns.py:798(match)
    61600    0.066    0.000    0.689    0.000 relations.py:93(__init__)
     1500    0.067    0.000    0.648    0.000 patterns.py:1242(match)
2884549/2884545    0.584    0.000    0.638    0.000 {built-in method builtins.isinstance}
    42700    0.231    0.000    0.564    0.000 graph.py:642(toposort)

100 mutates on 100 column table after

         8570532 function calls (8548622 primitive calls) in 3.505 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      2/1    0.000    0.000    3.498    3.498 {built-in method builtins.exec}
      2/1    0.000    0.000    3.498    3.498 <string>:1(<module>)
      2/1    0.015    0.008    3.497    3.497 2926429498.py:4(build_expr)
      200    0.009    0.000    3.339    0.017 relations.py:1903(mutate)
76800/74700    0.062    0.000    2.606    0.000 bases.py:52(__call__)
65600/65000    0.105    0.000    2.501    0.000 grounds.py:116(__create__)
      400    0.008    0.000    2.381    0.006 relations.py:282(bind)
      400    0.155    0.000    1.935    0.005 rewrites.py:63(from_targets)
      600    0.038    0.000    1.735    0.003 relations.py:69(fields)
65600/65300    0.184    0.000    1.637    0.000 annotations.py:459(validate)
      200    0.010    0.000    1.537    0.008 relations.py:1999(select)
67400/66900    0.294    0.000    0.748    0.000 grounds.py:195(__init__)
235100/234800    0.090    0.000    0.658    0.000 patterns.py:801(match)
    61600    0.061    0.000    0.623    0.000 relations.py:93(__init__)
     1500    0.060    0.000    0.578    0.000 patterns.py:1245(match)
65601/65600    0.027    0.000    0.486    0.000 inspect.py:3268(bind)
65601/65600    0.300    0.000    0.459    0.000 inspect.py:3129(_bind)
   149300    0.173    0.000    0.375    0.000 core.py:44(__coerce__)
135700/134100    0.088    0.000    0.371    0.000 annotations.py:164(get_default)
     1400    0.004    0.000    0.304    0.000 rewrites.py:135(dereference)

Issues closed

Resolves perf: .mutate() on wide tables is slow #10956

- preserve order of fields in same way as existing implementation

jc-5s · 2025-07-14T16:32:20Z

sorry for clarity both accounts listed as committing here (hottwa, jc-5s) are me

kszucs · 2025-07-14T19:45:06Z

ibis/expr/types/relations.py

-        values = {**node.fields, **values}
-        return self.select(**values)
+        # kept fields from node.fields can skip dereferencing in .select() to improve performance
+        return self.select(_exprs_that_do_not_need_dereferencing=node.fields, **values)


Instead of passing a special reserved keyword argument we should rather duplicate the implementation of .select() which is actually fairly short. That way we have clear control over about what to dereference.

Something like

def mutate(self, *exprs, **mutations): from ibis.expr.rewrites import rewrite_project_input node = self.op() values = self.bind(*exprs, **mutations) values = unwrap_aliases(values) # we need to detect reductions which are either turned into window functions # or scalar subqueries depending on whether they are originating from self values = { k: rewrite_project_input(v, relation=self.op()) for k, v in values.items() } return ops.Project(self, {**node.fields, **values}).to_expr()

jc-5s · 2025-07-15T12:49:01Z

Thanks for reviewing so quickly & the comments which I've implemented. They have a bonus side effect which is that an additional call to .bind() within the removed call to .select(), which appears to be superfluous, is now omitted.

So runtime is now about 30% of original (see below for profile).

Around 50% of remaining time is spent in DerefMap.from_targets. @kszucs you suggested caching this on Tables here, which I would be in favour of if acceptable to ibis team? This would be beneficial to all table operations as well as .mutate

A typical use case we have is to run many relatively simple queries on a base table, so having the derefmap cached would be quite beneficial from our POV.

         5594429 function calls (5577914 primitive calls) in 2.248 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      2/1    0.000    0.000    2.242    2.242 {built-in method builtins.exec}
        1    0.000    0.000    2.242    2.242 <string>:1(<module>)
        1    0.013    0.013    2.240    2.240 2638618455.py:4(build_expr)
      200    0.010    0.000    2.121    0.011 relations.py:1903(mutate)
52700/51100    0.038    0.000    1.748    0.000 bases.py:52(__call__)
44300/43700    0.069    0.000    1.673    0.000 grounds.py:116(__create__)
      200    0.004    0.000    1.233    0.006 relations.py:282(bind)
      400    0.024    0.000    1.133    0.003 relations.py:69(fields)
44300/44000    0.122    0.000    1.072    0.000 annotations.py:459(validate)
      200    0.074    0.000    0.980    0.005 rewrites.py:63(from_targets)
45200/44800    0.196    0.000    0.518    0.000 grounds.py:195(__init__)
    41300    0.041    0.000    0.416    0.000 relations.py:93(__init__)
150600/150300    0.057    0.000    0.410    0.000 patterns.py:801(match)
     1000    0.039    0.000    0.365    0.000 patterns.py:1245(match)
44301/44300    0.018    0.000    0.321    0.000 inspect.py:3268(bind)
44301/44300    0.197    0.000    0.302    0.000 inspect.py:3129(_bind)
90700/89300    0.060    0.000    0.268    0.000 annotations.py:164(get_default)
    85900    0.100    0.000    0.224    0.000 core.py:44(__coerce__)
     1400    0.014    0.000    0.178    0.000 graph.py:479(replace)
      700    0.002    0.000    0.154    0.000 rewrites.py:135(dereference)

jc-5s · 2025-07-15T14:21:32Z

if Derefmaps are cached, then their memory requirements for a length D chain of operations with N fields is O(N*D^2), and each new derefmap on the chain takes incrementally more time to build than the previous one (that's already the case though), so that's probably not a great way forward

Another approach would be to only cache 1 layer depth of derefs on each table, and then do something like:

while expr.table != self:  # assumes expressions have some concept of the table they are bound to
    expr = self._deref_maps_by_src_table[expr.table][expr]

this would have following characteristics:

derefmaps would be quicker to build, only 1 layer depth DFS is needed for each table
they would be cached per table and cost N fields per table, so memory requirements would be better, O(N*D)
expressions from "closer" tables would be faster to dereference. expressions from tables N steps away would require N lookups
- but optimistically, hopefully users normally use "close" relations when writing queries rather than distant ones - though I have no evidence :)

would that be worth considering?

kszucs · 2025-07-15T15:55:51Z

ibis/expr/types/relations.py

        │ Adelie  │  2007 │       -7.22193 │
        └─────────┴───────┴────────────────┘
        """
+        from ibis.expr.rewrites import rewrite_project_input


Can you please add a comment here to keep the implementation of .mutate() and .select() in sync except for the **node.fields part? This might help preserving feature parity between the two methods.

kszucs · 2025-07-15T15:57:52Z

@jc-5s yes, we should find a better way to make dereferencing less resource hungry. Your described solution sounds good on principle but there are a lot of details we need to pay attention to. Feel free to put up a follow-up PR then I will take a look.

kszucs · 2025-07-15T15:58:07Z

ibis/expr/types/relations.py

            value = value.name(original.get_name())
            result.append(value)
-        return tuple(result)
+        return tuple(result)        


Extra trailing whitespace.

kszucs

LGTM from my side.

@cpcloud please merge when you have the chance.

jc-5s · 2025-07-15T16:23:05Z

thanks again, I've made the linting fixes and added comments to select and mutate about keeping them in sync

will come back to the DerefMap idea at another point as time allows :)

to fix the "conventional commits" issue, is it best to e.g. squash the PR and follow style e.g. here?
https://gist.github.com/qoomon/5dfcdf8eec66a051ecd85625518cfd13

Just LMK if anything else needed. Thanks!

cpcloud · 2025-07-16T10:51:56Z

Thanks @hottwaj and @kszucs!

jc-5s and others added 4 commits July 12, 2025 14:59

- WIP

7e4f24e

- simplify - changes to .bind are not needed

977d3f7

- preserve order of fields in same way as existing implementation

Merge branch 'ibis-project:main' into perf-bind-on-wide-tables

0bc3834

- dummy commit to try and trigger github actions :/

b824d1f

JonAnCla mentioned this pull request Jul 14, 2025

perf: .mutate() on wide tables is slow #10956

Closed

kszucs reviewed Jul 14, 2025

View reviewed changes

- undo changes to .select and move equivalent steps into .mutate

813f4ec

kszucs reviewed Jul 15, 2025

View reviewed changes

kszucs approved these changes Jul 15, 2025

View reviewed changes

- linting fixes

cdd6471

cpcloud changed the title ~~Perf: skip dereferencing cols preserved in .mutate~~ perf(mutate): skip dereferencing cols preserved in .mutate Jul 16, 2025

cpcloud merged commit 70de1a1 into ibis-project:main Jul 16, 2025
109 of 112 checks passed

cpcloud added the performance Issues related to ibis's performance label Jul 16, 2025

JonAnCla mentioned this pull request Aug 8, 2025

refactor(binding): make DerefMap computation lazy and support multiple value inputs #11540

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf(mutate): skip dereferencing cols preserved in .mutate #11458

perf(mutate): skip dereferencing cols preserved in .mutate #11458

Uh oh!

JonAnCla commented Jul 14, 2025

Uh oh!

jc-5s commented Jul 14, 2025

Uh oh!

kszucs Jul 14, 2025

Uh oh!

kszucs Jul 14, 2025

Uh oh!

jc-5s commented Jul 15, 2025

Uh oh!

jc-5s commented Jul 15, 2025

Uh oh!

kszucs Jul 15, 2025

Uh oh!

kszucs commented Jul 15, 2025

Uh oh!

kszucs Jul 15, 2025

Uh oh!

kszucs left a comment

Uh oh!

jc-5s commented Jul 15, 2025

Uh oh!

Uh oh!

cpcloud commented Jul 16, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

perf(mutate): skip dereferencing cols preserved in .mutate #11458

perf(mutate): skip dereferencing cols preserved in .mutate #11458

Uh oh!

Conversation

JonAnCla commented Jul 14, 2025

Description of changes

Issues closed

Uh oh!

jc-5s commented Jul 14, 2025

Uh oh!

kszucs Jul 14, 2025

Choose a reason for hiding this comment

Uh oh!

kszucs Jul 14, 2025

Choose a reason for hiding this comment

Uh oh!

jc-5s commented Jul 15, 2025

Uh oh!

jc-5s commented Jul 15, 2025

Uh oh!

kszucs Jul 15, 2025

Choose a reason for hiding this comment

Uh oh!

kszucs commented Jul 15, 2025

Uh oh!

kszucs Jul 15, 2025

Choose a reason for hiding this comment

Uh oh!

kszucs left a comment

Choose a reason for hiding this comment

Uh oh!

jc-5s commented Jul 15, 2025

Uh oh!

Uh oh!

cpcloud commented Jul 16, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants