-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: prevent use of leaf txn with volatile routines #113254
base: master
Are you sure you want to change the base?
Conversation
Note that when a UDF is called like this:
Concurrency already ends up being disabled because it gets planned in a values planNode, which disables the streamer. |
cd3195c
to
dc25912
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find! I have some comments.
Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)
-- commits
line 14 at r1:
Do you think this could also be a fix to #111097 which we couldn't repro?
pkg/sql/distsql_running.go
line 709 at r1 (raw file):
// then we are ignorant of the details of the execution plan, so we choose // to be on the safe side and mark 'noMutations' as 'false'. noMutations := planCtx.planner != nil && !planCtx.planner.curPlan.flags.IsSet(planFlagContainsMutation)
nit: I think we can remove noMutations
now because it's already included into canUseLeafTxn
.
pkg/sql/distsql_running.go
line 745 at r1 (raw file):
// processors to figure out whether any of them have concurrency. // // However, the concurrency requires the usage of LeafTxns which is
nit: this paragraph of the comment should be moved to be right above us looking planFlagContainsMutation
.
pkg/sql/distsql_running.go
line 769 at r1 (raw file):
containsNonDefaultLocking := planCtx.planner != nil && planCtx.planner.curPlan.flags.IsSet(planFlagContainsNonDefaultLocking) // We also currently disable the usage of the Streamer API whenever
nit: move this comment to be right above the just moved for
loop.
pkg/sql/plan.go
line 637 at r1 (raw file):
// concurrency. Note that planFlagUseRootTxn is not the only condition that // forces use of the root transaction. planFlagMustUseRootTxn
This new flag should also be checked in NewPlanningCtxWithOracle
.
pkg/sql/opt/exec/execbuilder/relational.go
line 3169 at r1 (raw file):
// Volatile routines step the transaction, and routines with an exception // block use internal savepoints. if udf.Def.Volatility == volatility.Volatile || udf.Def.ExceptionBlock != nil {
nit: the indentation seems a bit off.
pkg/sql/opt/props/func_dep.go
line 667 at r1 (raw file):
// ComputeClosureNoCopy is similar to ComputeClosure, but avoids allocations // when it is safe to mutate the given ColSet. func (f *FuncDepSet) ComputeClosureNoCopy(cols opt.ColSet) opt.ColSet {
nit: I don't see the new function being used.
pkg/sql/logictest/testdata/logic_test/udf
line 767 at r1 (raw file):
# has a volatile UDF. query T EXPLAIN ANALYZE
nit: we try to keep EXPLAIN ANALYZE stmts in execbuilder tests. Alternatively, consider not printing out the whole thing, but doing a filter message LIKE 'lookup join'
which doesn't have ((streamer)
) in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once you address Yahor's comments.
Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @yuzefovich)
pkg/sql/logictest/testdata/logic_test/udf
line 767 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: we try to keep EXPLAIN ANALYZE stmts in execbuilder tests. Alternatively, consider not printing out the whole thing, but doing a filter
message LIKE 'lookup join'
which doesn't have ((streamer)
) in it.
+1 to moving this to an execbuilder test
TODO: add assertion that we are using a root txn if stepping. |
dc25912
to
8248234
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added test-only assertions that we are using a RootTxn when we step the txn as well as when we handle an exception.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Do you think this could also be a fix to #111097 which we couldn't repro?
Hm, I don't think so - all the routines in that file seem to contain mutations, so I'd expect planFlagContainsMutation
to be set.
pkg/sql/distsql_running.go
line 709 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think we can remove
noMutations
now because it's already included intocanUseLeafTxn
.
Done.
pkg/sql/distsql_running.go
line 713 at r1 (raw file):
// canUseLeafTxn indicates whether the plan contains an expression that // cannot tolerate a concurrent leaf transaction. canUseLeafTxn := true
@yuzefovich I've moved all the "can use LeafTxn" logic up to here, and added a test-only assertion that localState.MustUseLeafTxn()
is false if !canUseLeafTxn
.
pkg/sql/distsql_running.go
line 745 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: this paragraph of the comment should be moved to be right above us looking
planFlagContainsMutation
.
Done.
pkg/sql/distsql_running.go
line 769 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: move this comment to be right above the just moved
for
loop.
Done.
pkg/sql/plan.go
line 637 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
This new flag should also be checked in
NewPlanningCtxWithOracle
.
Done.
pkg/sql/opt/exec/execbuilder/relational.go
line 3169 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: the indentation seems a bit off.
Done.
pkg/sql/opt/props/func_dep.go
line 667 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I don't see the new function being used.
Oops, removed that change.
pkg/sql/logictest/testdata/logic_test/udf
line 767 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
+1 to moving this to an execbuilder test
Done.
8248234
to
70b2e65
Compare
Hm I'm seeing |
This patch disallows leaf transactions during execution for two cases: 1. The plan contains a volatile routine. 2. The plan contains a routine with a PLpgSQL exception handler. This is necessary because volatile routines use transaction stepping, and exception handlers use internal savepoints. Both of these cases are incompatible with concurrent leaf transactions, and may cause internal errors is not handled. Note that (1) implies (2) currently, but (1) may be relaxed in the future, so we explicitly prevent both to be defensive. Fixes cockroachdb#112188 Release note (bug fix): Fixed a rare bug that could result in an internal error when executing a volatile UDF in a query along with a lookup join.
70b2e65
to
2130596
Compare
It looks like this is related to things like index back-filling, which don't set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but the build is still red.
Reviewed 12 of 13 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)
pkg/sql/distsql_running.go
line 713 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
@yuzefovich I've moved all the "can use LeafTxn" logic up to here, and added a test-only assertion that
localState.MustUseLeafTxn()
is false if!canUseLeafTxn
.
nit: we already have a related assertion at the bottom of Run
method - consider moving it there.
pkg/sql/opt/exec/execbuilder/testdata/udf
line 389 at r3 (raw file):
# has a volatile UDF. query T EXPLAIN ANALYZE
nit: since we're only interested in one line (whether we have lookup join
or lookup join (streamer)
), we could only count the number of rows matching the necessary filter to avoid printing out the whole thing. (Just saw that I left a similar comment last time - up to you.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 12 of 13 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball)
-- commits
line 11 at r3:
nit: fix the typo "... may cause internal errors is not handled".
pkg/sql/routine.go
line 273 at r3 (raw file):
if buildutil.CrdbTestBuild { if txn.Type() != kv.RootTxn { return errors.AssertionFailedf("routine: cannot step LeafTxn")
Should we return this error in non-test builds too, since it will likely fail later on in execution?
pkg/sql/opt/exec/execbuilder/relational.go
line 3374 at r3 (raw file):
// block use internal savepoints. if udf.Def.Volatility == volatility.Volatile || udf.Def.ExceptionBlock != nil { b.flags.Set(exec.PlanFlagMustUseRootTxn)
Below we always pass enableStepping=true
, since procedures have no user-defined volatility, and I believe statements within a procedure can always observe side effects of previously evaluated statements. So I think we should unconditionally set this plan flag here.
pkg/sql/opt/exec/execbuilder/scalar.go
line 957 at r3 (raw file):
// Volatile routines step the transaction, and routines with an exception // block use internal savepoints.
nit: move this below next to enableStepping := ...
. Or maybe a better option is to set the flag in the if udf.Def.ExceptionBlock { ... }
block below and also after enableStepping := ...
in an if enableStepping { ... }
block.
pkg/sql/opt/exec/execbuilder/scalar.go
line 980 at r3 (raw file):
// see mutations made by the invoking statement and by previously executed // statements. enableStepping := udf.Def.Volatility == volatility.Volatile
nit: move this below the if udf.Def.ExceptionBlock != nil { ... }
block since it's not used for that code path.
nit: It might be easier to see that we don't need to set the flag in this case (because |
@DrewKimball is this something we need to pick back up and get into 24.1? |
We should backport this, but it won't be a regression in 24.1, so not urgent. |
@DrewKimball How important do you think this is? Happy to work together to get it across the finish line. |
This patch disallows leaf transactions during execution for two cases:
This is necessary because volatile routines use transaction stepping, and exception handlers use internal savepoints. Both of these cases are incompatible with concurrent leaf transactions, and may cause internal errors is not handled. Note that (1) implies (2) currently, but (1) may be relaxed in the future, so we explicitly prevent both to be defensive.
Fixes #112188
Release note (bug fix): Fixed a rare bug that could result in an internal error when executing a volatile UDF in a query along with a lookup join.