-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Panic when running UNION on SHOW BACKUP queries as non-root user #88385
Comments
Thanks for reporting this issue, it's a very interesting bug. |
I think the fundamental problem here is that:
In any case synchronizing the goroutines to make sure they run in a mutually-exclusive manner for each transaction seems to be the correct thing to do. |
Okay, this one is good! The root cause, in effect, is the introduction in 22.2 to check the privilege on the cloud URL here: cockroach/pkg/ccl/backupccl/show.go Lines 279 to 281 in ef4656f
The problem is that this privilege check happens in the returned callback and not in the setup function. The returned callback is called in parallel with other parts of execution, and is not safe to concurrently resolve descriptors. Resolving descriptors needs to happen during the planning part of the plan hook. Now, there's some complexity here in two dimensions:
If we did that, then we'd be accessing the collection during planning time and not execution time, as it was intended. I don't have enough context above on how these |
cc @cockroachdb/disaster-recovery |
I believe this is impossible: URIs can appear in placeholders, so we don't have their values until eval? |
It seems to me like if we lease the relevant descriptors (which are known and finite) then, if I understand correctly, we won't need to resolve anything new during execution. |
How do we know the descriptors though? like, just lease all possible synthetic / privilege descriptors? Until we look at the URI, we don't know if it is a named sink, or a userfile? If it ends up userfile, we need to do priv checks on the table being written to, which can be any table name so it seems like we'd need to have leased every table? |
The synthetic privileges aren't themselves descriptors. They are data in a table and there's a cache associated with that data. We bump the table descriptor version for the table when we write to it in order to achieve cache coherence. This is the same pattern followed also for users and user_roles. There's a constant number of descriptors to lease (2?) |
But what if the uri is to a userfile location? Until we parse the host we don't even know what table name to resolve let along priv check |
We always have to resolve the same two tables |
Okay, that's a problem. I think @dt is noting that the issue here is that we also need to resolve the userfile table. I don't really know how that happens. Fundamentally, we have an issue here. The planning layer is not safe for concurrent access and the catalog is part of the planning layer. |
The way that the optimizer works here is that it stores all of the dependencies in its prepared plan and then it checks those dependencies and re-plans in the context of the planner. I think the good, or bad, news is that we already had the userfile bug and it's pretty deep, and people don't seem to use userfile much, let alone in the context of a |
Prior to this change, plan hooks would be invokved twice, once during prepare and again during execution. At prepare time, the placeholder values are not known, but, critically, the placeholders were being type-checked. This type checking informs the sql layer how to interpret the placeholder values when we are trying to bind them. Importantly, the sql layer does not ever use the `planNode` produced via the `PlanHookRowFn` returned from the planning invocation. The above dance meant that these plan hooks had quite a bit of complexity to eagerly type check but lazily evaluate expressions passed in via the statement. This complexity became a problem more recently: one cannot and should not access many planner data structures from the row func because such functions can, theoretically, run concurrently and these data structures are not thread safe. Note that the above problem is not commonly a real problem. Most plan hook statements don't permit concurrency and those than do, via, say, `UNION ALL` don't commonly get executed in such a setting. In some ways, the real reason for this change is to unwind some unneeded complexity and clarify the bounds of what one can do in a plan hook. The big change which unearthed the problem is that we now support privileges on external URLs. These URLs may be provided via placeholders or expressions. If we couldn't access the expressions until we were in the row fn, then we'd be at risk of performing privilege checking concurrently, which is not safe. One thing that this commit does is introduce a new package `evalexpr` with helpers for type checking. It is so named because I'd like to also remove the methods from the `PlanHookState` which we now use for expr evaluation and place them in this new package. I don't love the name. Fixes cockroachdb#88385 Release note: None
Prior to this change, plan hooks would be invokved twice, once during prepare and again during execution. At prepare time, the placeholder values are not known, but, critically, the placeholders were being type-checked. This type checking informs the sql layer how to interpret the placeholder values when we are trying to bind them. Importantly, the sql layer does not ever use the `planNode` produced via the `PlanHookRowFn` returned from the planning invocation. The above dance meant that these plan hooks had quite a bit of complexity to eagerly type check but lazily evaluate expressions passed in via the statement. This complexity became a problem more recently: one cannot and should not access many planner data structures from the row func because such functions can, theoretically, run concurrently and these data structures are not thread safe. Note that the above problem is not commonly a real problem. Most plan hook statements don't permit concurrency and those than do, via, say, `UNION ALL` don't commonly get executed in such a setting. In some ways, the real reason for this change is to unwind some unneeded complexity and clarify the bounds of what one can do in a plan hook. The big change which unearthed the problem is that we now support privileges on external URLs. These URLs may be provided via placeholders or expressions. If we couldn't access the expressions until we were in the row fn, then we'd be at risk of performing privilege checking concurrently, which is not safe. One thing that this commit does is introduce a new package `evalexpr` with helpers for type checking. It is so named because I'd like to also remove the methods from the `PlanHookState` which we now use for expr evaluation and place them in this new package. I don't love the name. Fixes cockroachdb#88385 Release note: None
Prior to this change, plan hooks would be invokved twice, once during prepare and again during execution. At prepare time, the placeholder values are not known, but, critically, the placeholders were being type-checked. This type checking informs the sql layer how to interpret the placeholder values when we are trying to bind them. Importantly, the sql layer does not ever use the `planNode` produced via the `PlanHookRowFn` returned from the planning invocation. The above dance meant that these plan hooks had quite a bit of complexity to eagerly type check but lazily evaluate expressions passed in via the statement. This complexity became a problem more recently: one cannot and should not access many planner data structures from the row func because such functions can, theoretically, run concurrently and these data structures are not thread safe. Note that the above problem is not commonly a real problem. Most plan hook statements don't permit concurrency and those than do, via, say, `UNION ALL` don't commonly get executed in such a setting. In some ways, the real reason for this change is to unwind some unneeded complexity and clarify the bounds of what one can do in a plan hook. The big change which unearthed the problem is that we now support privileges on external URLs. These URLs may be provided via placeholders or expressions. If we couldn't access the expressions until we were in the row fn, then we'd be at risk of performing privilege checking concurrently, which is not safe. One thing that this commit does is introduce a new package `exprutil` with helpers for type checking. It is so named because I'd like to also remove the methods from the `PlanHookState` which we now use for expr evaluation and place them in this new package. Fixes cockroachdb#88385 Release note: None
Prior to this change, plan hooks would be invokved twice, once during prepare and again during execution. At prepare time, the placeholder values are not known, but, critically, the placeholders were being type-checked. This type checking informs the sql layer how to interpret the placeholder values when we are trying to bind them. Importantly, the sql layer does not ever use the `planNode` produced via the `PlanHookRowFn` returned from the planning invocation. The above dance meant that these plan hooks had quite a bit of complexity to eagerly type check but lazily evaluate expressions passed in via the statement. This complexity became a problem more recently: one cannot and should not access many planner data structures from the row func because such functions can, theoretically, run concurrently and these data structures are not thread safe. Note that the above problem is not commonly a real problem. Most plan hook statements don't permit concurrency and those than do, via, say, `UNION ALL` don't commonly get executed in such a setting. In some ways, the real reason for this change is to unwind some unneeded complexity and clarify the bounds of what one can do in a plan hook. The big change which unearthed the problem is that we now support privileges on external URLs. These URLs may be provided via placeholders or expressions. If we couldn't access the expressions until we were in the row fn, then we'd be at risk of performing privilege checking concurrently, which is not safe. One thing that this commit does is introduce a new package `exprutil` with helpers for type checking. It is so named because I'd like to also remove the methods from the `PlanHookState` which we now use for expr evaluation and place them in this new package. Fixes cockroachdb#88385 Release note: None
90254: sql,*: rework plan hooks to type check separately from execution r=ajwerner a=ajwerner Prior to this change, plan hooks would be invoked twice, once during prepare and again during execution. At prepare time, the placeholder values are not known, but, critically, the placeholders were being type-checked. This type checking informs the sql layer how to interpret the placeholder values when we are trying to bind them. Importantly, the sql layer does not ever use the `planNode` produced via the `PlanHookRowFn` returned from the planning invocation. The above dance meant that these plan hooks had quite a bit of complexity to eagerly type check but lazily evaluate expressions passed in via the statement. This complexity became a problem more recently: one cannot and should not access many planner data structures from the row func because such functions can, theoretically, run concurrently and these data structures are not thread safe. Note that the above problem is not commonly a real problem. Most plan hook statements don't permit concurrency and those than do, via, say, `UNION ALL` don't commonly get executed in such a setting. In some ways, the real reason for this change is to unwind some unneeded complexity and clarify the bounds of what one can do in a plan hook. The big change which unearthed the problem is that we now support privileges on external URLs. These URLs may be provided via placeholders or expressions. If we couldn't access the expressions until we were in the row fn, then we'd be at risk of performing privilege checking concurrently, which is not safe. One thing that this commit does is introduce a new package `exprutil` with helpers for type checking. It is so named because I'd like to also remove the methods from the `PlanHookState` which we now use for expr evaluation and place them in this new package. Fixes #88385 Release note: None Co-authored-by: Andrew Werner <awerner32@gmail.com>
Prior to this change, plan hooks would be invokved twice, once during prepare and again during execution. At prepare time, the placeholder values are not known, but, critically, the placeholders were being type-checked. This type checking informs the sql layer how to interpret the placeholder values when we are trying to bind them. Importantly, the sql layer does not ever use the `planNode` produced via the `PlanHookRowFn` returned from the planning invocation. The above dance meant that these plan hooks had quite a bit of complexity to eagerly type check but lazily evaluate expressions passed in via the statement. This complexity became a problem more recently: one cannot and should not access many planner data structures from the row func because such functions can, theoretically, run concurrently and these data structures are not thread safe. Note that the above problem is not commonly a real problem. Most plan hook statements don't permit concurrency and those than do, via, say, `UNION ALL` don't commonly get executed in such a setting. In some ways, the real reason for this change is to unwind some unneeded complexity and clarify the bounds of what one can do in a plan hook. The big change which unearthed the problem is that we now support privileges on external URLs. These URLs may be provided via placeholders or expressions. If we couldn't access the expressions until we were in the row fn, then we'd be at risk of performing privilege checking concurrently, which is not safe. One thing that this commit does is introduce a new package `exprutil` with helpers for type checking. It is so named because I'd like to also remove the methods from the `PlanHookState` which we now use for expr evaluation and place them in this new package. Fixes cockroachdb#88385 Release note: None
Prior to this change, plan hooks would be invokved twice, once during prepare and again during execution. At prepare time, the placeholder values are not known, but, critically, the placeholders were being type-checked. This type checking informs the sql layer how to interpret the placeholder values when we are trying to bind them. Importantly, the sql layer does not ever use the `planNode` produced via the `PlanHookRowFn` returned from the planning invocation. The above dance meant that these plan hooks had quite a bit of complexity to eagerly type check but lazily evaluate expressions passed in via the statement. This complexity became a problem more recently: one cannot and should not access many planner data structures from the row func because such functions can, theoretically, run concurrently and these data structures are not thread safe. Note that the above problem is not commonly a real problem. Most plan hook statements don't permit concurrency and those than do, via, say, `UNION ALL` don't commonly get executed in such a setting. In some ways, the real reason for this change is to unwind some unneeded complexity and clarify the bounds of what one can do in a plan hook. The big change which unearthed the problem is that we now support privileges on external URLs. These URLs may be provided via placeholders or expressions. If we couldn't access the expressions until we were in the row fn, then we'd be at risk of performing privilege checking concurrently, which is not safe. One thing that this commit does is introduce a new package `exprutil` with helpers for type checking. It is so named because I'd like to also remove the methods from the `PlanHookState` which we now use for expr evaluation and place them in this new package. Fixes cockroachdb#88385 Release note: None
Describe the problem
If a query issued by a non-root user UNIONs the result of multiple SHOW BACKUP queries, then CRDB crashes with an exit code of 2.
To Reproduce
Expected behavior
There should be no crash!
Additional data / screenshots
Log directory: https://upload.cockroachlabs.com/receive/?packageCode=M1KruQE98L65Cm1oy6GPshIXN29XnAdZeaRNvTreTgw#keycode=GgD909dEchER8yB1YbmTUiksTlrL24aaRknToo7MrXk
Environment:
Additional context
This was discovered while running some performance tests on backup queries.
Jira issue: CRDB-19797
The text was updated successfully, but these errors were encountered: