-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add SQL queries support in /v1/sql endpoint #9301
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #9301 +/- ##
==========================================
- Coverage 83.75% 83.75% -0.01%
==========================================
Files 229 229
Lines 82604 82613 +9
==========================================
+ Hits 69186 69191 +5
- Misses 13418 13422 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ba68574
to
29970e1
Compare
@@ -1272,6 +1295,25 @@ class ApiGateway { | |||
return [queryType, normalizedQueries, queryNormalizationResult.map((it) => remapToQueryAdapterFormat(it.normalizedQuery))]; | |||
} | |||
|
|||
public async sql4sql({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I'm not sure.
I just copy-pasted public async sql({
, but I see no reason to keep this public
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, these methods are public
because it can be called from SubscriptionServer
, it's used in websocket
protocol.
@@ -425,6 +631,60 @@ fn exec_sql(mut cx: FunctionContext) -> JsResult<JsValue> { | |||
Ok(promise.upcast::<JsValue>()) | |||
} | |||
|
|||
fn sql4sql(mut cx: FunctionContext) -> JsResult<JsValue> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you mind moving all these functions to sub module (like it was done for orchestrator/planner/python)? - To keep this file clean and hold only kind of entry points, register interfaces, while having all the handlers in a separate sub modules.
"sql": "SELECT | ||
sum(\\"orders\\".amount) \\"total\\" | ||
FROM | ||
( | ||
select 1 as id, 100 as amount, 'new' status, '2024-01-01'::timestamptz created_at | ||
UNION ALL | ||
select 2 as id, 200 as amount, 'new' status, '2024-01-02'::timestamptz created_at | ||
UNION ALL | ||
select 3 as id, 300 as amount, 'processed' status, '2024-01-03'::timestamptz created_at | ||
UNION ALL | ||
select 4 as id, 500 as amount, 'processed' status, '2024-01-04'::timestamptz created_at | ||
UNION ALL | ||
select 5 as id, 600 as amount, 'shipped' status, '2024-01-05'::timestamptz created_at | ||
) AS \\"orders\\" WHERE (\\"orders\\".status = $1)", | ||
"values": Array [ | ||
"foo", | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mcheshkov I think we're breaking the existing contract here. I thought that sql.sql
will be a two element array, with first element being a string, second being an array of parameters.
Here's what /sql
currently returns:
{
"sql": {
"sql": [
"SELECT\n `products`.product_category `orders__product_category`, date_trunc('year', from_utc_timestamp(`base_orders`.created_at::TIMESTAMP, 'UTC')) `orders__created_at_year`\n FROM\n hive_metastore.default.orders AS `base_orders`\nLEFT JOIN hive_metastore.default.line_items AS `line_items` ON `base_orders`.id = `line_items`.order_id\nLEFT JOIN hive_metastore.default.products AS `products` ON `line_items`.product_id = `products`.id WHERE (`base_orders`.created_at::TIMESTAMP >= from_utc_timestamp(replace(replace(?, 'T', ' '), 'Z', ''), 'UTC') AND `base_orders`.created_at::TIMESTAMP <= from_utc_timestamp(replace(replace(?, 'T', ' '), 'Z', ''), 'UTC')) GROUP BY 1, 2 ORDER BY 2 ASC LIMIT 10000",
[
"2025-03-06T00:00:00.000Z",
"2025-03-06T23:59:59.999Z"
]
],
// skipped
The current way is definitely ugly but I bet we should not break it for no good reason.
dd2efc4
to
ebca109
Compare
ebca109
to
6e68b86
Compare
Check List
Description of Changes Made (if issue reference is not provided)
/v1/sql
can now generate SQL for SQL API queries. It errors out for meta-only queries (likeCREATE TEMPORARY TABLE
) and post-processing queries (likeSELECT version();
), and works only when logical plan root is eitherCubeScan
orCubeScanWrappedSql