feat: Add SQL queries support in /v1/sql endpoint #9301

Merged
merged 7 commits into from
Mar 8, 2025

Member

@mcheshkov mcheshkov commented Mar 4, 2025

Check List

  • Tests have been run in packages where changes were made, if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not already covered
  • Docs have been added / updated if required

Description of Changes Made (if issue reference is not provided)

/v1/sql can now generate SQL for SQL API queries. It returns an error for meta-only queries (like CREATE TEMPORARY TABLE) and post-processing queries (like SELECT version();), and it works only when the root of the logical plan is either CubeScan or CubeScanWrappedSql.
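
The rule described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the `PlanRoot` type, the `MetaOnly` and `PostProcessing` kinds, the `classifyPlanRoot` function, and the error strings are all hypothetical names introduced here. Only `CubeScan` and `CubeScanWrappedSql` come from the description.

```typescript
// Hypothetical model of what the logical plan root can be.
// Only CubeScan and CubeScanWrappedSql are named in the PR description;
// the other two kinds are assumptions used to model the rejected cases.
type PlanRoot =
  | { kind: 'CubeScan' }
  | { kind: 'CubeScanWrappedSql' }
  | { kind: 'MetaOnly' } // e.g. CREATE TEMPORARY TABLE
  | { kind: 'PostProcessing' }; // e.g. SELECT version();

// Hypothetical outcome type: either SQL can be generated, or an error is reported.
type Sql4SqlOutcome =
  | { status: 'ok' }
  | { status: 'error'; error: string };

// Decide whether SQL can be generated for a query with the given plan root.
function classifyPlanRoot(root: PlanRoot): Sql4SqlOutcome {
  switch (root.kind) {
    case 'CubeScan':
    case 'CubeScanWrappedSql':
      // The only two roots for which SQL generation is supported.
      return { status: 'ok' };
    case 'MetaOnly':
      return { status: 'error', error: 'meta-only queries are not supported' };
    case 'PostProcessing':
      return { status: 'error', error: 'post-processing queries are not supported' };
  }
}
```

Modeling the outcome as a tagged union makes callers handle the error case explicitly, which matches the "errors out" wording of the description.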

codecov bot commented Mar 4, 2025

Codecov Report

Attention: Patch coverage is 68.18182% with 7 lines in your changes missing coverage. Please review.

Project coverage is 83.75%. Comparing base (894cfd8) to head (a8708e3).
Report is 5 commits behind head on master.

Files with missing lines                      Patch %   Lines
rust/cubesql/cubesql/src/compile/plan.rs      55.55%    4 Missing ⚠️
rust/cubesql/cubesql/src/compile/parser.rs    72.72%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9301      +/-   ##
==========================================
- Coverage   83.75%   83.75%   -0.01%     
==========================================
  Files         229      229              
  Lines       82604    82613       +9     
==========================================
+ Hits        69186    69191       +5     
- Misses      13418    13422       +4     
Flag      Coverage Δ
cubesql   83.75% <68.18%> (-0.01%) ⬇️


@mcheshkov mcheshkov force-pushed the sql-endpoint-for-sql-api branch from ba68574 to 29970e1 Compare March 4, 2025 20:43
@mcheshkov mcheshkov marked this pull request as ready for review March 5, 2025 10:08
@mcheshkov mcheshkov requested review from a team as code owners March 5, 2025 10:08
@@ -1272,6 +1295,25 @@ class ApiGateway {
return [queryType, normalizedQueries, queryNormalizationResult.map((it) => remapToQueryAdapterFormat(it.normalizedQuery))];
}

public async sql4sql({
Member

Should this be private?

Member Author

TBH, I'm not sure.
I just copy-pasted the public async sql({ signature, but I see no reason to keep this one public.

Member

AFAIK, these methods are public because they can be called from SubscriptionServer, which is used for the WebSocket protocol.

@@ -425,6 +631,60 @@ fn exec_sql(mut cx: FunctionContext) -> JsResult<JsValue> {
Ok(promise.upcast::<JsValue>())
}

fn sql4sql(mut cx: FunctionContext) -> JsResult<JsValue> {
Member

Would you mind moving all these functions to a submodule (as was done for orchestrator/planner/python)? That would keep this file clean, holding only the entry points and interface registration, with all the handlers in separate submodules.

Comment on lines 137 to 147
"sql": "SELECT
sum(\\"orders\\".amount) \\"total\\"
FROM
(
select 1 as id, 100 as amount, 'new' status, '2024-01-01'::timestamptz created_at
UNION ALL
select 2 as id, 200 as amount, 'new' status, '2024-01-02'::timestamptz created_at
UNION ALL
select 3 as id, 300 as amount, 'processed' status, '2024-01-03'::timestamptz created_at
UNION ALL
select 4 as id, 500 as amount, 'processed' status, '2024-01-04'::timestamptz created_at
UNION ALL
select 5 as id, 600 as amount, 'shipped' status, '2024-01-05'::timestamptz created_at
) AS \\"orders\\" WHERE (\\"orders\\".status = $1)",
"values": Array [
"foo",
],
Member

@igorlukanin igorlukanin Mar 6, 2025

@mcheshkov I think we're breaking the existing contract here. I expected sql.sql to be a two-element array, with the first element being the SQL string and the second an array of parameters.

Here's what /sql currently returns:

{
  "sql": {
    "sql": [
      "SELECT\n      `products`.product_category `orders__product_category`, date_trunc('year', from_utc_timestamp(`base_orders`.created_at::TIMESTAMP, 'UTC')) `orders__created_at_year`\n    FROM\n      hive_metastore.default.orders AS `base_orders`\nLEFT JOIN hive_metastore.default.line_items AS `line_items` ON `base_orders`.id = `line_items`.order_id\nLEFT JOIN hive_metastore.default.products AS `products` ON `line_items`.product_id = `products`.id  WHERE (`base_orders`.created_at::TIMESTAMP >= from_utc_timestamp(replace(replace(?, 'T', ' '), 'Z', ''), 'UTC') AND `base_orders`.created_at::TIMESTAMP <= from_utc_timestamp(replace(replace(?, 'T', ' '), 'Z', ''), 'UTC')) GROUP BY 1, 2 ORDER BY 2 ASC LIMIT 10000",
      [
        "2025-03-06T00:00:00.000Z",
        "2025-03-06T23:59:59.999Z"
      ]
    ],
// skipped

The current way is definitely ugly but I bet we should not break it for no good reason.
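
To make the two shapes in this comment concrete, here is a minimal sketch of both and a conversion between them. The type and function names (`LegacySqlTuple`, `Sql4SqlResult`, `toLegacyTuple`, `fromLegacyTuple`) are hypothetical and are not part of the codebase. The thread as shown does not say which shape was ultimately adopted, so this only illustrates the difference being discussed.

```typescript
// Shape of sql.sql as currently returned by /sql, per the example above:
// a two-element array of [SQL string, parameters].
type LegacySqlTuple = [string, unknown[]];

// Shape seen in the snapshot under review: an object with "sql" and "values".
interface Sql4SqlResult {
  sql: string;
  values: unknown[];
}

// Convert the object shape into the existing two-element array shape.
function toLegacyTuple(result: Sql4SqlResult): LegacySqlTuple {
  return [result.sql, result.values];
}

// Convert the existing two-element array shape into the object shape.
function fromLegacyTuple([sql, values]: LegacySqlTuple): Sql4SqlResult {
  return { sql, values };
}

// Example usage with a shortened, made-up query string.
const snapshotLike: Sql4SqlResult = {
  sql: 'SELECT sum(amount) FROM orders WHERE status = $1',
  values: ['foo'],
};
const legacy: LegacySqlTuple = toLegacyTuple(snapshotLike);
```

Both shapes carry the same information, so the conversion is lossless in either direction; the concern raised here is only about which shape existing clients expect.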

@mcheshkov mcheshkov merged commit 7eba663 into master Mar 8, 2025
70 checks passed
@mcheshkov mcheshkov deleted the sql-endpoint-for-sql-api branch March 8, 2025 10:06
5 participants