Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue: Support influxql raw query #698

Closed
19 tasks done
Rachelint opened this issue Mar 6, 2023 · 5 comments
Closed
19 tasks done

Tracking Issue: Support influxql raw query #698

Rachelint opened this issue Mar 6, 2023 · 5 comments
Labels
feature New feature or request

Comments

@Rachelint
Copy link
Contributor

Rachelint commented Mar 6, 2023

Describe This Problem

We plan to support influxql mainly by converting it to sql in the statement level.

So the first thing to do is to process the difference between influxql and sql.

Difference is listed as following:

General

  • Only string type tag, and string、int64、uint64、float64、boolean type field allowed.

Projection clause

select *::tag;
select * group by *

no tags will be in the project in the result.
  • Regex and special wilcards in function call.
  • All above in the subquery.

From clause

  • Support regex.
  • Support from multiple measurements(Not supported in short term).

Group by clause

select a from t group by *::tag / *
  • Group by without aggregation.

Order by clause

Proposal

  • We should process the wildcard and regex by rewriting the influxql statement.
  • Then we convert it to sql statement.
  • Finally, throw it into datafusion.

Additional Context

@jiacai2050
Copy link
Contributor

Other difference I found

Project clause

Column name can be quoted in double quotes

SELECT "level description","location","water_level" FROM "h2o_feet"

It will output string directly in SQL.

Source: https://docs.influxdata.com/influxdb/v1.8/query_language/explore-data/

@jiacai2050
Copy link
Contributor

jiacai2050 commented Mar 7, 2023

As for time() in groupby

> SELECT COUNT("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m),"location"

name: h2o_feet
tags: location=coyote_creek
time                   count
----                   -----
2015-08-18T00:00:00Z   2
2015-08-18T00:12:00Z   2
2015-08-18T00:24:00Z   2

name: h2o_feet
tags: location=santa_monica
time                   count
----                   -----
2015-08-18T00:00:00Z   2
2015-08-18T00:12:00Z   2
2015-08-18T00:24:00Z   2

We can rewrite it to SQL, something like

SELECT
  to_timestamp_millis(DATE_BIN (INTERVAL '12' minute, timestamp, 0::TIMESTAMP) ) AS groupby_time,
  location,
  COUNT(`water_level`)  
FROM
    "h2o_feet"
WHERE
    timestamp >= '2015-08-18T00:00:00Z'
    AND timestamp <= '2015-08-18T00:30:00Z'
GROUP BY
    groupby_time,
    location;
    

Plan to write an optimize rule to do this.

Question: does sql parser support time(12m) in group by clause?

@Rachelint
Copy link
Contributor Author

As for time() in groupby

* https://docs.influxdata.com/influxdb/v1.8/query_language/explore-data/#group-query-results-into-12-minutes-intervals-and-by-a-tag-key
> SELECT COUNT("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m),"location"

name: h2o_feet
tags: location=coyote_creek
time                   count
----                   -----
2015-08-18T00:00:00Z   2
2015-08-18T00:12:00Z   2
2015-08-18T00:24:00Z   2

name: h2o_feet
tags: location=santa_monica
time                   count
----                   -----
2015-08-18T00:00:00Z   2
2015-08-18T00:12:00Z   2
2015-08-18T00:24:00Z   2

We can rewrite it to SQL, something like

SELECT
  to_timestamp_millis(DATE_BIN (INTERVAL '12' minute, timestamp, 0::TIMESTAMP) ) AS groupby_time,
  location,
  COUNT(`water_level`)  
FROM
    "h2o_feet"
WHERE
    timestamp >= '2015-08-18T00:00:00Z'
    AND timestamp <= '2015-08-18T00:30:00Z'
GROUP BY
    groupby_time,
    location;
    

Plan to write an optimize rule to do this.

Question: does sql parser support time(12m) in group by clause?

OK, I add them to the diffrence list.

@ShiKaiWi ShiKaiWi pinned this issue Mar 9, 2023
@ShiKaiWi ShiKaiWi changed the title Support influxql raw query Tracking Issue: Support influxql raw query Mar 9, 2023
@aierui
Copy link

aierui commented Apr 4, 2023

This thing is the same as what 6112 they are doing.

@jiacai2050
Copy link
Contributor

Thanks for your tips.

We have forked influxql related logical(parser and planner) into an independent crate already.

@chunshao90 chunshao90 unpinned this issue May 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants