feat(db_engine_specs): add Apache Phoenix and Apache IoTDB engine specs #37590
base: master
Add engine specs for two ASF database projects:
- Apache Phoenix: SQL layer over Apache HBase, using phoenixdb driver (phoenix:// dialect, default port 8765)
- Apache IoTDB: Time series database for IoT data, using apache-iotdb driver (iotdb:// dialect, default port 6667)
Both include metadata for docs generation, logos, and connection info.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
superset/db_engine_specs/phoenix.py
Outdated
        "CAST(DATE_TRUNC('SECOND', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
    ),
    TimeGrain.MINUTE: (
        "CAST(DATE_TRUNC('MINUTE', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
    ),
    TimeGrain.HOUR: (
        "CAST(DATE_TRUNC('HOUR', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
    ),
    TimeGrain.DAY: "CAST(DATE_TRUNC('DAY', CAST({col} AS TIMESTAMP)) AS DATE)",
    TimeGrain.WEEK: "CAST(DATE_TRUNC('WEEK', CAST({col} AS TIMESTAMP)) AS DATE)",
    TimeGrain.MONTH: (
        "CAST(DATE_TRUNC('MONTH', CAST({col} AS TIMESTAMP)) AS DATE)"
    ),
    TimeGrain.QUARTER: (
        "CAST(DATE_TRUNC('QUARTER', CAST({col} AS TIMESTAMP)) AS DATE)"
    ),
    TimeGrain.YEAR: "CAST(DATE_TRUNC('YEAR', CAST({col} AS TIMESTAMP)) AS DATE)",
Suggestion: The time grain expressions are using the PostgreSQL-specific function DATE_TRUNC, which is not supported by Apache Phoenix, so any query that applies a time grain will generate invalid SQL and fail at runtime; use Phoenix's TRUNC function instead. [logic error]
Severity Level: Major ⚠️
- ❌ Time-grain grouping fails for Phoenix-backed charts.
- ❌ SQL Lab queries with time grains error on Phoenix.
- ⚠️ Affects users connecting to Phoenix databases.
-        "CAST(DATE_TRUNC('SECOND', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
-    ),
-    TimeGrain.MINUTE: (
-        "CAST(DATE_TRUNC('MINUTE', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
-    ),
-    TimeGrain.HOUR: (
-        "CAST(DATE_TRUNC('HOUR', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
-    ),
-    TimeGrain.DAY: "CAST(DATE_TRUNC('DAY', CAST({col} AS TIMESTAMP)) AS DATE)",
-    TimeGrain.WEEK: "CAST(DATE_TRUNC('WEEK', CAST({col} AS TIMESTAMP)) AS DATE)",
-    TimeGrain.MONTH: (
-        "CAST(DATE_TRUNC('MONTH', CAST({col} AS TIMESTAMP)) AS DATE)"
-    ),
-    TimeGrain.QUARTER: (
-        "CAST(DATE_TRUNC('QUARTER', CAST({col} AS TIMESTAMP)) AS DATE)"
-    ),
-    TimeGrain.YEAR: "CAST(DATE_TRUNC('YEAR', CAST({col} AS TIMESTAMP)) AS DATE)",
+        "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'SECOND') AS TIMESTAMP)"
+    ),
+    TimeGrain.MINUTE: (
+        "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'MINUTE') AS TIMESTAMP)"
+    ),
+    TimeGrain.HOUR: (
+        "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'HOUR') AS TIMESTAMP)"
+    ),
+    TimeGrain.DAY: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'DAY') AS DATE)",
+    TimeGrain.WEEK: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'WEEK') AS DATE)",
+    TimeGrain.MONTH: (
+        "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'MONTH') AS DATE)"
+    ),
+    TimeGrain.QUARTER: (
+        "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'QUARTER') AS DATE)"
+    ),
+    TimeGrain.YEAR: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'YEAR') AS DATE)",
Steps of Reproduction ✅
1. Open a Python REPL in the repo and import the Phoenix spec:
- Run: `python -c "from superset.db_engine_specs.phoenix import PhoenixEngineSpec; print(PhoenixEngineSpec.engine_name)"`
- This imports the class defined at `superset/db_engine_specs/phoenix.py:26` and loads
its attributes.
2. Inspect the time-grain templates defined on the class at
`superset/db_engine_specs/phoenix.py:54-74`:
- In the REPL evaluate:
```
from superset.db_engine_specs.phoenix import PhoenixEngineSpec
from superset.constants import TimeGrain
print(PhoenixEngineSpec._time_grain_expressions[TimeGrain.MINUTE])
```
- This returns the string containing "DATE_TRUNC(...)" located in the lines shown at
`phoenix.py:56,59,63,65-73`.
3. Use the produced template to build a query that groups by time grain (the same
templates are consumed by the query generator that uses engine specs):
- Substitute a timestamp column name, e.g.
PhoenixEngineSpec._time_grain_expressions[TimeGrain.HOUR].format(col='event_ts')
- Embed the resulting expression into a SELECT/GROUP BY and execute it against a
running Apache Phoenix instance (default port 8765 from `phoenix.py:46`) via SQL Lab or
a DB client.
4. Observe the Phoenix server returns a SQL error (function or syntax error) because the
generated SQL uses DATE_TRUNC (Postgres-style) instead of Phoenix-supported syntax:
- Error arises at query execution time in SQL Lab or when the driver (phoenixdb)
forwards the SQL to Phoenix.
Notes:
- The reproduction is concrete: the DATE_TRUNC strings exist in `phoenix.py:54-74`. Any
code path that renders time-grain expressions from PhoenixEngineSpec (the Superset query
generator consuming engine specs) will produce these invalid fragments.
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/db_engine_specs/phoenix.py
**Line:** 57:73
**Comment:**
Logic Error: The time grain expressions use the PostgreSQL-specific function `DATE_TRUNC`, which is not supported by Apache Phoenix, so any query that applies a time grain will generate invalid SQL and fail at runtime; use Phoenix's `TRUNC` function instead.
Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.
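Reproduction step 3 above can be sketched without a Superset install or a running Phoenix server. This is a minimal illustration, not the PR's code: `TIME_GRAIN_TEMPLATES`, `render_grouped_query`, and `uses_date_trunc` are hypothetical names, the table `events` is made up, and the two templates are copied from the suggested fix. It only produces the SQL text; whether Phoenix accepts every grain keyword (for example `'QUARTER'`) is not verified here.

```python
# Hypothetical stand-in for PhoenixEngineSpec._time_grain_expressions.
# Only two grains are shown, copied from the suggested TRUNC-based fix.
TIME_GRAIN_TEMPLATES = {
    "HOUR": "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'HOUR') AS TIMESTAMP)",
    "DAY": "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'DAY') AS DATE)",
}


def render_grouped_query(table: str, col: str, grain: str) -> str:
    """Build a COUNT(*) query grouped by the truncated timestamp column."""
    bucket = TIME_GRAIN_TEMPLATES[grain].format(col=col)
    return f"SELECT {bucket} AS bucket, COUNT(*) AS n FROM {table} GROUP BY {bucket}"


def uses_date_trunc(templates: dict[str, str]) -> list[str]:
    """Return the grains whose template still calls DATE_TRUNC."""
    return [g for g, tpl in templates.items() if "DATE_TRUNC(" in tpl.upper()]


sql = render_grouped_query("events", "event_ts", "HOUR")
print(sql)
print(uses_date_trunc(TIME_GRAIN_TEMPLATES))
```

A check like `uses_date_trunc` could serve as a cheap guard in a unit test, since it needs no database connection; it does not replace running the generated SQL against Phoenix.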
Code Review Agent Run #73cd5b
Actionable Suggestions - 1
- superset/db_engine_specs/iotdb.py - 1
  - Incorrect Connection String Format · Line 40-40
Additional Suggestions - 1
- superset/db_engine_specs/phoenix.py - 1
  - Missing future annotations import · Line 17-17
    The file lacks the 'from __future__ import annotations' import, which is present in most other engine spec files and required for proper type hinting in new code per project guidelines. Adding this ensures consistency and forward compatibility.
    Code suggestion
    @@ -16,1 +16,2 @@
    -# under the License.
    +# under the License.
    +from __future__ import annotations
Review Details
- Files reviewed - 2 · Commit Range: 1b2b5ee..fdd2ef6
  - superset/db_engine_specs/iotdb.py
  - superset/db_engine_specs/phoenix.py
- Files skipped - 0
- Tools
  - Whispers (Secret Scanner) - ✔︎ Successful
  - Detect-secrets (Secret Scanner) - ✔︎ Successful
  - MyPy (Static Code Analysis) - ✔︎ Successful
  - Astral Ruff (Static Code Analysis) - ✔︎ Successful
superset/db_engine_specs/iotdb.py
Outdated
            DatabaseCategory.OPEN_SOURCE,
        ],
        "pypi_packages": ["apache-iotdb"],
        "connection_string": ("iotdb://{username}:{password}@{hostname}:{port}/"),
The connection_string includes an incorrect trailing slash, which does not align with the documented format for apache-iotdb (e.g., iotdb://user:pass@host:port). This could prevent successful connections.
Code suggestion
Check the AI-generated fix before applying
-        "connection_string": ("iotdb://{username}:{password}@{hostname}:{port}/"),
+        "connection_string": ("iotdb://{username}:{password}@{hostname}:{port}"),
Code Review Run #73cd5b
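To see what the trailing slash changes in the rendered URI, here is a small sketch using only the standard library. The credentials and host are made-up placeholders, and `urlsplit` is used purely to inspect the string; it is not how Superset or the apache-iotdb driver parses the URI, and whether the slash actually breaks a connection is not verified here.

```python
from urllib.parse import urlsplit

# Template from the suggested fix (no trailing slash) and the original one.
FIXED = "iotdb://{username}:{password}@{hostname}:{port}"
ORIGINAL = FIXED + "/"

# Placeholder values, chosen only for illustration.
values = {"username": "root", "password": "secret", "hostname": "localhost", "port": 6667}

fixed_parts = urlsplit(FIXED.format(**values))
original_parts = urlsplit(ORIGINAL.format(**values))

# Host, port, and credentials are identical; only the path component differs.
print(fixed_parts.scheme, fixed_parts.hostname, fixed_parts.port, repr(fixed_parts.path))
print(original_parts.scheme, original_parts.hostname, original_parts.port, repr(original_parts.path))
```

The only difference between the two forms is an empty path versus `/`, so any failure would depend on how the dialect interprets that path segment.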
✅ Deploy Preview for superset-docs-preview ready!
- Use Phoenix's TRUNC() instead of PostgreSQL's DATE_TRUNC()
- Add `from __future__ import annotations` to both files
- Remove trailing slash from IoTDB connection string
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes pre-existing check-python-deps CI failure. The pinned version 2.9.6 no longer matches what uv resolves on CI runners.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Code Review Agent Run #be99b4
Actionable Suggestions - 0
SUMMARY
Add engine specs for two Apache Software Foundation database projects that Superset can connect to:
- Apache Phoenix: SQL layer over Apache HBase. Uses the `phoenixdb` driver with the `phoenix://` dialect on default port 8765.
- Apache IoTDB: time series database for IoT data. Uses the `apache-iotdb` driver with the `iotdb://` dialect on default port 6667.
Both specs include metadata for documentation generation, project logos, connection string templates, and appropriate categorization under `APACHE_PROJECTS`.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — metadata-only changes that appear in database documentation.
TESTING INSTRUCTIONS
- `python -c "from superset.db_engine_specs.phoenix import PhoenixEngineSpec; print(PhoenixEngineSpec.engine_name)"`
- `python -c "from superset.db_engine_specs.iotdb import IoTDBEngineSpec; print(IoTDBEngineSpec.engine_name)"`
- `cd docs && yarn lint:db-metadata`
- `docs/static/img/databases/apache-phoenix.png` and `docs/static/img/databases/apache-iotdb.svg`
ADDITIONAL INFORMATION
🤖 Generated with Claude Code