
Conversation

@rusackas
Member

SUMMARY

Add engine specs for two Apache Software Foundation database projects that Superset can connect to:

  • Apache Phoenix — a SQL layer over Apache HBase that provides low-latency queries against HBase data. Uses the phoenixdb driver with the phoenix:// dialect on default port 8765.
  • Apache IoTDB — a time-series database designed for IoT workloads, with efficient storage and query capabilities. Uses the apache-iotdb driver with the iotdb:// dialect on default port 6667.

Both specs include metadata for documentation generation, project logos, connection string templates, and appropriate categorization under APACHE_PROJECTS.
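
As a reader's aid, here is a minimal sketch of the spec shape described above. It assumes the usual BaseEngineSpec attribute names (engine, engine_name, default_driver, sqlalchemy_uri_placeholder) and mirrors the metadata keys visible in the review hunks below (pypi_packages, connection_string); the merged files may differ in detail.

```python
# Illustrative sketch only -- not the merged code. Attribute names follow the
# BaseEngineSpec convention; values come from the PR summary above.
from superset.db_engine_specs.base import BaseEngineSpec


class PhoenixEngineSpec(BaseEngineSpec):
    engine = "phoenix"                    # SQLAlchemy dialect name
    engine_name = "Apache Phoenix"        # display name used in docs/UI
    default_driver = "phoenixdb"          # DBAPI driver package
    sqlalchemy_uri_placeholder = "phoenix://host:8765/"  # assumed placeholder form
```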

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — metadata-only changes that appear in database documentation.

TESTING INSTRUCTIONS

  1. Verify the new engine specs are discovered: python -c "from superset.db_engine_specs.phoenix import PhoenixEngineSpec; print(PhoenixEngineSpec.engine_name)" (a combined snippet covering steps 1 and 2 follows this list)
  2. Verify IoTDB: python -c "from superset.db_engine_specs.iotdb import IoTDBEngineSpec; print(IoTDBEngineSpec.engine_name)"
  3. Run the metadata lint: cd docs && yarn lint:db-metadata
  4. Check that logos exist at docs/static/img/databases/apache-phoenix.png and docs/static/img/databases/apache-iotdb.svg
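
Steps 1 and 2 can also be run as one short snippet; it assumes only the engine_name attribute already used in the commands above.

```python
# Combined form of testing steps 1-2: confirm both new specs import cleanly
# and expose the display names used in the generated docs.
from superset.db_engine_specs.iotdb import IoTDBEngineSpec
from superset.db_engine_specs.phoenix import PhoenixEngineSpec

for spec in (PhoenixEngineSpec, IoTDBEngineSpec):
    print(spec.engine_name)  # e.g. "Apache Phoenix" / "Apache IoTDB"
```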

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

🤖 Generated with Claude Code

Add engine specs for two ASF database projects:

- Apache Phoenix: SQL layer over Apache HBase, using phoenixdb driver
  (phoenix:// dialect, default port 8765)
- Apache IoTDB: Time series database for IoT data, using apache-iotdb
  driver (iotdb:// dialect, default port 6667)

Both include metadata for docs generation, logos, and connection info.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions github-actions bot added the doc (Namespace | Anything related to documentation) label Jan 31, 2026
@dosubot dosubot bot added the Apache Software Foundation (Related to Apache Software Foundation) and data:connect (Namespace | Anything related to db connections / integrations) labels Jan 31, 2026
Comment on lines 57 to 73
"CAST(DATE_TRUNC('SECOND', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
),
TimeGrain.MINUTE: (
"CAST(DATE_TRUNC('MINUTE', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
),
TimeGrain.HOUR: (
"CAST(DATE_TRUNC('HOUR', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
),
TimeGrain.DAY: "CAST(DATE_TRUNC('DAY', CAST({col} AS TIMESTAMP)) AS DATE)",
TimeGrain.WEEK: "CAST(DATE_TRUNC('WEEK', CAST({col} AS TIMESTAMP)) AS DATE)",
TimeGrain.MONTH: (
"CAST(DATE_TRUNC('MONTH', CAST({col} AS TIMESTAMP)) AS DATE)"
),
TimeGrain.QUARTER: (
"CAST(DATE_TRUNC('QUARTER', CAST({col} AS TIMESTAMP)) AS DATE)"
),
TimeGrain.YEAR: "CAST(DATE_TRUNC('YEAR', CAST({col} AS TIMESTAMP)) AS DATE)",

Suggestion: The time grain expressions are using the PostgreSQL-specific function DATE_TRUNC, which is not supported by Apache Phoenix, so any query that applies a time grain will generate invalid SQL and fail at runtime; use Phoenix's TRUNC function instead. [logic error]

Severity Level: Major ⚠️
- ❌ Time-grain grouping fails for Phoenix-backed charts.
- ❌ SQL Lab queries with time grains error on Phoenix.
- ⚠️ Affects users connecting to Phoenix databases.
Suggested change
"CAST(DATE_TRUNC('SECOND', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
),
TimeGrain.MINUTE: (
"CAST(DATE_TRUNC('MINUTE', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
),
TimeGrain.HOUR: (
"CAST(DATE_TRUNC('HOUR', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)"
),
TimeGrain.DAY: "CAST(DATE_TRUNC('DAY', CAST({col} AS TIMESTAMP)) AS DATE)",
TimeGrain.WEEK: "CAST(DATE_TRUNC('WEEK', CAST({col} AS TIMESTAMP)) AS DATE)",
TimeGrain.MONTH: (
"CAST(DATE_TRUNC('MONTH', CAST({col} AS TIMESTAMP)) AS DATE)"
),
TimeGrain.QUARTER: (
"CAST(DATE_TRUNC('QUARTER', CAST({col} AS TIMESTAMP)) AS DATE)"
),
TimeGrain.YEAR: "CAST(DATE_TRUNC('YEAR', CAST({col} AS TIMESTAMP)) AS DATE)",
"CAST(TRUNC(CAST({col} AS TIMESTAMP), 'SECOND') AS TIMESTAMP)"
),
TimeGrain.MINUTE: (
"CAST(TRUNC(CAST({col} AS TIMESTAMP), 'MINUTE') AS TIMESTAMP)"
),
TimeGrain.HOUR: (
"CAST(TRUNC(CAST({col} AS TIMESTAMP), 'HOUR') AS TIMESTAMP)"
),
TimeGrain.DAY: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'DAY') AS DATE)",
TimeGrain.WEEK: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'WEEK') AS DATE)",
TimeGrain.MONTH: (
"CAST(TRUNC(CAST({col} AS TIMESTAMP), 'MONTH') AS DATE)"
),
TimeGrain.QUARTER: (
"CAST(TRUNC(CAST({col} AS TIMESTAMP), 'QUARTER') AS DATE)"
),
TimeGrain.YEAR: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'YEAR') AS DATE)",
Steps of Reproduction ✅
1. Open a Python REPL in the repo and import the Phoenix spec:

   - Run: python -c "from superset.db_engine_specs.phoenix import PhoenixEngineSpec; print(PhoenixEngineSpec.engine_name)"

   - This imports the class defined at `superset/db_engine_specs/phoenix.py:26` and loads
   its attributes.

2. Inspect the time-grain templates defined on the class at
`superset/db_engine_specs/phoenix.py:54-74`:

   - In the REPL evaluate:

     ```
     from superset.db_engine_specs.phoenix import PhoenixEngineSpec
     from superset.constants import TimeGrain
     print(PhoenixEngineSpec._time_grain_expressions[TimeGrain.MINUTE])
     ```

   - This returns the string containing "DATE_TRUNC(...)" located in the lines shown at
   `phoenix.py:56,59,63,65-73`.

3. Use the produced template to build a query that groups by time grain (the same
templates are consumed by the query generator that uses engine specs):

   - Substitute a timestamp column name, e.g.
   PhoenixEngineSpec._time_grain_expressions[TimeGrain.HOUR].format(col='event_ts')

   - Embed the resulting expression into a SELECT/GROUP BY and execute it against a
   running Apache Phoenix instance (default port 8765 from `phoenix.py:46`) via SQL Lab or
   a DB client.

4. Observe that the Phoenix server returns a SQL error (unknown function or syntax error) because the
generated SQL uses the Postgres-style DATE_TRUNC instead of Phoenix-supported syntax:

   - Error arises at query execution time in SQL Lab or when the driver (phoenixdb)
   forwards the SQL to Phoenix.

Notes:

- The reproduction is concrete: the DATE_TRUNC strings exist in `phoenix.py:54-74`. Any
code path that renders time-grain expressions from PhoenixEngineSpec (the Superset query
generator consuming engine specs) will produce these invalid fragments.
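
If the suggested TRUNC fix is applied, a quick REPL check can confirm the templates no longer render PostgreSQL syntax. This is a sketch: it relies only on the _time_grain_expressions attribute already referenced above, and the column name "event_ts" is hypothetical.

```python
# Post-fix sanity check: no Phoenix time-grain template should still contain
# the PostgreSQL-only DATE_TRUNC function after switching to TRUNC.
from superset.constants import TimeGrain
from superset.db_engine_specs.phoenix import PhoenixEngineSpec

for grain, template in PhoenixEngineSpec._time_grain_expressions.items():
    assert not (template and "DATE_TRUNC" in template), f"{grain}: {template}"

# Render one template the way the query generator would; "event_ts" is a
# hypothetical column name used only for illustration.
print(PhoenixEngineSpec._time_grain_expressions[TimeGrain.HOUR].format(col="event_ts"))
# expected (post-fix): CAST(TRUNC(CAST(event_ts AS TIMESTAMP), 'HOUR') AS TIMESTAMP)
```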
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/db_engine_specs/phoenix.py
**Line:** 57:73
**Comment:**
	Logic Error: The time grain expressions are using the PostgreSQL-specific function `DATE_TRUNC`, which is not supported by Apache Phoenix, so any query that applies a time grain will generate invalid SQL and fail at runtime; use Phoenix's `TRUNC` function instead.

Validate the correctness of the flagged issue. If it is correct, how can I resolve it? If you propose a fix, implement it and keep it concise.

@bito-code-review bito-code-review bot left a comment


Code Review Agent Run #73cd5b

Actionable Suggestions - 1
  • superset/db_engine_specs/iotdb.py - 1
    • Incorrect Connection String Format · Line 40-40
Additional Suggestions - 1
  • superset/db_engine_specs/phoenix.py - 1
    • Missing future annotations import · Line 17-17
      The file lacks the 'from __future__ import annotations' import, which is present in most other engine spec files and required for proper type hinting in new code per project guidelines. Adding this ensures consistency and forward compatibility.
      Code suggestion
       @@ -16,1 +16,2 @@
      - # under the License.
      + # under the License.
      + from __future__ import annotations
Review Details
  • Files reviewed - 2 · Commit Range: 1b2b5ee..fdd2ef6
    • superset/db_engine_specs/iotdb.py
    • superset/db_engine_specs/phoenix.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful


DatabaseCategory.OPEN_SOURCE,
],
"pypi_packages": ["apache-iotdb"],
"connection_string": ("iotdb://{username}:{password}@{hostname}:{port}/"),

Incorrect Connection String Format

The connection_string includes an incorrect trailing slash, which does not align with the documented format for apache-iotdb (e.g., iotdb://user:pass@host:port). This could prevent successful connections.

Code suggestion
Check the AI-generated fix before applying
Suggested change
"connection_string": ("iotdb://{username}:{password}@{hostname}:{port}/"),
"connection_string": ("iotdb://{username}:{password}@{hostname}:{port}"),

Code Review Run #73cd5b



@netlify

netlify bot commented Jan 31, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit fdd2ef6
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/697e08d095ded70008e7d8e0
😎 Deploy Preview https://deploy-preview-37590--superset-docs-preview.netlify.app

rusackas and others added 2 commits January 31, 2026 15:06
- Use Phoenix's TRUNC() instead of PostgreSQL's DATE_TRUNC()
- Add `from __future__ import annotations` to both files
- Remove trailing slash from IoTDB connection string

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes pre-existing check-python-deps CI failure. The pinned version
2.9.6 no longer matches what uv resolves on CI runners.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rusackas
Member Author

Note on requirements/development.txt change: This PR bumps the psycopg2-binary pin from 2.9.6 → 2.9.9. This is a pre-existing mismatch on master that was latent because the check-python-deps CI workflow only runs when Python files are modified — and recent master commits were all non-Python changes (GitHub Action bumps, frontend deps), so the check was being skipped. Adding .py files in this PR triggered the actual check, exposing the stale pin.

@bito-code-review
Contributor

bito-code-review bot commented Jan 31, 2026

Code Review Agent Run #be99b4

Actionable Suggestions - 0
Review Details
  • Files reviewed - 3 · Commit Range: 1b2b5ee..b662a1f
    • requirements/development.txt
    • superset/db_engine_specs/iotdb.py
    • superset/db_engine_specs/phoenix.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful



Labels

  • Apache Software Foundation · Related to Apache Software Foundation
  • data:connect · Namespace | Anything related to db connections / integrations
  • doc · Namespace | Anything related to documentation
  • size/L
