Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Trino get_columns #29566

Merged
merged 2 commits into from
Jul 12, 2024
Merged

fix: Trino get_columns #29566

merged 2 commits into from
Jul 12, 2024

Conversation

betodealmeida
Copy link
Member

@betodealmeida betodealmeida commented Jul 11, 2024

SUMMARY

This PR is a workaround to a weird bug we've encountered in Trino when adding a physical dataset.

When adding the dataset, Superset calls get_columns on the inspector. The Trino driver then does a check to see if the table exists, using has_table:

https://github.com/trinodb/trino-python-client/blob/505989a59e93f6319e8b6be89a16f81034a06404/trino/sqlalchemy/dialect.py#L176

The has_table method runs this query:

            SELECT "table_name"
            FROM "information_schema"."tables"
            WHERE "table_schema" = :schema
              AND "table_name" = :table

And here's the weird part: even though the table DOES exist, the query can return nothing. Even weirder: running a SHOW COLUMNS FROM :schema.:table in that case returns data.

The Trino DB engine spec already has a custom get_indexes method with the warning:

Trino dialect raises NoSuchTableError in get_indexes if table is empty.

@classmethod
def get_indexes(
cls,
database: Database,
inspector: Inspector,
table: Table,
) -> list[dict[str, Any]]:
"""
Get the indexes associated with the specified schema/table.
Trino dialect raises NoSuchTableError in get_indexes if table is empty.
:param database: The database to inspect
:param inspector: The SQLAlchemy inspector
:param table: The table instance to inspect
:returns: The indexes
"""
try:
return super().get_indexes(database, inspector, table)
except NoSuchTableError:
return []

So I suspect it's the same problem.

To fix this, I modified the Trino DB engine spec to first try the inspector.get_columns method, which queries the information_schema.table. If this fails with NoSuchTableError then it falls back to the SHOW COLUMNS FROM ... approach.

The Presto DB engine spec already implements a get_columns that uses SHOW COLUMNS FROM ..., so I moved it to PrestoBaseEngineSpec, so it can be used both by PrestoEngineSpec and TrinoEngineSpec.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@dosubot dosubot bot added data:connect:presto Related to Presto data:connect:trino Related to Trino labels Jul 11, 2024
@@ -840,211 +1043,6 @@ def get_view_names(
results = cursor.fetchall()
return {row[0] for row in results}

@classmethod
def _create_column_info(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods were moved to the base class, PrestoBaseEngineSpec.

@@ -2224,6 +2217,23 @@ def denormalize_name(cls, dialect: Dialect, name: str) -> str:

return name

@classmethod
def quote_table(cls, table: Table, dialect: Dialect) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@@ -433,7 +437,17 @@ def get_columns(
"schema_options", expand the schema definition out to show all
subfields of nested ROWs as their appropriate dotted paths.
"""
base_cols = super().get_columns(inspector, table, options)
# The Trino dialect raises `NoSuchTableError` on the inspection methods when the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the comments 👍🏼

@betodealmeida betodealmeida merged commit fa095a9 into master Jul 12, 2024
33 checks passed
eschutho pushed a commit that referenced this pull request Jul 24, 2024
@rusackas rusackas deleted the trino-wrappers branch September 27, 2024 20:54
@github-actions github-actions bot added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 4.1.0 labels Nov 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels data:connect:presto Related to Presto data:connect:trino Related to Trino preset-io size/XL 🚢 4.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants