-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SIP-95] Proposal for Catalog Support in SQL Lab #22862
Comments
I would love to see this, but there are few complications. The biggest one is that we need support from the SQL Alchemy dialects, and I'm not sure how many support dynamic catalogs. For example, both Hive and Trino require the catalog to be specified when the connection is created:
For engines like this the user would have to choose a default catalog when the database is added, and each DB engine spec in Superset would have to know how to replace the catalog in the URL, in case a user chooses one different from the default after the DB has been created. It's definitely doable, and something we should do. We've been slowly adding support for catalogs: superset/superset/sql_parse.py Line 172 in 383313b
Are you planning to work on this, or are you just proposing it? If you're not going to work on it I can start taking a look. |
I was planning to work on this, but of course I’d love help as I’ve never made a contribution to Superset. I was planning on doing a smaller task first (adding a configurable header/footer) since I figured this one might be complex. |
Ah, great! From the wording of the SIP I wasn't sure, and people are welcome to propose changes even if they're not going to work on it. I'm happy to help in any way you need, let me know! |
hey @betodealmeida, good to see you are still pushing Superset forward! We worked around originally by concatenating catalog with schema, but patching got nasty, as we would pass it thought to Trino driver. |
I think a good approach for this would be modifying BaseEngineSpec:
@classmethod
def get_engine(
cls,
database: Database,
catalog: Optional[str] = None,
schema: Optional[str] = None,
source: Optional[QuerySource] = None,
) -> ContextManager[Engine]:
# get_sqla_engine_with_context needs to call database.engine_spec.apply_catalog_to_sqlalchemy_uri
return database.get_sqla_engine_with_context(
catalog=catalog,
schema=schema,
source=source
)
@classmethod
def apply_catalog_to_sqlalchemy_uri(cls, database: Database, catalog: str) -> URL:
return database.sqlalchemy_uri
@classmethod
def get_catalog_names(
cls,
database: Database,
inspect: Inspector,
) -> Set[str]:
return set() Individual DB engine specs like Trino/Hive/Snowflake/BigQuery would implement custom |
Actually, we would need the @classmethod
def apply_catalog_to_sqlalchemy_uri(
cls,
database: Database,
catalog: Optional[str],
schema: Optional[str],
) -> URL:
try:
connect_args = database.get_extra()["engine_params"]["connect_args"]
except KeyError:
connect_args = {}
return database.sqlalchemy_uri, connect_args For Postgres, this would look like this (untested): @classmethod
def apply_catalog_to_sqlalchemy_uri(
cls,
database: Database,
catalog: Optional[str],
schema: Optional[str],
) -> Tuple[URL, Dict[str, Any]]:
url, connect_args = super().apply_catalog_to_sqlalchemy_uri(database, catalog, schema)
if catalog:
# for Postgres a catalog is a database
url = url.set(database=catalog)
if schema:
connect_args["options"] = f"-csearch_path={schema}"
return url, connect_args See the discussion at #23356. |
Actually we already have a method to change the schema ( |
While working on an issue with schemas, I've done some foundational work for supporting catalogs:
Once we have that, I think we would still need the following steps:
|
I'm going to work on (1) and (2) this week, then we can put the SIP to vote and you (or someone else) can work on the UI part, @JELGT2011. |
that's awesome news @betodealmeida! I've been a little delayed trying to upgrade our existing instance of superset, but I could definitely do some frontend changes (3), (4?), and (5). I think I have a draft work in progress on it already so that works out. |
Numbering this as SIP-95. If you'd like to continue pursuing this, pleas submit it for a DISCUSS thread on the Superset @dev mailing list, so we can start progressing it toward a vote. Thanks! |
It seems like this work is incrementally moving forward, but the SIP itself has still not been brought up for a DISCUSS thread or a VOTE thread on the Apache mailing list. @JELGT2011 / @betodealmeida is there any need or intent to continue with the SIP process here? |
It would be huge to get this going for Databricks. I could probably contribute - especially since part of the approach was laid out for different engines |
I also support putting this SIP-95 up for a DISCUSS/VOTE thread! @motherduckdb has a similar scenario where a single DuckDB instance can connect to multiple databases (see https://duckdb.org/docs/sql/statements/attach.html). It would be great to be able to use the Catalog drop-down to scope the Schema options. |
@guenp please feel free to kick off the Discussion thread on the dev@superset.apache.org mailing list! Let me know if you need help with that, you can ping me here or on Superset Slack |
Hi @JELGT2011 / @guenp if anyone wants to move this SIP forward, please put it up for a [DISCUSS] thread on the mailing list. Otherwise, it's at risk of being closed/discarded. |
I'll bring it up for a discussion, I don't think we're far away from finishing this. |
@betodealmeida regarding your comment in the "[DISCUSS] [SIP-95] Proposal for Catalog Support in SQL Lab" email,
I totally agree as it likely is a necessary requirement if one wanted to explore a SQL Lab result set. |
@john-bodley right, I just called that out because the original SIP mentioned only SQL Lab in the title, but I don't see how that would work. 🙃 |
@betodealmeida I think you just need to close the vote for this with a [RESULT] thread |
Done! |
It's official, we now have catalog support in Apache Superset. Currently Postgres and Databricks are supported, and I have a PR out for:
Adding support to new DB engine specs should be straightforward, requiring just 3 methods. I suggest using #28416 as a reference implementation. |
[SIP] Proposal for Catalog Support in SQL Lab
Motivation
The current SQL Lab explorer is not very intuitive as there is no way to explore catalogs, and can only explore the
default
catalog.Proposed Change
Add a new
Catalog
dropdown in the sqllab editor (similar to schema).New or Changed Public Interfaces
This should only affect the sqllab query tab. The
Catalog
dropdown will be similar enough to theSchema
explorer. The wordschema
is kind of overloaded throughout Superset, so it's not entirely clear to me yet whether a new endpoint is necessary.New dependencies
None
Migration Plan and Compatibility
None
Rejected Alternatives
None
The text was updated successfully, but these errors were encountered: