Closed
Description
There is a known issue with using dropbox/PyHive to connect to Kyuubi/Spark.
The acryldata/PyHive fork fixes the table name issue, but when I use it to connect to Kyuubi or the Spark Thrift Server, I see duplicated table names in the list.
How to reproduce the bug
Follow the docker README to launch apache/superset:1.5.0 locally.
Replace dropbox/PyHive with acryldata/PyHive:
docker exec -it -u root superset pip uninstall PyHive
docker exec -it -u root superset pip install acryl-PyHive
Start a Spark Thrift Server. (I got the same result when connecting to Kyuubi.)
Create a database connection in Superset.
Refresh the table list.
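To check whether the duplicates also appear outside the Superset UI, the table list can be fetched directly through SQLAlchemy. This is a minimal sketch, not what Superset itself runs: the helper names `find_duplicates` and `list_tables` are hypothetical, the URI in the usage comment is an assumed local setup, and it assumes the installed PyHive package registers the `hive://` SQLAlchemy dialect.

```python
from collections import Counter


def find_duplicates(names):
    """Return the names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def list_tables(uri, schema):
    """List table names via SQLAlchemy (requires a running Thrift server)."""
    # Imported lazily so find_duplicates can be used without SQLAlchemy.
    from sqlalchemy import create_engine, inspect

    engine = create_engine(uri)
    return inspect(engine).get_table_names(schema=schema)


# Usage (assumed host, port, and schema; adjust to your environment):
#   tables = list_tables("hive://localhost:10000/tpcds_s1", "tpcds_s1")
#   print(find_duplicates(tables))
```

If `find_duplicates` returns an empty list here while Superset still shows duplicates, the duplication is more likely introduced on the Superset side than by the driver.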
Expected results
No duplicated table names in the list.
Actual results
Environment
- browser type and version: Latest Edge
- superset version (superset version): official docker image, apache/superset:1.5.0
- python version (python --version):
- node.js version (node -v):
- any feature flags active:
Checklist
Make sure to follow these steps before submitting your issue - thank you!
- I have checked the superset logs for python stacktraces and included it here as text if there are any.
- I have reproduced the issue with at least the latest released version of superset.
- I have checked the issue tracker for the same issue and I haven't found one similar.
Additional context
When I click the refresh table list button, the Spark Thrift Server log shows that it receives two SHOW TABLES queries:
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Submitting query 'SHOW TABLES IN `tpcds_s1`' with 3e98efb4-780f-43db-afb1-1d204f1009cc
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Running query with 3e98efb4-780f-43db-afb1-1d204f1009cc
...
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Submitting query 'SHOW TABLES IN `tpcds_s1`' with cf3fdb93-cad5-43c8-977e-aa18f44068fa
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Running query with cf3fdb93-cad5-43c8-977e-aa18f44068fa
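The repeated query can be confirmed mechanically by counting the "Submitting query" lines in the server log. The sketch below embeds the two submission lines from the excerpt above; the regular expression and the function name `count_submitted_queries` are illustrative assumptions, not part of Spark or Superset.

```python
import re
from collections import Counter

# Matches lines such as: Submitting query '<sql>' with <statement id>
SUBMIT_RE = re.compile(r"Submitting query '(?P<sql>.*)' with (?P<id>[0-9a-f-]+)")


def count_submitted_queries(log_text):
    """Count how many times each SQL statement was submitted."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SUBMIT_RE.search(line)
        if match:
            counts[match.group("sql")] += 1
    return counts


LOG = """\
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Submitting query 'SHOW TABLES IN `tpcds_s1`' with 3e98efb4-780f-43db-afb1-1d204f1009cc
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Running query with 3e98efb4-780f-43db-afb1-1d204f1009cc
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Submitting query 'SHOW TABLES IN `tpcds_s1`' with cf3fdb93-cad5-43c8-977e-aa18f44068fa
22/05/24 15:00:10 INFO SparkExecuteStatementOperation: Running query with cf3fdb93-cad5-43c8-977e-aa18f44068fa
"""

for sql, n in count_submitted_queries(LOG).items():
    print(f"{n}x {sql}")
```

Only "Submitting query" lines are counted, so the matching "Running query" lines do not inflate the totals.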