fix: datatype tracking issue on virtual dataset #20088

codemaster08240328
Contributor

SUMMARY

Column types are not detected when creating a virtual dataset.

When we create a virtual dataset and then build a chart on it, some column types may not be detected as they were defined in the original dataset. This happened because those columns had no non-null values in the result of the SQL query used to create the dataset.
The previous version inferred each column's type from the query result values, which is why some columns ended up with a null type even though they had a definite type in the original dataset.

To fix this, we need to take the column type from the original dataset rather than from the values returned by the SQL query.
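
To illustrate the difference described above, here is a minimal sketch. It is not Superset's actual code: the helper names, the `points` column, and the small OID-to-name table are assumptions for illustration. It relies only on the DB-API convention that `cursor.description` is a sequence of 7-item tuples whose second item is the driver's type code.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

# Hypothetical subset of PostgreSQL type OIDs, for illustration only.
PG_TYPE_NAMES: Dict[int, str] = {
    16: "BOOL",
    23: "INT4",
    25: "TEXT",
    701: "FLOAT8",
    1043: "VARCHAR",
}

Row = Tuple[Any, ...]


def type_from_values(rows: Sequence[Row], index: int) -> Optional[str]:
    """Infer a column type from the first non-null value (the old approach)."""
    for row in rows:
        value = row[index]
        if value is not None:
            return type(value).__name__
    # Every value was NULL, so nothing can be inferred.
    return None


def type_from_description(description: Sequence[Row], index: int) -> Optional[str]:
    """Look the type up from the cursor metadata instead of the values."""
    type_code = description[index][1]
    return PG_TYPE_NAMES.get(type_code)


# Simulated DB-API cursor.description: (name, type_code, ...five more fields).
description = [
    ("site_slug", 1043, None, None, None, None, None),
    ("points", 23, None, None, None, None, None),  # hypothetical column
]
# Simulated result set in which the second column is NULL in every row.
rows = [("connectmiles", None), ("rocketmiles", None)]

values_based = type_from_values(rows, 1)
metadata_based = type_from_description(description, 1)
```

With these inputs the value-based helper has nothing to infer from for the second column, while the metadata-based helper still resolves a type name, which is the behaviour this PR is after.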

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

BEFORE
image

AFTER:
Screen Shot 2022-05-16 at 16 01 45
Screen Shot 2022-05-16 at 16 03 10

TESTING INSTRUCTIONS

  1. Create a virtual dataset using this query:
SELECT *
FROM public."SQLLab_Test_For_VirtualDS"
WHERE site_slug in ('connectmiles', 'rocketmiles')
AND reward_program_slug in ('connectmiles')
LIMIT 10
  2. Create a chart based on the virtual dataset.
  3. Check that every column has a concrete data type rather than NULL.

Source data for the virtual dataset:
sqllab_query_publicconnectmiles_all_reservation_stats_20220412T210849.csv

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API


codecov bot commented May 16, 2022

Codecov Report

Merging #20088 (fa4f374) into master (c8fe518) will decrease coverage by 11.91%.
The diff coverage is 37.50%.

@@             Coverage Diff             @@
##           master   #20088       +/-   ##
===========================================
- Coverage   66.47%   54.55%   -11.92%     
===========================================
  Files        1727     1727               
  Lines       64724    64732        +8     
  Branches     6822     6822               
===========================================
- Hits        43024    35314     -7710     
- Misses      19969    27687     +7718     
  Partials     1731     1731               
Flag Coverage Δ
hive 53.69% <37.50%> (-0.01%) ⬇️
mysql ?
postgres ?
presto 53.55% <37.50%> (-0.01%) ⬇️
python 57.98% <37.50%> (-24.66%) ⬇️
sqlite ?
unit 49.45% <37.50%> (-0.01%) ⬇️

Impacted Files Coverage Δ
superset/db_engine_specs/postgres.py 59.32% <37.50%> (-37.96%) ⬇️
superset/utils/dashboard_import_export.py 0.00% <0.00%> (-100.00%) ⬇️
superset/key_value/commands/upsert.py 0.00% <0.00%> (-89.59%) ⬇️
superset/key_value/commands/update.py 0.00% <0.00%> (-89.37%) ⬇️
superset/key_value/commands/delete.py 0.00% <0.00%> (-85.30%) ⬇️
superset/key_value/commands/delete_expired.py 0.00% <0.00%> (-80.77%) ⬇️
superset/dashboards/commands/importers/v0.py 15.62% <0.00%> (-76.25%) ⬇️
superset/datasets/commands/update.py 25.88% <0.00%> (-68.24%) ⬇️
superset/datasets/commands/create.py 30.18% <0.00%> (-67.93%) ⬇️
superset/datasets/commands/importers/v0.py 24.03% <0.00%> (-67.45%) ⬇️
... and 272 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@betodealmeida betodealmeida left a comment

Looks good! :)

@betodealmeida
Member

Looks like there's a test failing.

@rusackas rusackas merged commit 74c5479 into apache:master Jun 1, 2022
@@ -21,6 +21,7 @@
from typing import Any, Dict, List, Optional, Pattern, Tuple, TYPE_CHECKING

from flask_babel import gettext as __
from psycopg2.extensions import binary_types, string_types
Member

I think this import should be moved into get_datatype, as psycopg2 isn't a required dependency

Member

Fixed by #20543
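
For reference, the review suggestion above (importing psycopg2 inside `get_datatype` rather than at module level) could look roughly like the sketch below. This is a standalone function written for illustration, not the code from either PR. Returning `None` when psycopg2 is missing is an assumption of this sketch; the real method may behave differently in that case.

```python
from typing import Any, Optional


def get_datatype(type_code: Any) -> Optional[str]:
    """Map a psycopg2 type code (an OID) to a type name, if it is known."""
    try:
        # Imported lazily so that merely loading this module does not
        # require psycopg2, which is an optional dependency.
        from psycopg2.extensions import binary_types, string_types
    except ImportError:
        # Assumption of this sketch: report "unknown" instead of raising.
        return None

    # Both objects are dicts keyed by OID; merge them into one lookup table.
    types = binary_types.copy()
    types.update(string_types)
    if type_code in types:
        return types[type_code].name
    return None
```

The trade-off is a slightly less conventional import location in exchange for keeping the module importable on installations that do not use PostgreSQL.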

philipher29 pushed a commit to ValtechMobility/superset that referenced this pull request Jun 9, 2022
* Fix datatype tracking issue on virtual dataset

* fix pytest issue on postgresql
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.0.0 labels Mar 13, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels Preset-Patch size/S 🚢 2.0.0
7 participants