Skip to content

Conversation

@Arunodoy18
Copy link

SUMMARY

Fixes failures caused by SQL queries that return unnamed result columns
(e.g. SELECT COUNT(*) FROM my_table), which previously caused errors in
SQL Lab, alerts, and dimension creation.

Superset assumed all result columns had explicit names. Some databases
(including MSSQL) return unnamed columns for valid SQL expressions.
This change normalizes unnamed result columns at the shared result-set
handling layer by assigning stable, deterministic names, while preserving
explicitly named columns unchanged.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Not applicable.

TESTING INSTRUCTIONS

Automated testing:

  • Added tests validating that:
    • Unnamed expression columns are normalized to non-empty, stable names
    • Explicit column aliases (e.g. AS total) are preserved unchanged
    • Downstream consumers relying on column metadata do not fail

Manual verification (optional):

  1. Run a query like SELECT COUNT(*) FROM my_table in SQL Lab
  2. Verify the result renders correctly without errors
  3. Use the query in an alert or dimension and confirm no failure occurs

ADDITIONAL INFORMATION

@bito-code-review
Copy link
Contributor

bito-code-review bot commented Jan 30, 2026

Code Review Agent Run #b2c24a

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 26ddc06..26ddc06
    • superset/result_set.py
    • tests/unit_tests/result_set_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@dosubot dosubot bot added the data Namespace | Anything related to data, including databases configurations, datasets, etc. label Jan 30, 2026
if value is None:
return f"_col{index}"

name = convert_to_string(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Decoding column names can raise UnicodeDecodeError when convert_to_string tries to decode non-UTF-8 bytes; catch decoding errors and fallback to a safe decode (errors='replace') or str conversion so the normalization step never raises. [possible bug]

Severity Level: Major ⚠️
- ❌ SupersetResultSet fails on non-UTF8 column names.
- ⚠️ Query display can error for weird driver encodings.
- ⚠️ Downstream callers may see exceptions during result parsing.
Suggested change
name = convert_to_string(value)
try:
name = convert_to_string(value)
except UnicodeDecodeError:
# Fallback for bytes that are not valid UTF-8: replace invalid bytes
if isinstance(value, (bytes, bytearray)):
name = value.decode("utf-8", errors="replace")
else:
name = str(value)
Steps of Reproduction ✅
1. Open `superset/result_set.py` and find `convert_to_string` at
`superset/result_set.py:89-99` and `normalize_column_name` at
`superset/result_set.py:102-124`.

2. Create a cursor description entry containing non-UTF8 bytes, e.g. `cursor_description =
[(b"\xff", None, None, None, None, None, None)]`.

3. Instantiate `SupersetResultSet` in a test or REPL:

   `SupersetResultSet(data=[], cursor_description=cursor_description,
   db_engine_spec=BaseEngineSpec)` (`__init__` begins at `superset/result_set.py:128`).

4. During name normalization the code calls `convert_to_string` (invoked from
`normalize_column_name` at `superset/result_set.py:118`), which executes the bytes decode
at `superset/result_set.py:96-97` and raises `UnicodeDecodeError` for invalid UTF-8 bytes,
causing `SupersetResultSet.__init__` to error and abort result construction.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/result_set.py
**Line:** 118:118
**Comment:**
	*Possible Bug: Decoding column names can raise UnicodeDecodeError when `convert_to_string` tries to decode non-UTF-8 bytes; catch decoding errors and fallback to a safe decode (errors='replace') or str conversion so the normalization step never raises.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

Comment on lines 146 to 151
column_names = dedup(
[convert_to_string(col[0]) for col in cursor_description]
[
normalize_column_name(col[0], idx)
for idx, col in enumerate(cursor_description)
]
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Accessing col[0] without validating that col is a sequence (or non-None) can raise TypeError/IndexError if a cursor description entry is None or not indexable; iterate and safely extract the 0th element before calling normalize_column_name. [possible bug]

Severity Level: Major ⚠️
- ❌ SupersetResultSet construction fails on malformed cursor descriptions.
- ⚠️ Query rendering flows may error when cursor.description malformed.
- ⚠️ Alerts/dimension creation pipelines using result set fail.
Suggested change
column_names = dedup(
[convert_to_string(col[0]) for col in cursor_description]
[
normalize_column_name(col[0], idx)
for idx, col in enumerate(cursor_description)
]
)
column_names_values: list[str] = []
for idx, col in enumerate(cursor_description):
try:
name_value = col[0]
except Exception:
# description entry missing or not indexable; treat as unnamed
name_value = None
column_names_values.append(normalize_column_name(name_value, idx))
column_names = dedup(column_names_values)
Steps of Reproduction ✅
1. In a Python REPL or unit test import the class: open `superset/result_set.py` and
locate `SupersetResultSet.__init__` at `superset/result_set.py:128` (constructor start).

2. Construct a malformed cursor description and invoke the constructor directly:

   - Create `cursor_description = [None]` or `cursor_description = [123]` in the test.

   - Call `SupersetResultSet(data=[], cursor_description=cursor_description,
   db_engine_spec=BaseEngineSpec)` (the call site is the `__init__` at
   `superset/result_set.py:128`).

3. When the list comprehension at `superset/result_set.py:146-150` executes, it tries to
evaluate `col[0]` (the call to `normalize_column_name(col[0], idx)` is at
`superset/result_set.py:148`), which raises:

   - `TypeError: 'NoneType' object is not subscriptable` for `None`, or

   - `TypeError: 'int' object is not subscriptable` for `123`.

4. Observe the exception bubbles out of `SupersetResultSet.__init__` and prevents
result-set construction; this reproduces the failure caused by unguarded `col[0]` access.

   - Note: This is reproducible without other parts of Superset by directly calling the
   constructor described above.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/result_set.py
**Line:** 146:151
**Comment:**
	*Possible Bug: Accessing `col[0]` without validating that `col` is a sequence (or non-None) can raise TypeError/IndexError if a cursor description entry is None or not indexable; iterate and safely extract the 0th element before calling `normalize_column_name`.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

@netlify
Copy link

netlify bot commented Jan 30, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 26ddc06
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/697c6ec92424b90008e001f8
😎 Deploy Preview https://deploy-preview-37570--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data Namespace | Anything related to data, including databases configurations, datasets, etc. size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Creation of a dimension using SQL Server data warehouse fails with 'no field of name'

1 participant