fix(result_set): preserve JSON/JSONB data as objects instead of strings #38172

Open
rusackas wants to merge 5 commits into master from fix-25125-json-data-type-preservation

@rusackas
Member

SUMMARY

This PR fixes issue #25125 where JSON/JSONB data from databases (like PostgreSQL) was being converted to strings instead of being preserved as Python objects. This broke features like Handlebars templates that need to access JSON data as objects (e.g., {{this.json.key}}).

Root Cause:
When processing query results, PyArrow detects JSON/JSONB columns as "nested types" and the code was stringifying them for compatibility. However, this meant the JSON objects became strings by the time they reached the frontend.

Solution:
The fix works by:

  1. Tracking columns with nested/JSON data before stringification (for PyArrow compatibility)
  2. Restoring the original Python objects (dicts/lists) when converting to pandas DataFrame

This approach:

  • Maintains backward compatibility with the PyArrow table structure
  • Preserves JSON objects as Python dicts/lists in the final DataFrame
  • Works with heterogeneous JSON structures (different keys per row)
  • Handles null values correctly
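
The two steps above can be sketched without pandas or PyArrow. This is a minimal illustration under stated assumptions, not the code in superset/result_set.py: the names `is_nested`, `stringify_nested`, and `restore_nested` are hypothetical, and plain lists of dicts stand in for the Arrow table and the DataFrame.

```python
from typing import Any

Row = dict[str, Any]


def is_nested(value: Any) -> bool:
    """Return True for values that look like decoded JSON containers."""
    return isinstance(value, (dict, list))


def stringify_nested(rows: list[Row]) -> tuple[list[Row], dict[str, list[Any]]]:
    """Stringify nested columns, remembering the original values per column."""
    originals: dict[str, list[Any]] = {}
    for row in rows:
        for column, value in row.items():
            if is_nested(value) and column not in originals:
                # Keep the whole column so row positions stay aligned.
                originals[column] = [r.get(column) for r in rows]
    flat = [
        {
            column: (str(value) if column in originals and value is not None else value)
            for column, value in row.items()
        }
        for row in rows
    ]
    return flat, originals


def restore_nested(flat: list[Row], originals: dict[str, list[Any]]) -> list[Row]:
    """Put the remembered Python objects back into the stringified rows."""
    return [
        {
            column: (originals[column][index] if column in originals else value)
            for column, value in row.items()
        }
        for index, row in enumerate(flat)
    ]


# Heterogeneous keys per row and a null, as in the bullet list above.
rows = [
    {"id": 1, "data": {"key": "value1", "nested": {"a": 1}}},
    {"id": 2, "data": {"key": "value2", "items": [1, 2, 3]}},
    {"id": 3, "data": None},
]
flat, originals = stringify_nested(rows)
restored = restore_nested(flat, originals)
```

In this sketch the intermediate `flat` rows hold only strings and nulls in the `data` column, which mirrors the stringified table kept for compatibility, while `restored` carries the original dicts and lists.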

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Not applicable - this is a backend data serialization fix.

Before:

{"source_id": "1", "json": "{'key': 'value'}"}

After:

{"source_id": "1", "json": {"key": "value"}}

TESTING INSTRUCTIONS

  1. Create a PostgreSQL table with a JSONB column:

    CREATE TABLE test_json (
      id SERIAL PRIMARY KEY,
      data JSONB
    );
    INSERT INTO test_json (data) VALUES 
      ('{"key": "value1", "nested": {"a": 1}}'),
      ('{"key": "value2", "items": [1, 2, 3]}');
  2. Create a Handlebars chart using this dataset

  3. In the Handlebars template, try accessing nested JSON properties:

    {{#each data}}
      Key: {{this.data.key}}, Nested A: {{this.data.nested.a}}
    {{/each}}
  4. Verify that the JSON properties are accessible as objects, not strings

Unit Tests:
Run the new tests:

pytest tests/unit_tests/result_set_test.py::test_json_data_type_preserved_as_objects -v
pytest tests/unit_tests/result_set_test.py::test_json_data_with_homogeneous_structure -v
pytest tests/unit_tests/result_set_test.py::test_array_data_type_preserved -v

ADDITIONAL INFORMATION

@dosubot

dosubot bot commented Feb 22, 2026

Related Documentation

Checked 0 published document(s) in 2 knowledge base(s). No updates required.


@bito-code-review
Contributor

bito-code-review bot commented Feb 22, 2026

Code Review Agent Run #01973b

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: ffc31f0..ffc31f0
    • superset/result_set.py
    • tests/unit_tests/result_set_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset. You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@dosubot dosubot bot added the data label (Namespace | Anything related to data, including databases configurations, datasets, etc.) Feb 22, 2026
@bito-code-review
Contributor

bito-code-review bot commented Feb 23, 2026

Code Review Agent Run #99e9fd

Actionable Suggestions - 0
Review Details
  • Files reviewed - 4 · Commit Range: ffc31f0..6cce701
    • superset/dataframe.py
    • superset/result_set.py
    • tests/integration_tests/result_set_tests.py
    • tests/unit_tests/result_set_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@bito-code-review
Contributor

bito-code-review bot commented Feb 23, 2026

Code Review Agent Run #8708e7

Actionable Suggestions - 0
Review Details
  • Files reviewed - 4 · Commit Range: 6cce701..7eeff56
    • superset/dataframe.py
    • superset/result_set.py
    • tests/integration_tests/result_set_tests.py
    • tests/unit_tests/result_set_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Member

@msyavuz msyavuz left a comment

Would serializing-deserializing when writing/reading these columns be easier?
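
For comparison, the serialize/deserialize alternative raised in this comment could look roughly like the sketch below. It is a standard-library illustration only; `serialize_cell` and `deserialize_cell` are hypothetical names, and nothing in the conversation says the PR adopted this approach.

```python
import json
from typing import Any, Optional


def serialize_cell(value: Any) -> Optional[str]:
    """Encode a nested value as JSON text when writing; keep NULL as None."""
    return None if value is None else json.dumps(value)


def deserialize_cell(text: Optional[str]) -> Any:
    """Decode JSON text back into Python objects when reading."""
    return None if text is None else json.loads(text)


column = [{"key": "value1", "nested": {"a": 1}}, [1, 2, 3], None]
stored = [serialize_cell(value) for value in column]
loaded = [deserialize_cell(text) for text in stored]
```

The round trip is lossless only for JSON-representable values: tuples come back as lists and non-string dict keys come back as strings, which is one trade-off to weigh against tracking the original objects.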

rusackas and others added 5 commits February 23, 2026 16:27
This fix ensures that JSON and JSONB data from databases (like PostgreSQL)
is preserved as Python objects (dicts/lists) when converting result sets
to pandas DataFrames. Previously, nested data types were being stringified,
which broke features like Handlebars templates that need to access JSON
data as objects rather than strings.

The fix works by:
1. Tracking columns with nested/JSON data before stringification
2. Restoring the original Python objects when converting to pandas

Fixes #25125

Co-Authored-By: Claude <noreply@anthropic.com>

pd.isna() returns an element-wise boolean array for list values (e.g. JSON
arrays), and using that array in a boolean context can raise ValueError.
Use a helper function that catches this exception and returns False for
such values.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
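
The helper mentioned in this commit message is not shown in the conversation. The sketch below is a pandas-free variant of the same idea, assuming a hypothetical name `is_null_cell`: instead of calling `pd.isna()` and catching the exception, it checks for containers up front and treats them as non-null.

```python
import math
from typing import Any


def is_null_cell(value: Any) -> bool:
    """Null check that is safe for JSON containers (hypothetical helper)."""
    if isinstance(value, (dict, list)):
        # A container is a real value, even when it is empty.
        return False
    if value is None:
        return True
    # NaN is the usual float marker for a missing value.
    return isinstance(value, float) and math.isnan(value)


cells = [None, float("nan"), {"a": 1}, [], [1, 2], "text", 0]
flags = [is_null_cell(cell) for cell in cells]
```

Only the first two cells are reported as null; dicts and lists, including the empty list, pass through as values.
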
Update test expectations to expect JSON data as preserved objects
(dicts/lists) instead of stringified JSON, matching the new behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When heterogeneous data (e.g., [123456, "foo"]) causes PyArrow to throw
ArrowInvalid, the except branch stringifies the data before the second
loop can detect nested types via pa.types.is_nested(). This means
columns with nested data (lists/dicts) never get added to
_nested_columns and their JSON structure is lost.

Fix by checking the original data for nested types (lists/dicts) in the
except branch before stringifying, preserving them in _nested_columns
so they are restored as Python objects in to_pandas_df().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
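
The ordering issue described in this last commit can be shown with a stand-in converter. Everything here is an assumption made for illustration: `ConversionError` plays the role of ArrowInvalid, `to_typed_array` is a hypothetical converter that rejects mixed types, and the `nested_columns` set mirrors `_nested_columns` in name only.

```python
from typing import Any


class ConversionError(Exception):
    """Stand-in for the ArrowInvalid error named in the commit message."""


def to_typed_array(values: list[Any]) -> list[Any]:
    """Hypothetical converter that rejects columns mixing Python types."""
    kinds = {type(value) for value in values if value is not None}
    if len(kinds) > 1:
        raise ConversionError("column mixes several Python types")
    return values


def has_nested(values: list[Any]) -> bool:
    """Return True if any value is a list or dict."""
    return any(isinstance(value, (dict, list)) for value in values)


def convert_column(name: str, values: list[Any], nested_columns: set[str]) -> list[Any]:
    try:
        converted = to_typed_array(values)
    except ConversionError:
        # Inspect the ORIGINAL values first: once they are stringified,
        # a later nested-type check would only ever see strings.
        if has_nested(values):
            nested_columns.add(name)
        return [None if value is None else str(value) for value in values]
    if has_nested(converted):
        nested_columns.add(name)
    return converted


nested: set[str] = set()
mixed = convert_column("payload", [{"a": 1}, [1, 2], None], nested)
scalars = convert_column("misc", [123456, "foo"], nested)
```

Both columns take the fallback path and end up stringified, but only `payload` is recorded in `nested`, so only it would be restored to Python objects later.
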
@rusackas rusackas force-pushed the fix-25125-json-data-type-preservation branch from 7eeff56 to e867d53 February 24, 2026 00:31
@bito-code-review
Contributor

bito-code-review bot commented Feb 24, 2026

Code Review Agent Run #ab00e1

Actionable Suggestions - 0
Review Details
  • Files reviewed - 4 · Commit Range: 884681f..e867d53
    • superset/dataframe.py
    • superset/result_set.py
    • tests/integration_tests/result_set_tests.py
    • tests/unit_tests/result_set_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Labels

data (Namespace | Anything related to data, including databases configurations, datasets, etc.), size/L

Development

Successfully merging this pull request may close these issues.

DataType JSON auto convert to String

2 participants