fix(csv): respect CSV_EXPORT config for decimal separator and delimiter#38170
fix(csv): respect CSV_EXPORT config for decimal separator and delimiter#38170
Conversation
This commit fixes GitHub issue #32371 where the CSV_EXPORT configuration (specifically the `decimal` and `sep` parameters) was not being applied correctly in certain CSV export scenarios. The issue had two parts: 1. Custom decimal separator (e.g., `decimal: ","`) was ignored in exports 2. Pivoted CSV exports triggered errors when using custom CSV_EXPORT settings Changes: - superset/charts/client_processing.py: Pass CSV_EXPORT config to to_csv() when exporting processed (pivoted) CSV data in apply_client_processing() - superset/commands/streaming_export/base.py: Add support for CSV_EXPORT config in streaming exports by: - Using the `sep` parameter as the CSV delimiter - Formatting float values with the custom `decimal` separator The fix ensures that CSV exports consistently respect the CSV_EXPORT configuration across all export paths (regular, pivoted, and streaming). Fixes #32371 Co-Authored-By: Claude <noreply@anthropic.com>
Code Review Agent Run #92bbe4Actionable Suggestions - 0Additional Suggestions - 2
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Sequence DiagramThis PR ensures CSV_EXPORT settings (sep and decimal) are applied for both client-side (pivot/processed) CSV generation and server-side streaming exports. The diagram shows the two main success paths where the app config is read and applied to produce correctly formatted CSV output. sequenceDiagram
participant User
participant Frontend
participant Backend
participant AppConfig
participant Database
User->>Frontend: Request CSV export (regular or pivot)
Frontend->>Backend: Export request
Backend->>AppConfig: Read CSV_EXPORT (sep, decimal)
alt Client/processed CSV (apply_client_processing)
Backend->>Backend: processed_df.to_csv(..., **CSV_EXPORT)
Backend-->>Frontend: CSV data (respecting sep & decimal)
else Streaming SQL export
Backend->>AppConfig: Read CSV_EXPORT (sep, decimal)
Backend->>Database: Execute streaming query
Database-->>Backend: Rows stream
Backend->>Backend: csv.writer(delimiter=sep) & format floats (decimal)
Backend-->>Frontend: Streamed CSV chunks (respecting sep & decimal)
Generated by CodeAnt AI |
| formatted = [] | ||
| for value in row: | ||
| if isinstance(value, float): | ||
| # Format float with custom decimal separator |
There was a problem hiding this comment.
Suggestion: The decimal separator configuration is only applied to values of type float, so if the database driver returns numeric columns as other numeric types like decimal.Decimal (which is common for NUMERIC/DECIMAL columns), those values will bypass _format_row_values and still use a dot as decimal separator, causing streaming CSV exports to ignore the configured decimal separator for many numeric columns; broaden the type check to cover generic numeric types so all numeric values respect the decimal setting. [logic error]
Severity Level: Major ⚠️
- ❌ Streaming CSV from SQL Lab misformats DECIMAL numeric columns.
- ❌ Chart data streaming export misformats DECIMAL metrics with comma.
- ⚠️ Locale-specific tools misparse numbers due to dot decimal.| formatted = [] | |
| for value in row: | |
| if isinstance(value, float): | |
| # Format float with custom decimal separator | |
| from numbers import Real | |
| formatted: list[Any] = [] | |
| for value in row: | |
| if isinstance(value, Real): | |
| # Format numeric values with custom decimal separator |
Steps of Reproduction ✅
1. Configure Superset to use a custom decimal separator by setting `CSV_EXPORT =
{"encoding": "utf-8", "sep": ";", "decimal": ","}` in `superset_config.py` (this
configuration is read in `superset/commands/streaming_export/base.py:195-197` when
`_execute_query_and_stream()` obtains `decimal_separator`).
2. In a supported database where SQLAlchemy maps NUMERIC/DECIMAL columns to
`decimal.Decimal` (standard behavior for many engines), create or use a table with a
DECIMAL/NUMERIC column and insert a value like `1234.56`; query it either via a chart or
SQL Lab so that Superset executes a SELECT returning that DECIMAL column.
3. Trigger a streaming CSV export for that query: for charts, the export path uses
`StreamingCSVExportCommand` from
`superset/commands/chart/data/streaming_export_command.py:29` which subclasses
`BaseStreamingCSVExportCommand`; for SQL Lab, the API in `superset/sqllab/api.py:405`
constructs `StreamingSqlResultExportCommand(client_id, chunk_size)` (per Grep output)
which also subclasses `BaseStreamingCSVExportCommand`.
4. During the export, `BaseStreamingCSVExportCommand._execute_query_and_stream()` in
`superset/commands/streaming_export/base.py:187-243` streams rows; each batch is processed
by `_process_rows()` at lines 135-185, which calls `_format_row_values(row,
decimal_separator)` at lines 164-165 with `decimal_separator=","`. Because the current
implementation at lines 110-133 only checks `isinstance(value, float)`, any
`decimal.Decimal` values in the row are treated as non-floats and appended unchanged, so
`csv.writer` writes strings like `"1234.56"` with a dot instead of `"1234,56"`, ignoring
the configured decimal separator for these numeric columns.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/commands/streaming_export/base.py
**Line:** 126:129
**Comment:**
*Logic Error: The decimal separator configuration is only applied to values of type `float`, so if the database driver returns numeric columns as other numeric types like `decimal.Decimal` (which is common for NUMERIC/DECIMAL columns), those values will bypass `_format_row_values` and still use a dot as decimal separator, causing streaming CSV exports to ignore the configured decimal separator for many numeric columns; broaden the type check to cover generic numeric types so all numeric values respect the `decimal` setting.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.| assert "name;value" in lines[0] | ||
| # With decimal=",", decimal values should use comma as separator | ||
| assert "Alice;1,5" in lines[1] or "Alice;1.5" in lines[1] | ||
| assert "Bob;2,75" in lines[2] or "Bob;2.75" in lines[2] |
There was a problem hiding this comment.
Suggestion: The regression test for custom CSV separator and decimal allows both comma and dot decimals, so it will still pass even if the CSV_EXPORT decimal setting is ignored, failing to detect regressions in numeric formatting. [logic error]
Severity Level: Major ⚠️
- ⚠️ Unit test for CSV decimal formatting may silently pass.
- ⚠️ Table CSV exports with custom decimal can regress unnoticed.| assert "name;value" in lines[0] | |
| # With decimal=",", decimal values should use comma as separator | |
| assert "Alice;1,5" in lines[1] or "Alice;1.5" in lines[1] | |
| assert "Bob;2,75" in lines[2] or "Bob;2.75" in lines[2] | |
| assert lines[0] == "name;value" | |
| # With decimal=",", decimal values should use comma as separator | |
| assert "Alice;1,5" in lines[1] | |
| assert "Bob;2,75" in lines[2] |
Steps of Reproduction ✅
1. Open `test_apply_client_processing_csv_format_custom_separator()` in
`tests/unit_tests/charts/test_client_processing.py:2793-2827` (added in this PR) which
uses `@with_config({"CSV_EXPORT": {"sep": ";", "decimal": ","}})` and calls
`apply_client_processing()`.
2. Note that `apply_client_processing()` in `superset/charts/client_processing.py:312-398`
formats CSV exports via `processed_df.to_csv(...)` using `csv_export_config =
current_app.config.get("CSV_EXPORT", {})` at lines 391-395.
3. Introduce a regression by editing `processed_df.to_csv(buf, index=show_default_index,
**csv_export_config)` at `superset/charts/client_processing.py:393-394` so that it no
longer passes the `decimal` option (e.g., only passes `sep`), causing numeric values in
the exported table CSV to retain a `.` decimal separator.
4. Run `pytest
tests/unit_tests/charts/test_client_processing.py::test_apply_client_processing_csv_format_custom_separator`;
observe the test still passes because the assertions at
`tests/unit_tests/charts/test_client_processing.py:2832-2836` accept both `"Alice;1,5"`
and `"Alice;1.5"` (and likewise for Bob), so the regression in honoring
`CSV_EXPORT["decimal"]` is not detected.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** tests/unit_tests/charts/test_client_processing.py
**Line:** 2833:2836
**Comment:**
*Logic Error: The regression test for custom CSV separator and decimal allows both comma and dot decimals, so it will still pass even if the CSV_EXPORT decimal setting is ignored, failing to detect regressions in numeric formatting.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.|
|
||
| # The output should use semicolon as separator and comma as decimal | ||
| # After pivoting, the format should reflect CSV_EXPORT settings | ||
| assert ";" in output_data or "," in output_data |
There was a problem hiding this comment.
Suggestion: The pivot-table CSV regression test only asserts that the output contains either a semicolon or comma, which will usually be true regardless of whether the CSV_EXPORT separator/decimal configuration is honored, so it does not actually verify the intended behavior. [logic error]
Severity Level: Major ⚠️
- ⚠️ Pivot-table CSV regression test barely checks separator behavior.
- ⚠️ Pivot CSV_EXPORT settings regressions could slip into production.| # The output should use semicolon as separator and comma as decimal | |
| # After pivoting, the format should reflect CSV_EXPORT settings | |
| assert ";" in output_data or "," in output_data | |
| lines = output_data.strip().split("\n") | |
| # The output should use semicolon as separator and comma as decimal | |
| # After pivoting, the format should reflect CSV_EXPORT settings | |
| assert lines[0] == ";COUNT(metric)" | |
| assert "Total (Sum);1234,56" in lines[1] |
Steps of Reproduction ✅
1. Open `test_apply_client_processing_csv_pivot_table_custom_separator()` in
`tests/unit_tests/charts/test_client_processing.py:2839-2889`, which sets `CSV_EXPORT =
{"sep": ";", "decimal": ","}` and calls `apply_client_processing()` with
`viz_type="pivot_table_v2"` and `result_format="csv"`.
2. Inspect `apply_client_processing()` in `superset/charts/client_processing.py:312-398`,
where pivoted CSV results are produced by `pivot_table_v2()` and written via
`processed_df.to_csv(buf, index=show_default_index, **csv_export_config)` at lines 391-395
using the `CSV_EXPORT` configuration.
3. Introduce a regression by modifying the `to_csv` call in `apply_client_processing()`
(lines 391-395) so that it ignores `CSV_EXPORT` (e.g., remove `**csv_export_config` or
hard-code default `sep=","`, `decimal="."`), causing the pivoted CSV to use default
separators and decimals.
4. Run `pytest
tests/unit_tests/charts/test_client_processing.py::test_apply_client_processing_csv_pivot_table_custom_separator`;
observe that the test still passes, because the only assertion at
`tests/unit_tests/charts/test_client_processing.py:2883-2889` merely checks that the
output string contains either `";"` or `","`, which is true for any CSV output regardless
of whether `CSV_EXPORT` was honored.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** tests/unit_tests/charts/test_client_processing.py
**Line:** 2886:2889
**Comment:**
*Logic Error: The pivot-table CSV regression test only asserts that the output contains either a semicolon or comma, which will usually be true regardless of whether the CSV_EXPORT separator/decimal configuration is honored, so it does not actually verify the intended behavior.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
SUMMARY
This PR fixes GitHub issue #32371 where the
CSV_EXPORTconfiguration was not being applied correctly in certain CSV export scenarios.The issue had two parts:
decimalseparator (e.g.,decimal: ",") was ignored in exportsCSV_EXPORTsettingsRoot cause:
superset/charts/client_processing.py, theapply_client_processing()function callsto_csv()without passing theCSV_EXPORTconfig when exporting pivoted/processed datasuperset/commands/streaming_export/base.py, the streaming CSV export uses Python'scsv.writerdirectly without applying thesepordecimalconfig valuesChanges:
superset/charts/client_processing.py: PassCSV_EXPORTconfig toto_csv()when exporting processed (pivoted) CSV datasuperset/commands/streaming_export/base.py:sepparameter as the CSV delimiter forcsv.writer_format_row_values()method to format float values with the customdecimalseparatorBEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before:
With
CSV_EXPORT = {"encoding": "utf-8", "sep": ";", "decimal": ","}:;), but decimal is ignored (still.)After:
CSV_EXPORTconfigurationTESTING INSTRUCTIONS
superset_config.py:;)1,5instead of1.5)ADDITIONAL INFORMATION
Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com