Skip to content

Conversation

@Arunodoy18
Copy link

SUMMARY

Fixes CSV import failures on MySQL databases that enforce primary keys
(e.g. when sql_require_primary_key = ON).

When importing a CSV, Superset generates a CREATE TABLE statement without
a primary key. Databases that require primary keys correctly reject this
operation. This change detects when a primary key is required and
automatically adds a safe auto-increment primary key during table creation,
while preserving existing behavior for databases that do not enforce this
constraint.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Not applicable.

TESTING INSTRUCTIONS

Automated testing:

  • Added integration tests covering:
    • Detection of MySQL primary key requirement
    • Automatic primary key addition during CSV import when required
    • Skipping primary key addition when not required
    • Error-handling behavior for detection failures

Manual testing (optional):

  1. Configure a MySQL database with sql_require_primary_key = ON
  2. Import a CSV file using Superset
  3. Verify that the table is created successfully with an auto-increment
    primary key
  4. Confirm existing CSV import behavior remains unchanged for databases
    without this requirement

ADDITIONAL INFORMATION

@dosubot dosubot bot added data:connect:mysql Related to MySQL data:csv Related to import/export of CSVs labels Jan 29, 2026
Copy link
Contributor

@bito-code-review bito-code-review bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #528e4b

Actionable Suggestions - 2
  • superset/db_engine_specs/mysql.py - 1
    • Unclosed parenthesis in conditional expression · Line 456-458
  • tests/integration_tests/databases/commands/upload_mysql_pk_test.py - 1
Review Details
  • Files reviewed - 2 · Commit Range: 16de1f8..16de1f8
    • superset/db_engine_specs/mysql.py
    • tests/integration_tests/databases/commands/upload_mysql_pk_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment on lines +456 to +458
needs_primary_key = (
if_exists == "fail" # Only for new table creation
and not to_sql_kwargs.get("index", False) # No index column exists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclosed parenthesis in conditional expression

Line 456 has an unclosed opening parenthesis ( in the needs_primary_key assignment. The condition started on line 456 is missing a closing parenthesis before the assignment completes on line 458.

Code suggestion
Check the AI-generated fix before applying
Suggested change
needs_primary_key = (
if_exists == "fail" # Only for new table creation
and not to_sql_kwargs.get("index", False) # No index column exists
needs_primary_key = (
if_exists == "fail" # Only for new table creation
and not to_sql_kwargs.get("index", False) # No index column exists
)

Code Review Run #528e4b


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

if len(execute_calls) >= 2:
alter_call = execute_calls[1][0][0]
# The column name should be modified to avoid conflict
assert "__superset_upload_id__" in alter_call or "___superset_upload_id__" in alter_call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test assertion too permissive

The assertion allows either the original or prefixed column name, but the code always prefixes when there's a conflict, so the test should verify the exact adjusted name.

Code suggestion
Check the AI-generated fix before applying
 -        assert "___superset_upload_id__" in alter_call
 +        assert "___superset_upload_id__" in alter_call and "__superset_upload_id__" not in alter_call

Code Review Run #528e4b


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

@Arunodoy18
Copy link
Author

Thanks for the review!

This change only affects backend CSV import logic.
The failing checks appear to be frontend E2E tests, which may be flaky.
Happy to re-run or adjust if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data:connect:mysql Related to MySQL data:csv Related to import/export of CSVs size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unable to import CSV into hosted database that enforces primary key ID

1 participant