Skip to content

MySQL -> DuckDB export causes segfault  #90

Closed
@benjamingr

Description

What happens?

When running MySQL -> DuckDB import on a simple setup (testcontainers) a segfault occurs.

To Reproduce

When setting up test containers with duckdb to evaluate MySQL export as such (full fixture setup code):

def populate_mysql(mysql: MySqlContainer):
    connection = pymysql.connect(
        host=mysql.get_container_host_ip(),
        port=int(mysql.get_exposed_port(3306)),
        user=mysql.username,
        password=mysql.password,
        database=f"test",
        autocommit=True
    )

    with connection.cursor() as cursor:
        for table_idx in range(TABLE_COUNT):
            # Create users table with columns "id", "email", "password_hash", "credit_card_number" with fake PII
            cursor.execute(
                f"CREATE TABLE users{table_idx} (id INT AUTO_INCREMENT PRIMARY KEY, email TEXT, password_hash TEXT, credit_card_number TEXT);")
            # Insert some fake data
            for i in range(100):
                cursor.execute(
                    f"INSERT INTO users{table_idx} (email, password_hash, credit_card_number) VALUES ('user{i}@example.com', 'hash{i}', '1111-2222-3333-4444');")
            # Create orders table with columns "id", "user_id", "order_date", "order_total", "address" with fake PII
            cursor.execute(
                f"CREATE TABLE orders{table_idx} (id INT AUTO_INCREMENT PRIMARY KEY, user_id INT, order_date TIMESTAMP, order_total DECIMAL(10,2), address TEXT);")
            # Insert some fake transactions
            for i in range(100):
                cursor.execute(
                    f"INSERT INTO orders{table_idx} (user_id, order_date, order_total, address) VALUES ({(i + 10) % 10}, '2022-01-01 00:00:00', 100.00, '1234 Main St, Springfield, IL 62701');")
    connection.close()

And then connecting to the testcontainer and performing "export from" on it (similar PostgreSQL code works):

INSTALL mysql;
LOAD mysql;

ATTACH 'database=test user=test password=test host=localhost port=60057' AS d (TYPE MYSQL, READ_ONLY);

--- Executing this line causes a segfault, crashes DBeaver (or Python when running Python code)
COPY FROM DATABASE d TO memory;

OS:

macOS 14.3 (23D56)

DuckDB Version:

latest ( "1.1.1" )

DuckDB Client:

Python (also with DBeaver)

Hardware:

M3 max

Full Name:

Benjamin Gruenbaum

Affiliation:

Eon.io

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions