Skip to content

Migrating from TEXT to JSON in SQLite results in data loss #697

Closed
@tiangolo

Description

@tiangolo

Describe the bug

This is probably a SQLite bug, not really Alembic, but as there are other fixes and workarounds here to handle these cases, I thought it could be fixed here too.

When using a batch operation to migrate from a TEXT type to a JSON type Alembic copies the data using CAST(column_name AS JSON) to select the data to be copied, but SQLite converts any CAST(x AS JSON) to a literal 0.

On the other side, copying the data directly, without CAST() works in SQLite, as it already stores the data as JSON.

Note: The need to migrate while the type is not really changed underneath is to be able to also support MySQL with the same migration.

Expected behavior

I would expect/want data not to be lost. So, a TEXT field in a record containing JSON data would be preserved in a column of type JSON instead of being converted to 0 (of course, that's not Alembic's fault, but SQLite's, but it's still probably fixable here).

It works "correctly" when performing the same operations without the CAST(x AS JSON).

To Reproduce
Please try to provide a Minimal, Complete, and Verifiable example, with the migration script and/or the SQLAlchemy tables or models involved.
See also Reporting Bugs on the website.

import sqlalchemy as sa
from alembic.migration import MigrationContext
from alembic.operations import Operations

engine = sa.create_engine("sqlite:///:memory:", echo=True)
conn = engine.connect()

context = MigrationContext.configure(conn)
op = Operations(context)

metadata = sa.MetaData()

users = sa.Table("users", metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.String),
    sa.Column('data', sa.Text)
    )

metadata.create_all(engine)

conn.execute(users.insert(), name="Jack", data='{"message": "Hello World"}')

with op.batch_alter_table("users") as batch_op:
    batch_op.alter_column("data", type_=sa.JSON())

s = sa.select([users])

result = conn.execute(s)

for u in result:
    print(f"{u.name}, {u.data}")

Log

The section that copies the data:

INSERT INTO _alembic_tmp_users (id, name, data) SELECT users.id, users.name, CAST(users.data AS JSON) AS anon_1 
FROM users

Printed out:

Jack, 0

Expected (ideal):

Jack, {"message": "Hello World"}

Versions.

  • OS: Ubuntu 18.04
  • Python: Python 3.7.5
  • Alembic: alembic 1.4.3 (master)
  • SQLAlchemy: 1.3.17
  • Database: SQLite
  • DBAPI: Default

Additional context

Complete logs:

2020-05-22 19:52:28,036 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2020-05-22 19:52:28,036 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,036 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2020-05-22 19:52:28,036 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,038 INFO sqlalchemy.engine.base.Engine PRAGMA main.table_info("users")
2020-05-22 19:52:28,038 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,038 INFO sqlalchemy.engine.base.Engine PRAGMA temp.table_info("users")
2020-05-22 19:52:28,038 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,039 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE users (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        data TEXT, 
        PRIMARY KEY (id)
)


2020-05-22 19:52:28,039 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,039 INFO sqlalchemy.engine.base.Engine COMMIT
2020-05-22 19:52:28,039 INFO sqlalchemy.engine.base.Engine INSERT INTO users (name, data) VALUES (?, ?)
2020-05-22 19:52:28,039 INFO sqlalchemy.engine.base.Engine ('Jack', '{"message": "Hello World"}')
2020-05-22 19:52:28,039 INFO sqlalchemy.engine.base.Engine COMMIT
2020-05-22 19:52:28,040 INFO sqlalchemy.engine.base.Engine PRAGMA main.table_info("users")
2020-05-22 19:52:28,040 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,041 INFO sqlalchemy.engine.base.Engine SELECT sql FROM  (SELECT * FROM sqlite_master UNION ALL   SELECT * FROM sqlite_temp_master) WHERE name = 'users' AND type = 'table'
2020-05-22 19:52:28,041 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,041 INFO sqlalchemy.engine.base.Engine PRAGMA main.foreign_key_list("users")
2020-05-22 19:52:28,041 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,041 INFO sqlalchemy.engine.base.Engine PRAGMA temp.foreign_key_list("users")
2020-05-22 19:52:28,041 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,042 INFO sqlalchemy.engine.base.Engine SELECT sql FROM  (SELECT * FROM sqlite_master UNION ALL   SELECT * FROM sqlite_temp_master) WHERE name = 'users' AND type = 'table'
2020-05-22 19:52:28,042 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,042 INFO sqlalchemy.engine.base.Engine PRAGMA main.index_list("users")
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine PRAGMA temp.index_list("users")
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine PRAGMA main.index_list("users")
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine PRAGMA temp.index_list("users")
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine SELECT sql FROM  (SELECT * FROM sqlite_master UNION ALL   SELECT * FROM sqlite_temp_master) WHERE name = 'users' AND type = 'table'
2020-05-22 19:52:28,043 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,045 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE _alembic_tmp_users (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        data JSON, 
        PRIMARY KEY (id)
)


2020-05-22 19:52:28,045 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,045 INFO sqlalchemy.engine.base.Engine COMMIT
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine INSERT INTO _alembic_tmp_users (id, name, data) SELECT users.id, users.name, CAST(users.data AS JSON) AS anon_1 
FROM users
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine COMMIT
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine 
DROP TABLE users
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine COMMIT
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine ALTER TABLE _alembic_tmp_users RENAME TO users
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine ()
2020-05-22 19:52:28,046 INFO sqlalchemy.engine.base.Engine COMMIT
2020-05-22 19:52:28,047 INFO sqlalchemy.engine.base.Engine SELECT users.id, users.name, users.data 
FROM users
2020-05-22 19:52:28,047 INFO sqlalchemy.engine.base.Engine ()
Jack, 0

Note: I'm also submitting a PR addressing this.

Have a nice day!

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions