-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bring ORM models and Alembic migrations in sync #16221
Conversation
CodSpeed Performance ReportMerging #16221 will not alter performanceComparing Summary
|
9edb9bd
to
2fcdc40
Compare
I'll be adding a new PR that builds on this one to move |
bfefcb2
to
7f7d98b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fantastic! Thank you @mjpieters!
In addition to my comment about the dependency cycle warning raised by SQLAlchemy, I'm seeing a number of warnings when generating a migration against a SQLite DB:
/Users/alexander/dev/mjpieters/prefect/.venv/lib/python3.12/site-packages/alembic/ddl/impl.py:817: UserWarning: autogenerate skipping metadata-specified expression-based index 'ix_artifact__key_created_desc'; dialect 'sqlite' under SQLAlchemy 2.0.36 can't reflect these indexes so they can't be compared
util.warn(
/Users/alexander/dev/mjpieters/prefect/.venv/lib/python3.12/site-packages/alembic/ddl/impl.py:817: UserWarning: autogenerate skipping metadata-specified expression-based index 'ix_flow_run__coalesce_start_time_expected_start_time_asc'; dialect 'sqlite' under SQLAlchemy 2.0.36 can't reflect these indexes so they can't be compared
util.warn(
/Users/alexander/dev/mjpieters/prefect/.venv/lib/python3.12/site-packages/alembic/ddl/impl.py:817: UserWarning: autogenerate skipping metadata-specified expression-based index 'ix_flow_run__coalesce_start_time_expected_start_time_desc'; dialect 'sqlite' under SQLAlchemy 2.0.36 can't reflect these indexes so they can't be compared
util.warn(
09:22:00.832 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('ix_flow_run__end_time_desc', <sqlalchemy.sql.elements.UnaryExpression object at 0x106f74650>). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.832 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('ix_flow_run__expected_start_time_desc', <sqlalchemy.sql.elements.UnaryExpression object at 0x106f74530>). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.833 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('ix_flow_run__next_scheduled_start_time_asc', <sqlalchemy.sql.elements.UnaryExpression object at 0x106f74ef0>). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.836 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('uq_flow_run_state__flow_run_id_timestamp_desc', Column('flow_run_id', UUID(), ForeignKey('flow_run.id'), table=<flow_run_state>, nullable=False), <sqlalchemy.sql.elements.UnaryExpression object at 0x106ef1f10>, unique=True). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.843 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('ix_task_run__end_time_desc', <sqlalchemy.sql.elements.UnaryExpression object at 0x1070200b0>). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.844 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('ix_task_run__expected_start_time_desc', <sqlalchemy.sql.elements.UnaryExpression object at 0x106f77e60>). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.844 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('ix_task_run__next_scheduled_start_time_asc', <sqlalchemy.sql.elements.UnaryExpression object at 0x106f77dd0>). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.847 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('uq_task_run_state__task_run_id_timestamp_desc', Column('task_run_id', UUID(), ForeignKey('task_run.id'), table=<task_run_state>, nullable=False), <sqlalchemy.sql.elements.UnaryExpression object at 0x106ef3560>, unique=True). The dialect implementation should either skip expression indexes or provide a custom implementation.
09:22:00.847 | WARNING | alembic.ddl.impl - Generating approximate signature for index Index('ix_task_run_state_cache__cache_key_created_desc', Column('cache_key', String(), table=<task_run_state_cache>, nullable=False), <sqlalchemy.sql.elements.UnaryExpression object at 0x106f4a1b0>). The dialect implementation should either skip expression indexes or provide a custom implementation.
It looks like you added support for checking ddl_if
in src/prefect/server/database/_migrations/env.py
, so are these indexes that we should add .ddl_if(dialect="postgresql")
to?
7f7d98b
to
24c96bc
Compare
I think we'd have to then also define a |
Right, the issue is the combination of strings and @declared_attr.directive
def __table_args__(cls) -> Iterable[sa.Index]:
return (
sa.Index(
"ix_artifact__key",
cls.key,
),
sa.Index(
"ix_artifact__key_created_desc",
cls.key,
cls.created.desc(),
postgresql_include=[
"id",
"updated",
"type",
"task_run_id",
"flow_run_id",
],
),
) |
Thanks for these contributions @mjpieters ! |
24c96bc
to
dde21b3
Compare
There are two The only alternative would be to not specify the indexes in the ORM models module, and then tell alembic to not drop these indexes when generating a migration. I think that would be worse off, as the indexes then just drop out of sight and people will forget to manage their migrations properly. |
dde21b3
to
4031606
Compare
- Update the alembic `include_object` hook to support per-dialect filtering for indexes, and to avoid enum-to-varchar conversions for existing columns in SQLite. - Add indexes added by manual migrations into their respective ORM models, with dialect-specific information where needed. - Keep index names consistent across PostgreSQL and SQLite. - Drop an index that was accidentally applied to the wrong table in the SQLite migrations. - Drop an index that was created for a foreign key column now that it no longer is a foreign key. - Correct ORM annotations for columns accidentally marked as nullable when they were converted to `Mapped[...]` / `mapped_column()` - Mark `WorkPool.default_queue_id` as a foreign key in the ORM model; it already is one in the database. - Drop columns that were removed from the ORM models. - Use `Optional[...]` instead of `Union[..., None]` in annotations. - Mark the Variable.value column as nullable; NULL represents None.
4031606
to
df3c519
Compare
I have no idea why the Given that my other PR has the same changes plus type annotation changes only, the test must be flakey. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing! Thank you!
As I was looking into low-hanging fruit to improve prefect's type completeness, I ran into several issues with the ORM model definitions. Over time the migrations and the models have diverged somewhat, for various reasons, and this commit brings them back together again.
With this commit, you can run
prefect server database revision --autogenerate
in a clean checkout and alembic will generate empty migration steps. At the same time, the ORM models accurately reflect what is in the database.Changes summary
include_object
hook to support per-dialect filtering for indexes, and to avoid enum-to-varchar conversions for existing columns in SQLite.Mapped[...]
/mapped_column()
WorkPool.default_queue_id
as a foreign key in the ORM model; it already is one in the database.Optional[...]
instead ofUnion[..., None]
in annotations.Changes per ORM model
PostgreSQL-only trigram indexes, defined solely in migration scripts
Each of these indexes is now attached to the ORM and alembic will only consider exporting these for the PostgreSQL migrations.
BlockDocument.name
: trgm_ix_block_document_nameBlockSchema.capablities
: ix_block_schema__capabilitiesBlockType.name
: trgm_ix_block_type_nameDeployment.name
: trgm_ix_deployment_nameFlow.name
: trgm_ix_flow_nameFlowRun.name
: trgm_ix_flow_run_nameTaskRun.name
: trgm_ix_task_run_nameWorkQueue.name
: trgm_ix_work_queue_nameArtifact
ix_artifact__key_created_desc
artifact_collection
table.ArtifactCollection
latest_id
, made NOT NULL in PostgreSQL.ix_artifact__key_created_desc
artifact
table.ix_artifact_collection__updated
BlockDocument
ix_block_document__block_type_name__name
ix_block_document__block_type_name_name
Deployment
schedule
is_schedule_active
manifest_path
flow_data
work_queue_id
DeploymentSchedule
max_active_runs
catchup
FlowRun
Mapped[...]
andmapped_column()
without realising that this changes the nullable setting in many cases.parent_task_run_id
ix_flow_run__deployment_id
ix_flow_run__scheduler_deployment_id_auto_scheduled_next_schedu
FlowRunState
data
Run
Mapped[...]
andmapped_column()
without realising that this changes the nullable setting in many cases.expected_start_time
start_time
end_time
TaskRunState
data
Variable
value
none_as_null=True
, and tests expectNone
to be a valid variable value, so the ORM column was made nullable. Migrations had already made the column nullable in the database.value
Worker
ix_worker__work_pool_id_last_heartbeat_time
ix_worker__work_pool_id_last_heartbeat_time
already covers this column.ix_worker__last_heartbeat_time
WorkQueue
ix_work_queue__work_pool_id_priority
ix_work_queue__work_pool_id_priority
already covers this column.priority
WorkPool
ix_work_pool__type
ix_worker_pool__type
but it was not present in the ORM model before this point either.default_queue_id
towork_queue.id