-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove tags_list
from models and ingestion_server
#956
Conversation
Full-stack documentation: Ready https://WordPress.github.io/openverse/_preview/956 Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub pages deployment action list to see the current status of the deployments. |
@obulat Why doesn't this change require a migration (removing the field)? |
Size Change: -1 B (0%) Total Size: 848 kB ℹ️ View Unchanged
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I do think the migration name can be simplified to 0053_remove_tags_list.py
but that's not worth blocking the PR over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change in general is fine (removing the deprecated column, etc), but this PR doesn't work for zero-downtime and would cause the search endpoints to 500 if deployed:
openverse-web-1 | [2023-03-22 21:46:26,906 - django.db.backends - 131][DEBUG] [22c5b7f7a5744e989375e975e2241468] (0.001) SELECT "image"."id", "image"."created_on", "image"."updated_on", "image"."identifier", "image"."foreign_identifier", "image"."title", "image"."foreign_landing_url", "image"."creator", "image"."creator_url", "image"."thumbnail", "image"."provider", "image"."url", "image"."filesize", "image"."filetype", "image"."watermarked", "image"."license", "image"."license_version", "image"."source", "image"."last_synced_with_source", "image"."removed_from_source", "image"."view_count", "image"."tags", "image"."tags_list", "image"."category", "image"."meta_data", "image"."width", "image"."height", "api_matureimage"."created_on", "api_matureimage"."identifier" FROM "image" LEFT OUTER JOIN "api_matureimage" ON ("image"."identifier" = "api_matureimage"."identifier") WHERE "image"."identifier" IN ('2982c0bf-9c26-4543-a712-4d50a117ded8'::uuid, '29d57e46-51f5-43e5-adbb-7841e7e8cc9c'::uuid, '2a48ea01-3ca3-4c5a-a106-a6e10d8938cf'::uuid, '2a7f6797-a1b1-43ac-ac32-e70b76185d9c'::uuid, '2cffa611-44a5-4481-ab19-04ac99420cad'::uuid, '2d04f852-4c9d-4455-895c-2b740e189164'::uuid, '2d1e593a-7214-4e09-a3ac-819cad604150'::uuid, '2dceed3d-78fa-4668-a008-9b53eeb8600e'::uuid, '2f67c4d8-8a25-4590-946b-85e62a99ac91'::uuid, '2f9631a6-41ae-497e-b5dc-a606e0e1d564'::uuid, '2fc59848-7bc6-4ba6-be1c-cb10f47f4a42'::uuid, '2fd6e7be-ea8a-4e2e-b216-21166a45aef9'::uuid, '3067904e-31d4-4252-aff7-c1b126e191fd'::uuid, '33639f6a-c0ae-426e-95f5-44ceeead2661'::uuid, '35ba97dc-0e5e-4eaf-99fe-9a0e0d824940'::uuid, '363f86e5-0c43-4038-bf4d-28fc1aa43175'::uuid, '36675f14-f34f-46d9-9094-fc1d89f762b0'::uuid, '36d19812-53ac-479c-b43b-191e708c5b85'::uuid, '37172cc8-46f4-46a9-82bf-19f88b2c036e'::uuid, '3783c73d-4d48-4199-8d1c-3fa8b0ee1d00'::uuid) ORDER BY "image"."created_on" DESC; args=(UUID('2982c0bf-9c26-4543-a712-4d50a117ded8'), UUID('29d57e46-51f5-43e5-adbb-7841e7e8cc9c'), UUID('2a48ea01-3ca3-4c5a-a106-a6e10d8938cf'), UUID('2a7f6797-a1b1-43ac-ac32-e70b76185d9c'), UUID('2cffa611-44a5-4481-ab19-04ac99420cad'), UUID('2d04f852-4c9d-4455-895c-2b740e189164'), UUID('2d1e593a-7214-4e09-a3ac-819cad604150'), UUID('2dceed3d-78fa-4668-a008-9b53eeb8600e'), UUID('2f67c4d8-8a25-4590-946b-85e62a99ac91'), UUID('2f9631a6-41ae-497e-b5dc-a606e0e1d564'), UUID('2fc59848-7bc6-4ba6-be1c-cb10f47f4a42'), UUID('2fd6e7be-ea8a-4e2e-b216-21166a45aef9'), UUID('3067904e-31d4-4252-aff7-c1b126e191fd'), UUID('33639f6a-c0ae-426e-95f5-44ceeead2661'), UUID('35ba97dc-0e5e-4eaf-99fe-9a0e0d824940'), UUID('363f86e5-0c43-4038-bf4d-28fc1aa43175'), UUID('36675f14-f34f-46d9-9094-fc1d89f762b0'), UUID('36d19812-53ac-479c-b43b-191e708c5b85'), UUID('37172cc8-46f4-46a9-82bf-19f88b2c036e'), UUID('3783c73d-4d48-4199-8d1c-3fa8b0ee1d00')); alias=default
openverse-web-1 | [2023-03-22 21:46:26,906 - django.request - 241][ERROR] [22c5b7f7a5744e989375e975e2241468] Internal Server Error: /v1/images/
openverse-web-1 | Traceback (most recent call last):
openverse-web-1 | File "/venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
openverse-web-1 | return self.cursor.execute(sql, params)
openverse-web-1 | psycopg2.errors.UndefinedColumn: column image.tags_list does not exist
openverse-web-1 | LINE 1: ...om_source", "image"."view_count", "image"."tags", "image"."t...
To reproduce this locally, checkout this branch and run just api/dj migrate
. Then checkout main and just up
. just logs web
to tail the logs and make an images request.
This approximates the situation during a deployment because we'll have applied the migration to remove the tags_list from the model but the previous version of the application will still be running.
We probably need to make a change to the serializer to remove the usage of tags_list first, and then we can deploy a follow-up PR to remove the attribute from the model. For zero-downtime we always need to remove the usages before removing the column. It looks like in this case the usage is hidden away in a non-obvious area 😰
After re-reading this guide we found earlier I remembered that it's actually an easy solution for this and that the migration check action might get in the way of safely deploying these changes (:scream:). We first need to have a PR that only removes the attribute from the model but does not include the migration. Once that is deployed, then we can deploy a follow-up PR that has the migration to remove the column. Basically: we'd split this PR into two, with the second only including the migration and this one having everything else. |
Does a PR deploy automatically run the migrations? Because if not they could be merged in together and then the migrations could be executed later on, separately. |
They do. Whether to do this or not was a long discussed topic during the ECS migration. Doing as you suggested, running the migration after the deploy, only solves the case when you want to remove a column. If you add a model attribute without having run the migration to add a column, you run into the same issue (trying to select a column that doesn't exist). There are several complexities that such an approach would bring up. For one, we'd have to follow different processes for different types of migrations (either running them before deploying the app or after). Doing so would recreate the core complexity of our old deployment approach. Updating Terraform version numbers was time-consuming but trivial. Running migrations and deciding when to run them? That was a nightmare. Zero-downtime migrations have many benefits laid out in the zero-downtime migration documentation, not least of which that it promotes a consistent approach to database schema change management. Following this approach, we don't have to think "how/when do I run this migration"? We just have to keep in mind the guidelines for zero-downtime migrations and organise our schema changes to be compatible with zero-downtime migrations. This process is documented already in the Openverse docs, including a link to the above linked guide for how to do different changes with respect to Django specifically. |
I am really confused about how to find this usage. I will be investigating it, but any help is very welcome! |
6d87622
to
5b74aef
Compare
@obulat it seems like the usage of This would prevent One thing to consider in all this, though - I have no idea how long deleting the |
5b74aef
to
bc9c1e6
Compare
It looks like this will need to be targeted against |
e746823
to
b7db13a
Compare
b7db13a
to
b5d2ab3
Compare
b5d2ab3
to
f5c00c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good! I tested that the migrations runs and doesn't cause issues on main
afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
I removed the
Merging this PR now |
f5c00c6
to
7b39586
Compare
This PR has migrations. Please rebase it before merging to ensure that conflicting migrations are not introduced. |
@obulat Why was it necessary to run the column removal manually in staging and production rather than relying on the migrations during deployment? 🤔 |
1 similar comment
@obulat Why was it necessary to run the column removal manually in staging and production rather than relying on the migrations during deployment? 🤔 |
The fact that the column was explicitly deleted by a command that was run outside of Django might mean that the next deployment will fail because Django is expecting the column to be present when that migration is run. It might be necessary to re-add the columns, I'm not sure though. It seems like staging deployments are succeeding 🤷🏼♀️ |
You are right, @AetherUnbound, this did cause problems. We solved it by re-adding the column back, and then deploying the API. |
Got it! Was production deployed successfully after reading the column as well? |
Needs #1029 to be deployed and the
tags_list
column is dropped from the databaseFixes
Fixes #704 by @obulat
Description
This PR removes
tags_list
from models and from the ingestion_server. This should be done after the column is dropped from the database.Testing Instructions
If you run #1029, and then repeat the steps outlined in the comment with this PR rebased onto #1029 and #1029 (instead of this PR and main), you should not see errors in logs.
Checklist
Update index.md
).main
) ora parent feature branch.
errors.
Developer Certificate of Origin
Developer Certificate of Origin