-
Notifications
You must be signed in to change notification settings - Fork 110
H-3572, H-4105: Track Node migrations run, skip already-run migrations #8187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
PR SummaryIntroduces persistent migration tracking to speed startup and ensure safe resumes.
Written by Cursor Bugbot for commit b0c44a4. This will update automatically on new commits. Configure here. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #8187 +/- ##
==========================================
- Coverage 59.71% 58.97% -0.74%
==========================================
Files 1214 1188 -26
Lines 115203 112471 -2732
Branches 5062 4942 -120
==========================================
- Hits 68793 66333 -2460
+ Misses 45608 45380 -228
+ Partials 802 758 -44
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
apps/hash-api/src/graph/ensure-system-graph-is-initialized/migrate-ontology-types.ts
Show resolved
Hide resolved
This ensures migrations are idempotent by populating the cache with existing ontology types before applying migrations. Co-authored-by: c <c@hash.ai>
|
Cursor Agent can help with this pull request. Just |
🤖 Augment PR SummarySummary: This PR makes Node API ontology migrations resumable and faster by persisting which migrations have run (and the accumulated type-version state) onto the HASH Instance entity. Changes:
Technical Notes: Migration progress is stored on the instance entity itself, enabling fast startups by avoiding repeated idempotency checks once migrations have completed. 🤖 Was this summary useful? React with 👍 or 👎 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...migrate-ontology-types/migrations/022-add-migrations-completed-to-hash-instance.migration.ts
Outdated
Show resolved
Hide resolved
apps/hash-api/src/graph/ensure-system-graph-is-initialized/migrate-ontology-types.ts
Outdated
Show resolved
Hide resolved
TimDiekmann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor things, but nothing blocking. Looks good!
I think the approach is sufficient, it's a similar behavior as we have in the Graph migrations, but we also don't distinguish between dev and prod migrations and enforce continuously incremented migration numbers. The alternative you discussed with the dry-run seems more stable, but I don't know if we require it. I guess we can always change it later and eventually we want to move it to the Graph at some point anyway?
We, however, could consider adding the constraints to the README.
| await saveMigrationState({ | ||
| context: params.context, | ||
| hashInstance, | ||
| migrationsCompleted, | ||
| migrationState, | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still want to call this when all migrations being skipped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, 54900d6
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Your organization requires reapproval when changes are made, so Graphite has dismissed approvals. See the output of git range-diff at https://github.com/hashintel/hash/actions/runs/21171694942
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
🌟 What is the purpose of this PR?
Start storing the Node API migrations run and the latest migration state (version number by type) as part of the HASH instance entity (they create system types and update entities to updated types), and skip those already run.
Import Note / Call for ideas
The latest migration state is persisted because the migration logic relies on checking whether the 'next version' (version in state + 1) of a type exists before creating it, where the 'current version' was previously a state object that was updated as each migration ran (i.e. all types start at 1 and get incremented as migrations update them).
The first time this PR is deployed, no migrations will have been skipped, and so we need to maintain this 'start from 1' behaviour, which means an empty migration state. After migrations have run, both (1) the migrations already run and (2) latest versions state will have been persisted, and the next time the API starts up the migrations will be skipped and the next new migration will have the correct versions to increment from.
Apart from the first run, it will be possible to hydrate the 'current type version' from the database. So we can actually get rid of it in favour of just fetching all the versions of all types at the start of the migrations at some future point (assuming there is no other HASH instance which has run migrations PRIOR to the introduction of the skipping logic, which still needs the approach in this PR for the first deployment).
Note also that migrations which are skipped, if they somehow are not skipped in future (e.g. numbering changes), will not be idempotent, because they will have the wrong migration state (e.g. we rename 001 to 001b, it has the latest version of types in migration state, it increments to check if next exists, it doesn't, it creates User v8 but with the properties of V1).
This is a bit suboptimal. The other alternatives are:
One consequence of this (not rebuilding state by running each migration each time, whether or not it makes changes) is that we can no longer have 'dev' migrations (don't run on prod yet) sitting around between migrations that have already run, and being 'un-deved' later, because the migration state they receive will reflect the latest versions in the db, which might not be what the version numbers should be at the point they fall in the files. I've therefore moved all dev migrations to later numbers and closed a few gaps in the existing migration numbering. All new migrations will now have to be numbered after existing ones (which should be the case anyway).
This PR is designed to speed up start-up time by not bothering to go through the process of checking entities that need upgrading (of which there should be none once a migration has run once).
There are also a couple of changes to handle the fact that the HASH instance might not be the latest version when it's fetched as part of migrations (change exact id to base URL for filtering).
Drive-bys:
Pre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:
🛡 What tests cover this?