@CiaranMn CiaranMn commented Dec 16, 2025

🌟 What is the purpose of this PR?

Start storing, on the HASH instance entity, which Node API migrations have run and the latest migration state (version number by type), and skip migrations that have already run. (These migrations create system types and update entities to the updated types.)


Important Note / Call for ideas
The latest migration state is persisted because the migration logic checks whether the 'next version' of a type (version in state + 1) exists before creating it. Previously, the 'current version' lived in a state object that was updated as each migration ran (i.e. all types start at 1 and are incremented as migrations update them).
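
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `MigrationState`, `existingTypes`, `ensureNextVersion` and the example URL are all hypothetical names, and an in-memory set stands in for the database.

```typescript
// All names here are hypothetical stand-ins, not the HASH API.
type MigrationState = Map<string, number>; // base URL -> latest version seen

// Stand-in for the database: versioned type URLs that already exist.
const existingTypes = new Set<string>();

/** Create the next version of a type unless it already exists, then record it in state. */
const ensureNextVersion = (
  state: MigrationState,
  baseUrl: string,
): { created: boolean; version: number } => {
  const version = (state.get(baseUrl) ?? 0) + 1;
  const versionedUrl = `${baseUrl}v/${version}`;
  const created = !existingTypes.has(versionedUrl);
  if (created) {
    existingTypes.add(versionedUrl);
  }
  state.set(baseUrl, version);
  return { created, version };
};

const userBaseUrl = "https://example.com/types/entity-type/user/";

// First start-up: empty state, so v1 and then v2 are created.
const firstRunState: MigrationState = new Map();
const first = ensureNextVersion(firstRunState, userBaseUrl);
const second = ensureNextVersion(firstRunState, userBaseUrl);

// Replaying the same step from an empty state finds v1 already present and skips the write.
const replayed = ensureNextVersion(new Map(), userBaseUrl);
```

The point of the sketch is that the write is only safe to skip when the state passed in matches the state the migration was originally written against.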

The first time this PR is deployed, no migrations will have been recorded as run, so none will be skipped and we need to maintain the 'start from 1' behaviour, which means starting from an empty migration state. Once migrations have run, both (1) the list of migrations already run and (2) the latest version state will have been persisted. The next time the API starts up, those migrations will be skipped and the next new migration will have the correct versions to increment from.
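
As a rough sketch of the skip-and-persist flow, under the assumption that progress is saved after each migration (as the bot summaries in this thread describe): the types `PersistedProgress` and `Migration`, the `runPendingMigrations` function and the `save` callback are hypothetical, and the real entity properties and helpers may differ.

```typescript
// Hypothetical shapes; the real entity properties and helpers may differ.
type MigrationState = Record<string, number>;

interface PersistedProgress {
  migrationsCompleted: string[]; // e.g. ["001", "002"]
  migrationState: MigrationState;
}

interface Migration {
  id: string; // number prefix of the migration file
  run: (state: MigrationState) => MigrationState;
}

/** Run only migrations not yet recorded, saving progress after each one. */
const runPendingMigrations = (
  migrations: Migration[],
  persisted: PersistedProgress,
  save: (progress: PersistedProgress) => void,
): string[] => {
  const completed = new Set(persisted.migrationsCompleted);
  let state: MigrationState = { ...persisted.migrationState };
  const ran: string[] = [];

  for (const migration of migrations) {
    if (completed.has(migration.id)) {
      continue; // already run on this instance
    }
    state = migration.run(state);
    completed.add(migration.id);
    ran.push(migration.id);
    save({ migrationsCompleted: Array.from(completed), migrationState: state });
  }
  return ran;
};

// In-memory stand-in for the HASH instance entity.
let stored: PersistedProgress = { migrationsCompleted: [], migrationState: {} };
const save = (progress: PersistedProgress): void => {
  stored = progress;
};

// Toy migration body: bump the version recorded for one type.
const bump = (key: string) => (state: MigrationState): MigrationState => ({
  ...state,
  [key]: (state[key] ?? 0) + 1,
});

const migrations: Migration[] = [
  { id: "001", run: bump("user") },
  { id: "002", run: bump("user") },
];

const firstStart = runPendingMigrations(migrations, stored, save); // first deployment
const secondStart = runPendingMigrations(migrations, stored, save); // next start-up
```

On the first start everything runs from an empty state; on the second, both migrations are skipped and the persisted state is what a future migration 003 would increment from.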

Apart from the first run, it will be possible to hydrate the 'current type version' from the database. At some future point we can therefore drop the persisted state in favour of fetching all versions of all types at the start of the migrations (assuming there is no other HASH instance which has run migrations PRIOR to the introduction of the skipping logic; such an instance would still need the approach in this PR for its first deployment).

Note also that skipped migrations will not be idempotent if they somehow stop being skipped in future (e.g. because numbering changes), since they will receive the wrong migration state. For example: we rename 001 to 001b; it receives the latest version of each type in the migration state; it increments to check whether the next version exists; it doesn't; so it creates User v8, but with the properties of v1.
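
The failure mode can be reproduced with a toy model. Everything below is hypothetical (the `TypeStore` map, the `email` property, the property lists for v1 to v7); it only illustrates why replaying an old migration against the latest state produces the wrong type.

```typescript
// Toy model of one type's history: version number -> property names.
type TypeStore = Map<number, string[]>;

/** Create version (current + 1) with the given properties, unless it already exists. */
const createNextIfMissing = (
  store: TypeStore,
  currentVersion: number,
  properties: string[],
): number => {
  const next = currentVersion + 1;
  if (!store.has(next)) {
    store.set(next, properties);
  }
  return next;
};

// Migration 001 was written to define the very first version of User.
const migration001 = (store: TypeStore, currentVersion: number): number =>
  createNextIfMissing(store, currentVersion, ["email"]);

// Fresh instance: state starts empty (0), so 001 creates v1 as intended.
const fresh: TypeStore = new Map();
const freshVersion = migration001(fresh, 0);

// Existing instance at v7 where 001 is no longer recognised as run (e.g. renamed to 001b):
// it receives the persisted state (7), finds no v8, and creates it with v1's properties.
const existing: TypeStore = new Map();
for (let v = 1; v <= 7; v += 1) {
  existing.set(v, ["email", `added-in-v${v}`]);
}
const rerunVersion = migration001(existing, 7);
```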

This is a bit suboptimal. The other alternatives are:

  1. Don't 'skip' migrations, but instead have some kind of 'dry run' handling that still runs all migrations to populate the migration state, but doesn't actually do any db writes if a migration is marked as already run. This is perhaps the second-best option, or even equal to the approach currently taken. It would involve amending the functions that do the db operations (update types, entities) to check whether the migration has already run, and simply return if it has.
  2. Some other way of checking if the operation has already happened, e.g. diffing types. This is a bit messy and complicated and involves checking lots of things.
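
For comparison, alternative 1 (the 'dry run' approach) might look roughly like this. It is a sketch only: `MigrationContext`, `alreadyRun`, `updateType` and the `writes` array (standing in for database writes) are assumptions for illustration, not functions from this codebase.

```typescript
type MigrationState = Map<string, number>; // base URL -> latest version

interface MigrationContext {
  alreadyRun: boolean; // whether the current migration is recorded as completed
  state: MigrationState;
  writes: string[]; // stand-in for database writes
}

/** Always advance the in-memory version; only write when the migration has not run before. */
const updateType = (ctx: MigrationContext, baseUrl: string): number => {
  const nextVersion = (ctx.state.get(baseUrl) ?? 0) + 1;
  ctx.state.set(baseUrl, nextVersion);
  if (ctx.alreadyRun) {
    return nextVersion; // dry run: state is rebuilt, nothing is written
  }
  ctx.writes.push(`${baseUrl}v/${nextVersion}`);
  return nextVersion;
};

const state: MigrationState = new Map();
const writes: string[] = [];
const userUrl = "https://example.com/types/entity-type/user/";

updateType({ alreadyRun: true, state, writes }, userUrl); // replays an old migration
updateType({ alreadyRun: false, state, writes }, userUrl); // a new migration
```

The trade-off is that state is rebuilt from the migration files on every start, so ordering and renames stay safe, at the cost of executing every migration body each time.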

One consequence of no longer rebuilding state by running every migration each time (whether or not it makes changes) is that we can no longer have 'dev' migrations (which don't run on prod yet) sitting between migrations that have already run and being 'un-deved' later. The migration state they receive would reflect the latest versions in the db, which might not be what the version numbers should be at the point where they fall in the files. I've therefore moved all dev migrations to later numbers and closed a few gaps in the existing migration numbering. All new migrations will now have to be numbered after existing ones (which should be the case anyway).
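
The numbering constraint could in principle be checked at start-up. The helper below is purely hypothetical (this PR does not add such a check, as far as the description says); it flags any migration that has not run yet but is numbered at or below the highest completed one.

```typescript
/** Return pending migration ids numbered at or below the highest completed one. */
const findOutOfOrderMigrations = (
  completed: string[],
  allIds: string[],
): string[] => {
  const done = new Set(completed);
  const highestCompleted = Math.max(
    -1,
    ...completed.map((id) => Number.parseInt(id, 10)),
  );
  return allIds.filter(
    (id) => !done.has(id) && Number.parseInt(id, 10) <= highestCompleted,
  );
};

// 003 was added (or 'un-deved') after 004 had already run, so it is flagged.
const offenders = findOutOfOrderMigrations(
  ["001", "002", "004"],
  ["001", "002", "003", "004", "005"],
);
```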

This PR is designed to speed up start-up time by not bothering to go through the process of checking entities that need upgrading (of which there should be none once a migration has run once).

There are also a couple of changes to handle the fact that the HASH instance might not be the latest version when it's fetched as part of migrations (change exact id to base URL for filtering).
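
To illustrate why base-URL matching matters here, the sketch below assumes versioned type URLs of the form `<baseUrl>v/<number>`. The helper names `toBaseUrl` and `isOfType` and the example URL are hypothetical and are not the utilities used in the codebase.

```typescript
/** Strip the trailing "v/<number>" segment from a versioned type URL. */
const toBaseUrl = (versionedUrl: string): string => {
  if (!/\/v\/\d+$/.test(versionedUrl)) {
    throw new Error(`Not a versioned URL: ${versionedUrl}`);
  }
  return versionedUrl.replace(/v\/\d+$/, "");
};

/** True if any of the entity's type ids is some version of the given type. */
const isOfType = (entityTypeIds: string[], baseUrl: string): boolean =>
  entityTypeIds.some((id) => toBaseUrl(id) === baseUrl);

const hashInstanceBaseUrl =
  "https://example.com/@system/types/entity-type/hash-instance/";

// The stored entity is still on v1 while the code already knows about v2.
const storedEntityTypeIds = [`${hashInstanceBaseUrl}v/1`];

const exactMatch = storedEntityTypeIds.some(
  (id) => id === `${hashInstanceBaseUrl}v/2`,
);
const baseUrlMatch = isOfType(storedEntityTypeIds, hashInstanceBaseUrl);
```

Filtering on the exact versioned id would miss the not-yet-upgraded entity, whereas filtering on the base URL finds it regardless of version.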

Drive-bys:

  1. Update database reset instructions in the README
  2. Ensure Block Protocol 'query' and 'has-query' types are seeded as part of migrations.

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

🛡 What tests cover this?

  • Migrations are run as part of integration tests.

@CiaranMn CiaranMn requested a review from TimDiekmann December 16, 2025 12:18

cursor bot commented Dec 16, 2025

PR Summary

Introduces persistent migration tracking to speed startup and ensure safe resumes.

  • Adds migrationsCompleted and migrationState to the HASH Instance and saves/loads them in migrateOntologyTypes; skips processed files and persists after each run
  • New migration 022 updates HASH Instance schema and upgrades existing entities (temporary instantiate policies); bumps hashInstance to v2 and updates IDs/baseUrls
  • Updates createHashInstance to accept an explicit entity type ID; adjusts migration 005 to pass it
  • Switches HASH Instance queries and validation to use baseUrl (not exact version) in backend utils; relaxes entity-type check accordingly
  • Seeds external BP query and has-query entity types during initial system types migration
  • Frontend return-types-as-json now returns JSON GraphQL errors on fetch failures
  • README: simplifies local DB reset steps

Written by Cursor Bugbot for commit b0c44a4.

@github-actions github-actions bot added area/apps > hash* Affects HASH (a `hash-*` app) area/infra Relates to version control, CI, CD or IaC (area) area/apps > hash-api Affects the HASH API (app) area/libs Relates to first-party libraries/crates/packages (area) type/eng > frontend Owned by the @frontend team type/eng > backend Owned by the @backend team area/apps labels Dec 16, 2025

codecov bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 0% with 69 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.97%. Comparing base (94ee310) to head (b0c44a4).
⚠️ Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
...tem-graph-is-initialized/migrate-ontology-types.ts | 0.00% | 40 Missing ⚠️
...migrations-completed-to-hash-instance.migration.ts | 0.00% | 21 Missing ⚠️
...grations/001-create-hash-system-types.migration.ts | 0.00% | 6 Missing ⚠️
...ate-hash-system-entities-and-web-bots.migration.ts | 0.00% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8187      +/-   ##
==========================================
- Coverage   59.71%   58.97%   -0.74%     
==========================================
  Files        1214     1188      -26     
  Lines      115203   112471    -2732     
  Branches     5062     4942     -120     
==========================================
- Hits        68793    66333    -2460     
+ Misses      45608    45380     -228     
+ Partials      802      758      -44     
Flag | Coverage Δ
apps.hash-ai-worker-ts | 1.41% <ø> (ø)
apps.hash-api | 0.00% <0.00%> (ø)
local.hash-isomorphic-utils | 0.00% <ø> (ø)


@graphite-app graphite-app bot requested review from a team and removed request for a team December 16, 2025 13:01
@CiaranMn CiaranMn marked this pull request as draft December 16, 2025 13:02
@CiaranMn CiaranMn removed the request for review from TimDiekmann December 16, 2025 13:02
This ensures migrations are idempotent by populating the cache with existing ontology types before applying migrations.

Co-authored-by: c <c@hash.ai>
@vercel vercel bot temporarily deployed to Preview – petrinaut December 16, 2025 14:38 Inactive
@CiaranMn CiaranMn marked this pull request as ready for review January 12, 2026 19:32

augmentcode bot commented Jan 12, 2026

🤖 Augment PR Summary

Summary: This PR makes Node API ontology migrations resumable and faster by persisting which migrations have run (and the accumulated type-version state) onto the HASH Instance entity.

Changes:

  • Adds a new migration to extend the HASH Instance entity type (bumping to hash-instance/v/2) with migrationsCompleted and migrationState properties, and upgrades existing instance entities accordingly.
  • Updates migrateOntologyTypes to load the persisted state on startup, skip migrations whose numbers are already recorded, and save state after each successful migration.
  • Adjusts HASH Instance creation and lookup to work across entity type versions (match by baseUrl rather than a single versioned URL), and threads the current hash instance entityTypeId into creation.
  • Moves/renumbers existing migration files (including dev migrations) so that new migrations are always appended after already-run ones.
  • Ensures Block Protocol query / has-query types are seeded during bootstrap, and updates relevant Block Protocol IDs from @h to @hash.
  • Simplifies local database reset instructions in the repo README.
  • Adds a defensive GraphQL fetch/JSON parsing error fallback in the frontend middleware.

Technical Notes: Migration progress is stored on the instance entity itself, enabling fast startups by avoiding repeated idempotency checks once migrations have completed.

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

TimDiekmann previously approved these changes Jan 16, 2026

@TimDiekmann TimDiekmann left a comment

A few minor things, but nothing blocking. Looks good!

I think the approach is sufficient. It behaves similarly to the Graph migrations, although there we don't distinguish between dev and prod migrations, and we enforce continuously incremented migration numbers. The dry-run alternative you discussed seems more stable, but I don't know if we require it. I guess we can always change it later, and eventually we want to move it to the Graph at some point anyway?

However, we could consider documenting these constraints in the README.

Comment on lines 215 to 220
    await saveMigrationState({
      context: params.context,
      hashInstance,
      migrationsCompleted,
      migrationState,
    });

Do we still want to call this when all migrations are being skipped?

Member Author

Fair, 54900d6

vercel bot commented Jan 20, 2026

Project Deployment Review Updated (UTC)
ds-theme Error Error Jan 20, 2026 0:39am
hashdotdesign Ready Ready Preview, Comment Jan 20, 2026 0:39am

@github-actions github-actions bot dismissed TimDiekmann’s stale review January 20, 2026 12:35

Your organization requires reapproval when changes are made, so Graphite has dismissed approvals. See the output of git range-diff at https://github.com/hashintel/hash/actions/runs/21171694942

@CiaranMn CiaranMn requested a review from TimDiekmann January 20, 2026 12:35
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.
