
fix(migration): fail fast on critical schema gaps #1326

Open
wolfkill wants to merge 1 commit into Tencent:main from wolfkill:fix/migration-critical-schema-check


wolfkill commented May 14, 2026

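Background

Fixes #1319.

In the current startup flow, a failure in RunMigrationsWithOptions only logs a warning and startup continues. If migration 000041 fails because of pg_trgm, a trigram index, or a dirty migration, the Wiki ingestion, task queue, and knowledge graph features may keep running without the critical schema they need. This shows up as the frontend hanging with no indication of the cause, or as background tasks failing silently.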

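Changes

  • Add a pre-startup schema check after a migration failure: if schema_migrations.dirty=true, refuse to start and print repair guidance.
  • Check the critical tables that Wiki and the task queue depend on: wiki_pages, wiki_log_entries, task_pending_ops, task_dead_letters.
  • For PostgreSQL/ParadeDB, add a pg_trgm repair hint to help diagnose the 000041 trigram index failure.
  • Keep the external migration tool scenario working: if the migration command fails but the schema is not dirty and all critical tables already exist, startup is still allowed to continue.
  • Add regression unit tests covering three paths: a missing critical table, migrations already completed externally, and a dirty migration.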

Verification

  • CGO_LDFLAGS='-lc++' go test ./internal/container -run 'TestValidateCriticalSchemaAfterMigrationFailure' -count=1
  • CGO_LDFLAGS='-lc++' go test ./internal/database ./internal/container -count=1
  • CGO_LDFLAGS='-lc++' go test ./internal/application/repository ./internal/application/service -run '^$' -count=1
  • git diff --cached --check
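  • Non-destructive check against Docker Postgres: confirmed that schema_migrations is 42|false and that wiki_pages, wiki_log_entries, task_pending_ops, and task_dead_letters all exist.
  • Started the backend locally against the Docker dependencies and requested GET /health, which returned {"status":"ok"}; the logs show migration version 42, dirty=false, and the service listening on 0.0.0.0:8080.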



Development

Successfully merging this pull request may close these issues.

[Bug]: Database migration 000041 failure degrades silently; Wiki and knowledge graph features break without any indication
