fix: tolerate dangling hidden index metadata in DropDatabase#24019
Conversation
## What type of PR is this? - [x] Bug fix ## Which issue(s) does this PR fix? Fixes matrixorigin#23985 ## What this PR does / why we need it DROP DATABASE could panic with makeslice: cap out of range when table constraint metadata still referenced hidden index tables that were missing from the current relation list. The old code preallocated deleteTables with len(relations)-len(ignoreTables). When internal index metadata was duplicated or dangling, that capacity became negative and crashed the process during DROP DATABASE, which also surfaced through DROP ACCOUNT. This change makes the cleanup path resilient by de-duplicating hidden index table names, filtering the relation list with a set, and warning when dangling hidden index metadata is detected instead of panicking. ## What testing was done - source ~/.zshrc && moenv && go test ./pkg/sql/compile -run TestScope_DropDatabase_ToleratesDanglingHiddenIndexMetadata -count=1 - source ~/.zshrc && moenv && go test ./pkg/sql/compile -count=1 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR claims to fix a DROP DATABASE panic (dangling hidden index metadata), but the provided diffs primarily introduce/modify CDC reader v2 utilities, backup enhancements (protection + global file index), dev/docker/observability tooling, and a Go/toolchain/version bump—none of which appear related to Scope.DropDatabase.
Changes:
- Added multiple CDC reader v2 components (state tracking, change collection) and related tests.
- Enhanced backup flow with backup-protection management and a global file index to skip already-backed-up files.
- Updated dev tooling/docs/configs (Dockerfiles, multi-CN compose env, Grafana dashboard CLI) and bumped Go/module versions.
Reviewed changes
Copilot reviewed 80 out of 927 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/cdc/reader_v2_state_test.go | Adds unit tests for CDC reader v2 state/tracker utilities |
| pkg/cdc/reader_v2_state.go | Introduces reader state enums, transaction tracker, and state manager |
| pkg/cdc/reader_v2_data_processor_empty_snapshot_test.go | Adds regression test ensuring empty snapshot batches are skipped |
| pkg/cdc/reader_v2_data_processor.go.tmp | Adds a (temporary-named) DataProcessor implementation used by new tests |
| pkg/cdc/reader_v2_change_collector.go | Adds a wrapper around engine.ChangesHandle to produce typed change batches |
| pkg/cdc/reader_test_helpers.go | Adds engine.ChangesHandle test helper |
| pkg/cdc/observability_manager.go | Adds singleton manager for CDC observability/progress monitoring |
| pkg/cdc/error_handler_test.go | Adds tests for error metadata parsing/formatting/retry logic |
| pkg/cdc/error_handler.go | Implements error metadata parsing/formatting and retry gating |
| pkg/cdc/error_consumption_test.go | Adds broader tests simulating error consumption/retry behavior |
| pkg/catalog/types.go | Adds new system table constant MO_BRANCH_METADATA |
| pkg/bootstrap/versions/v3_0_0/cluster_upgrade_list.go | Adds upgrade entry to create mo_branch_metadata |
| pkg/bootstrap/versions/upgrade_strategy.go | Disables logging in metadata-check queries during upgrade checks |
| pkg/bootstrap/service_test.go | Updates test txn operator method names/signatures |
| pkg/backup/tae.go | Adds backup GC-protection manager + global file index integration |
| pkg/backup/file_index_test.go | Adds tests for global file index loading/parsing |
| pkg/backup/file_index.go | Implements global file index discovery/loading from backup root |
| pkg/backup/backup_test.go | Adjusts tests for new execBackup parameters |
| pkg/backup/backup.go | Loads global file index and passes into backup flow |
| optools/images/Dockerfile.dev | Adds dev Dockerfile that vendors deps and builds inside container |
| optools/images/Dockerfile | Bumps builder base image version |
| optools/compose_bvt/Dockerfile.tester | Bumps tester image version |
| optools/bvt_ut/Dockerfile | Bumps tester image version |
| go.mod | Bumps Go version and multiple module dependency versions |
| etc/launch/tn.toml | Enables observability metrics block in launch config |
| etc/launch/log.toml | Enables observability metrics block in launch config |
| etc/launch/cn.toml | Adds CN engine prefetch patterns and observability block |
| etc/launch-minio-local/tn.toml | Adds new MinIO-based local launch TN config |
| etc/launch-minio-local/log.toml | Adds new MinIO-based local launch LOG config |
| etc/launch-minio-local/launch.toml | Adds MinIO local launch aggregation file |
| etc/launch-minio-local/docker-compose.yml | Adds MinIO + bucket-init compose for local dev |
| etc/launch-minio-local/cn.toml | Adds new MinIO-based local launch CN config |
| etc/launch-minio-local/README.md | Documents MinIO-local dev workflow |
| etc/launch-minio-local/.gitignore | Ignores MinIO and runtime data directories |
| etc/docker-multi-cn-local-disk/update-docker-mirror.sh | Adds script to update docker registry mirrors |
| etc/docker-multi-cn-local-disk/start.sh | Adds enhanced compose startup script with permission fixes |
| etc/docker-multi-cn-local-disk/setup-docker-mirror.sh | Adds script to configure docker registry mirrors |
| etc/docker-multi-cn-local-disk/prometheus.yml | Adds Prometheus config for docker cluster metrics |
| etc/docker-multi-cn-local-disk/prometheus-local.yml | Adds Prometheus config for host-based MatrixOne metrics |
| etc/docker-multi-cn-local-disk/launch.toml | Adds launch aggregation config for docker-multi-cn env |
| etc/docker-multi-cn-local-disk/grafana-provisioning/datasources/prometheus.yml | Adds Grafana datasource for local Prometheus target |
| etc/docker-multi-cn-local-disk/grafana-provisioning/datasources/prometheus-cluster.yml | Adds Grafana datasource for in-cluster Prometheus |
| etc/docker-multi-cn-local-disk/grafana-provisioning/dashboards/default.yml | Adds Grafana dashboard provisioning config |
| etc/docker-multi-cn-local-disk/edit-config.sh | Adds interactive config.env editor and TOML regeneration |
| etc/docker-multi-cn-local-disk/docker-compose.yml | Adds full multi-service compose with optional monitoring profiles |
| etc/docker-multi-cn-local-disk/config.env.example | Adds example override file for dev configs |
| etc/docker-multi-cn-local-disk/COMMANDS.md | Documents dev make dev-* commands for the environment |
| cmd/mo-tool/main.go | Adds dashboard subcommand to mo-tool |
| cmd/mo-service/main_test.go | Drops legacy build tag line |
| cmd/mo-service/launch.go | Adjusts logging messages/keys |
| cmd/mo-service/debug.go | Adjusts profile logging and CPU profile behavior |
| cmd/mo-service/config.go | Adjusts log output to structured logging |
| cmd/mo-dashboard/main.go | Adds mo-tool dashboard command for Grafana dashboard management |
| cgo/lib.go | Adds -lm linkage for cgo build |
| cgo/Makefile | Adds CPU-specific -march=haswell for x86_64 builds |
| README_CN.md | Reworks CN README sections to include tutorials and dev env docs |
| README.md | Adds tutorials + install/deploy sections and dev env pointers |
| CODEOWNERS | Substantially changes ownership mappings |
| .golangci.yml | Adjusts depguard and staticcheck configuration |
| .github/workflows/robot.yml | Pins reusable workflow to release/3.0-dev |
| .github/workflows/release.yml | Pins reusable workflow to release/3.0-dev |
| .github/workflows/merge-update-moc.yaml | Pins reusable workflow to release/3.0-dev |
| .github/workflows/merge-trigger-tke.yaml | Pins reusable workflow to release/3.0-dev |
| .github/workflows/merge-trigger-standalone.yaml | Pins reusable workflow to release/3.0-dev |
| .github/workflows/image-build.yml | Pins reusable workflow to release/3.0-dev |
| .github/workflows/entrypoint.yaml | Pins actions/workflows to release/3.0-dev |
| .dockerignore | Reworks ignore rules for build context |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The new DropDatabase regression test exercises the 3.0-dev cleanup path far enough to hit TxnOperator.SnapshotTS() and TxnOperator.TxnRef(). The shared compile test txn helpers only mocked Snapshot(), which made CI fail before the regression test could reach its intended assertion. Add SnapshotTS() and TxnRef() expectations to the shared compile txn helpers so helper-based tests can cover 3.0-dev DropDatabase behavior without unexpected mock failures. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@mergify requeue |
|
|
@mergify requeue |
Merge Queue Status
This pull request spent 9 seconds in the queue, including 1 second running CI. Required conditions to merge
|
What type of PR is this?
Which issue(s) does this PR fix?
Fixes #23985
What this PR does / why we need it:
DROP DATABASEcould panic with:when table constraint metadata still referenced hidden index tables that
were missing from the current relation list. This panic also surfaced
through
DROP ACCOUNT, because tenant cleanup executes internalDROP DATABASE IF EXISTS ...statements.Root Cause
Scope.DropDatabasecollected hidden index table names intoignoreTablesand then preallocateddeleteTableswith:If the catalog was already partially inconsistent (for example, hidden
index table metadata was duplicated or dangling),
len(ignoreTables)could exceed
len(relations). That made the capacity negative andcrashed the process instead of letting cleanup continue.
Fix
relationsagainst that set instead of using nested slice scansthat are no longer present in the database relation list
DROP DATABASEcleanup path instead of panickingTest
Added a regression test:
TestScope_DropDatabase_ToleratesDanglingHiddenIndexMetadataThe test reproduces the inconsistent metadata shape that previously
triggered the panic and verifies that
DropDatabaseno longer crashes.What testing was done