Support backfilling of secondary index #448

@robertpang

Jira Link: DB-2465
When a new index is added to a table that already has data, this feature builds the index online while the table continues to serve other traffic. It should work across both the YSQL and YCQL APIs, and it should support:

  • Online builds: Support building the indexes without locking out reads or writes on the table. The index build itself will occur asynchronously.
  • Correctness: After the index builds are completed, they should be consistent with the data in the primary table.
  • Constraint violations: If a problem arises while scanning the table, such as a unique constraint violation in a unique index, the CREATE INDEX command should abort and fail. The aborted index will be cleaned up and deleted. Details (such as which constraints were violated) will be written to the logs.
  • Efficient for large datasets: The index build should run in a distributed manner (using multiple or all nodes in the cluster) to handle large datasets efficiently.
  • Resilience: The index build should be resilient to failures. The entire build process should not need to restart on a node failure in the cluster.
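
The requirements above can be illustrated with a toy, in-memory sketch. It is not YugabyteDB's implementation: the names `BackfillState`, `backfill_tablet`, and `build_index` are hypothetical, tablets are modelled as plain dictionaries, and the per-tablet checkpoint is only one assumed way to avoid restarting the whole build after a failure.

```python
from dataclasses import dataclass, field


class UniqueViolation(Exception):
    """Raised when two different rows map to the same key in a unique index."""


@dataclass
class BackfillState:
    # Hypothetical checkpoint: last primary key processed, per tablet.
    checkpoints: dict = field(default_factory=dict)
    # Index under construction: indexed value -> set of primary keys.
    index: dict = field(default_factory=dict)


def backfill_tablet(tablet_id, rows, column, state, unique=False, batch_size=2):
    """Backfill one tablet in primary-key order, resuming from its checkpoint."""
    last_done = state.checkpoints.get(tablet_id)
    pending = [(pk, row) for pk, row in sorted(rows.items())
               if last_done is None or pk > last_done]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        for pk, row in batch:
            owners = state.index.setdefault(row[column], set())
            # Re-processing the same row is harmless; a different row
            # with the same value violates a unique index.
            if unique and owners and pk not in owners:
                raise UniqueViolation(f"duplicate value {row[column]!r}")
            owners.add(pk)
        # Record progress after each batch so a retry skips completed rows.
        state.checkpoints[tablet_id] = batch[-1][0]


def build_index(tablets, column, unique=False):
    """Backfill every tablet; return the index, or None if the build aborts."""
    state = BackfillState()
    try:
        # Tablets are independent, so a real system could process them
        # in parallel on different nodes; here they run sequentially.
        for tablet_id, rows in tablets.items():
            backfill_tablet(tablet_id, rows, column, state, unique)
    except UniqueViolation:
        return None  # aborted build: the partial index is discarded
    return state.index


tablets = {
    "t1": {1: {"email": "a"}, 2: {"email": "b"}},
    "t2": {3: {"email": "a"}},
}
non_unique = build_index(tablets, "email")
unique_attempt = build_index(tablets, "email", unique=True)
```

In this sketch, `non_unique` maps each email to the primary keys that hold it, while `unique_attempt` is `None` because rows 1 and 3 share a value. Calling `backfill_tablet` again with a `BackfillState` that already holds a checkpoint for a tablet processes only the rows after that checkpoint.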

Prerequisites

Status Feature
Design doc: https://github.com/yugabyte/yugabyte-db/blob/master/architecture/design/online-index-backfill.md
Basic online schema change framework
Multi phase alter table to support online index creation in YB-Master

Phase 1 - simple index backfill

Status Feature
YCQL: Backfill for YCQL indexes (non-unique)
YSQL: Backfill for YSQL indexes (including expression and partial indexes) - IN PROGRESS (target v2.2) #2301

Phase 2 - manageability features

Status Feature
Ability to view background index backfill tasks #3668
Expose backfill metrics (writes/sec being performed on index table, rows/sec being processed from primary table, size of index table, etc)
Ability to throttle the rate of backfill

Phase 3 - constraints and unique indexes

Status Feature
YCQL: Handle unique indexes
YSQL: Handle unique indexes
Backfill should not block on very long running, pending transactions #3471
Handle master failures during the backfill by saving read time #3611

Phase 4 - Other misc improvements

Status Feature
YSQL backfill pagination for large tablets #5326
YSQL backfill throttling #7889
Perf improvements #2615
⬜️ Batch multiple index rebuilds on the same table
⬜️ Enhance YSQL grammar to support a simple, language-level paradigm to view backfill tasks

Labels

  • area/docdb: YugabyteDB core features
  • kind/enhancement: This is an enhancement of an existing feature
  • priority/medium: Medium priority issue
  • roadmap-tracking-issue: This issue tracks a major roadmap item, and usually appears in the roadmap list.

Projects

Status

Done
