Skip to content

Don't add committed rows to freshly created indexes #1525

@kazimuth

Description

@kazimuth

The FIXME here:

// NOTE: Also add all the rows in the already committed table to the index.
// FIXME: Is this correct? Index scan iterators (incl. the existing `Locking` versions)
// appear to assume that a table's index refers only to rows within that table,
// and does not handle the case where a `TxState` index refers to `CommittedState` rows.
if let Some(committed_table) = commit_table {
insert_index.build_from_rows(&index.columns, committed_table.scan_rows(commit_blob_store))?;
}

is a problem.

This came up while trying to implement #1523. 1523 is just a convenience, so this isn't a huge deal that it's blocked. However, this is going to be a much bigger problem for implementing spacetime migrate.

The issue is that, when we create an index during the transaction, we populate the index with rows in the committed state, as well as rows in the transaction state. This makes it impossible to read the row corresponding to an index entry in general! The row could be in one of two places, the committed state, or the transaction state. But this bit is not stored anywhere.

This is fixed once the transaction is committed -- the transaction state is merged into the committed state and the index is repopulated:

// Add all newly created indexes to the committed state.
for (cols, mut index) in tx_table.indexes {
if !commit_table.indexes.contains_key(&cols) {
index.clear();
commit_table.insert_index(commit_blob_store, cols, index);
}
}

It's only an issue when performing queries during a transaction that added an index. This will mainly come up during migrations.

cc @gefjon

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions