
Add file list to index metadata #5226

@wjones127

Have index plugins return a list of files (and their sizes) in the index, which can be saved in IndexMetadata.

The benefits of this are:

  • Cold read performance: the index reader can skip the initial HEAD call to find the footer, if we store the size of the index file.
  • Makes index size stats available: We could report the on-disk size of the index in statistics (describe_indices), without having to open the index.
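A minimal sketch of what the recorded file list could look like, and how both benefits would fall out of it. The names `IndexFile`, `files`, `total_size_bytes`, `file_size`, and `tail_range` are hypothetical and chosen for illustration; they are not the existing Lance API, and the real `IndexMetadata` has other fields not shown here.

```rust
/// Hypothetical entry describing one file that belongs to an index.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndexFile {
    /// Path relative to the index directory.
    pub path: String,
    /// Size in bytes, as reported by the index plugin when the file was written.
    pub size_bytes: u64,
}

/// Hypothetical slice of the index metadata. `files` is optional so that
/// metadata written before the field existed can still be represented.
#[derive(Debug, Clone, Default)]
pub struct IndexMetadata {
    pub uuid: String,
    pub files: Option<Vec<IndexFile>>,
}

impl IndexMetadata {
    /// Total on-disk size of the index, computed without opening it.
    /// Returns `None` when no file list was recorded.
    pub fn total_size_bytes(&self) -> Option<u64> {
        self.files
            .as_ref()
            .map(|files| files.iter().map(|f| f.size_bytes).sum())
    }

    /// Recorded size of a single file, if it is in the list.
    pub fn file_size(&self, path: &str) -> Option<u64> {
        self.files
            .as_ref()?
            .iter()
            .find(|f| f.path == path)
            .map(|f| f.size_bytes)
    }
}

/// Byte range covering the last `tail_len` bytes of a file whose size is
/// already known. With the size in hand, a reader can issue a ranged read
/// for the footer directly instead of first asking the store for the size.
pub fn tail_range(file_size: u64, tail_len: u64) -> std::ops::Range<u64> {
    file_size.saturating_sub(tail_len)..file_size
}
```

With something like this, a statistics call could report `total_size_bytes()` straight from the manifest, and a reader could pass `tail_range(size, footer_len)` to a ranged read. Keeping `files` optional is one possible way to stay compatible with indices written before the field existed; in that case the reader would fall back to the current HEAD-based path.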

Cleanup currently uses only the _index prefix and the UUID to determine which files to clean up. That means any temporary files written there will be ignored. If this field is available, we might want to use it for cleanup as well.

https://github.com/lancedb/lance/blob/3940f8401dfbb4b3c0ecd567978d187605726e0c/rust/lance/src/dataset/cleanup.rs#L341-L365
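As a rough illustration of how a recorded file list could inform cleanup, the sketch below splits the paths found under an index directory into those the metadata knows about and those it does not (for example, leftover temporary files). The function name `partition_listed_files` and the example file names are hypothetical; what cleanup should actually do with untracked files is left open here, as it is in the issue.

```rust
use std::collections::HashSet;

/// Split the paths listed under an index directory into
/// `(tracked, untracked)`, where "tracked" means the path appears in the
/// file list recorded in the index metadata.
pub fn partition_listed_files(
    listed: &[String],
    recorded: &[String],
) -> (Vec<String>, Vec<String>) {
    // Build a set of recorded paths for O(1) membership checks.
    let known: HashSet<&str> = recorded.iter().map(String::as_str).collect();
    listed
        .iter()
        .cloned()
        .partition(|path| known.contains(path.as_str()))
}
```

For a directory listing of `index.idx` and `tmp-123.partial` with only `index.idx` recorded, the first vector would hold `index.idx` and the second `tmp-123.partial`.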

Labels: enhancement
