Optimize local fs listing of transactions #3904

@wjones127

In #3882 we are making conflict resolution a little slower for local filesystems. This is because we use ObjectStore's list to list all the manifests. On a local filesystem, that implementation calls stat on every entry, even for file paths we don't care about.

Instead, we can use a strategy similar to the one we already have in:

https://github.com/lancedb/lance/blob/55c86f9b60774a4d36172430c1a9b033d9c5dba9/rust/lance-table/src/io/commit.rs#L333

  1. Create an iterator of ManifestLocation from the readdir results alone (these will be missing e_tag and size).
  2. Collect only the entries we care about.
  3. Then call stat on each of those in parallel to fill in e_tag and size.
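
To make the three steps concrete, here is a minimal std-only sketch. It is not the code in commit.rs: the `Location` struct, the `<version>.manifest` naming scheme, the `min_version` filter, and both function names are assumptions made for illustration. It fills in only `size` and leaves `e_tag` out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical stand-in for ManifestLocation; fields are illustrative only.
#[derive(Debug)]
struct Location {
    path: PathBuf,
    version: u64,
    /// Unknown until the file is stat'ed.
    size: Option<u64>,
}

/// Steps 1 and 2: build locations from readdir results alone and keep only
/// versions >= min_version. No per-file stat call is made here.
fn list_candidates(dir: &Path, min_version: u64) -> io::Result<Vec<Location>> {
    let mut out = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Assumed naming scheme: "<version>.manifest".
        let version = path
            .file_name()
            .and_then(|n| n.to_str())
            .and_then(|n| n.strip_suffix(".manifest"))
            .and_then(|v| v.parse::<u64>().ok());
        if let Some(version) = version {
            if version >= min_version {
                out.push(Location { path, version, size: None });
            }
        }
    }
    Ok(out)
}

/// Step 3: stat only the surviving entries, one scoped thread per entry.
fn fill_sizes(locations: &mut [Location]) -> io::Result<()> {
    let results: Vec<io::Result<()>> = thread::scope(|s| {
        // Spawn every stat first so they run concurrently, then join.
        let handles: Vec<_> = locations
            .iter_mut()
            .map(|loc| {
                s.spawn(move || -> io::Result<()> {
                    loc.size = Some(fs::metadata(&loc.path)?.len());
                    Ok(())
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("stat thread panicked"))
            .collect()
    });
    for r in results {
        r?;
    }
    Ok(())
}
```

One thread per entry keeps the sketch short. A real implementation would more likely bound the concurrency and run inside the existing async runtime.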
