[Proposal][C++]: Add LRU chunk cache to Arrow chunk readers to avoid redundant file I/O #860



## Background

<img width="750" height="335" alt="Image" src="https://github.com/user-attachments/assets/51fdcba8-f3de-47ae-bf6f-0ab0ec7a8e6f" />

Currently, all Arrow chunk readers (`VertexPropertyArrowChunkReader`, `AdjListArrowChunkReader`, `AdjListOffsetArrowChunkReader`, `AdjListPropertyArrowChunkReader`) discard the loaded `chunk_table_` every time the chunk position changes via `seek()`, `next_chunk()`, or `seek_chunk_index()`. This means that if a user seeks back to a previously loaded chunk, the entire Parquet file must be re-opened, its metadata parsed, and its data decoded again, even though the data hasn't changed.

This is particularly costly in graph traversal workloads (BFS, PageRank, label filtering) where vertex/edge access patterns exhibit strong locality, causing the same chunks to be read repeatedly.

## Proposal

Introduce a generic `LruCache<Key, Value>` and integrate it into all four chunk reader classes. When a chunk is loaded from disk, it is stored in the cache. On subsequent seeks to the same chunk, the cached `arrow::Table` is returned directly, avoiding file I/O entirely.
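
To make the idea concrete, here is a minimal sketch of what such a cache could look like. Everything in it is an assumption for illustration: the member names (`Get`, `Put`, `Size`), the capacity handling, and the `CountLoads` helper are hypothetical and not part of the existing code base. In the readers, `Key` would presumably be the chunk index and `Value` a `std::shared_ptr<arrow::Table>`; a `std::string` stands in here so the example compiles without Arrow.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch of a generic LRU cache; names and API are assumptions.
template <typename Key, typename Value>
class LruCache {
 public:
  explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

  // Returns a copy of the cached value and marks the entry as most recently
  // used, or std::nullopt on a miss.
  std::optional<Value> Get(const Key& key) {
    auto it = index_.find(key);
    if (it == index_.end()) return std::nullopt;
    entries_.splice(entries_.begin(), entries_, it->second);
    return it->second->second;
  }

  // Inserts or updates an entry, evicting the least recently used one
  // when the cache is full. A capacity of 0 disables caching.
  void Put(const Key& key, Value value) {
    if (capacity_ == 0) return;
    auto it = index_.find(key);
    if (it != index_.end()) {
      it->second->second = std::move(value);
      entries_.splice(entries_.begin(), entries_, it->second);
      return;
    }
    if (entries_.size() >= capacity_) {
      index_.erase(entries_.back().first);
      entries_.pop_back();
    }
    entries_.emplace_front(key, std::move(value));
    index_[key] = entries_.begin();
  }

  std::size_t Size() const { return entries_.size(); }

 private:
  using Entry = std::pair<Key, Value>;
  std::size_t capacity_;
  std::list<Entry> entries_;  // front = most recently used
  std::unordered_map<Key, typename std::list<Entry>::iterator> index_;
};

// Hypothetical helper: counts how many simulated disk loads a sequence of
// chunk seeks would trigger with a cache of the given capacity.
inline int CountLoads(const std::vector<int>& chunk_indices,
                      std::size_t capacity) {
  LruCache<int, std::string> cache(capacity);
  int loads = 0;
  for (int idx : chunk_indices) {
    if (!cache.Get(idx)) {
      ++loads;  // stand-in for opening and decoding the chunk file
      cache.Put(idx, "chunk-" + std::to_string(idx));
    }
  }
  return loads;
}
```

With shared pointers as values, a hit only copies a pointer, so the cached table is not duplicated. Thread safety and the choice of default capacity are left open in this sketch.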




### Component(s)

C++
