[ENH]: More precise calculation of hnsw cache item weight #4657

sanketkedia · 2025-05-28T21:18:15Z

Description of changes

Summarize the changes made by this PR.

Improvements & Bug fixes
- We currently underestimate the weight of an HNSW cache item. This PR makes the estimate more accurate by accounting for graph nodes and edges, as well as levels other than L0.
New functionality
- ...

Test plan

How are these changes tested?

Tests pass locally with `pytest` for Python, `yarn test` for JS, and `cargo test` for Rust.

Documentation Changes

None

github-actions · 2025-05-28T21:18:24Z

sanketkedia · 2025-05-28T21:18:31Z

[ENH]: More precise calculation of hnsw cache item weight #4657 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

propel-code-bot · 2025-05-28T21:22:24Z

This PR updates the weight estimation logic for HnswIndexRef cache items to more accurately account for the memory footprint, incorporating not just the L0 embeddings but also graph nodes, edges, and higher HNSW levels. The approach introduces a constant for the M parameter (node connections) and doubles the size calculation to approximate additional HNSW layers.

This summary was automatically generated by @propel-code-bot
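
To get a feel for how the two terms in this summary compare, here is a back-of-the-envelope ratio. It assumes the graph term costs M 4-byte neighbor ids per element and the embedding term costs d 4-byte floats per element, with N elements in total. The values M = 16 and d = 768 are illustrative assumptions only; the PR thread does not state them.

```latex
\frac{\text{graph bytes}}{\text{embedding bytes}}
  = \frac{M \cdot 4 \cdot N}{N \cdot 4 \cdot d}
  = \frac{M}{d}
  = \frac{16}{768}
  \approx 0.021
```

Under these assumptions the graph term adds roughly 2% on top of the embeddings, so the doubling applied for higher levels dominates the total estimate.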

rust/index/src/hnsw_provider.rs

entelligence-ai-reviews · 2025-05-29T18:34:00Z

Walkthrough

This update refines the memory usage estimation logic in the weight method of the HnswIndexRef implementation. The new approach accounts for both the L0 graph structure and the embeddings, using a more comprehensive calculation that includes deleted elements and multiplies the total by 2 to estimate higher graph levels. This results in more accurate cache weight calculations for HNSW indices, improving resource management.

Changes

| File(s) | Summary |
| --- | --- |
| rust/index/src/hnsw_provider.rs | Updated the `weight` method for `HnswIndexRef` to improve memory usage estimation by considering both the L0 graph and the embeddings, using `index.len_with_deleted()` for the element count, and multiplying by 2 to account for higher graph levels. |

Sequence Diagram

This diagram shows the interactions between components:

sequenceDiagram
    title HnswIndexRef Weight Calculation Flow
    
    participant Client
    participant HnswIndexRef
    participant IndexInner
    
    Client->>HnswIndexRef: weight()
    activate HnswIndexRef
    
    HnswIndexRef->>IndexInner: read()
    activate IndexInner
    IndexInner-->>HnswIndexRef: index (RwLockReadGuard)
    deactivate IndexInner
    
    Note over HnswIndexRef: Check if index is empty
    HnswIndexRef->>IndexInner: len_with_deleted()
    activate IndexInner
    IndexInner-->>HnswIndexRef: count (including deleted items)
    deactivate IndexInner
    
    alt index is empty (len_with_deleted() == 0)
        HnswIndexRef-->>Client: return 1
    else index has elements
        HnswIndexRef->>IndexInner: len_with_deleted()
        activate IndexInner
        IndexInner-->>HnswIndexRef: count (including deleted items)
        deactivate IndexInner
        
        HnswIndexRef->>IndexInner: dimensionality()
        activate IndexInner
        IndexInner-->>HnswIndexRef: dimensions
        deactivate IndexInner
        
        Note over HnswIndexRef: Calculate memory usage
        Note over HnswIndexRef: 1. Graph bytes = M * sizeof(u32) * len_with_deleted()
        Note over HnswIndexRef: 2. Embedding bytes = len_with_deleted() * sizeof(f32) * dimensionality
        Note over HnswIndexRef: 3. Total bytes = 2 * (graph_bytes + embedding_bytes)
        Note over HnswIndexRef: 4. Convert to MB
        
        alt as_mb == 0
            HnswIndexRef-->>Client: return 1
        else as_mb > 0
            HnswIndexRef-->>Client: return as_mb
        end
    end
    
    deactivate HnswIndexRef
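
The flow in the diagram can be sketched in Rust as a free function. This is a minimal sketch, not the code from the PR: `estimate_weight_mb`, `ASSUMED_M`, and `BYTES_PER_MB` are hypothetical names, the value 16 for M is an assumption, and so is the 1024 * 1024 divisor for the MB conversion, since the thread states neither.

```rust
use std::mem::size_of;

/// Assumed HNSW `M` parameter (max connections per node).
/// The actual constant introduced by the PR is not shown in this thread.
const ASSUMED_M: usize = 16;

/// Assumed bytes-per-MB divisor; the thread only says "Convert to MB".
const BYTES_PER_MB: usize = 1024 * 1024;

/// Estimates the cache weight (in MB) of an HNSW index from its element
/// count (including deleted elements) and its embedding dimensionality.
fn estimate_weight_mb(len_with_deleted: usize, dimensionality: usize) -> usize {
    // An empty index still gets a minimum weight of 1.
    if len_with_deleted == 0 {
        return 1;
    }

    // 1. L0 graph: M neighbor ids (u32) per element.
    let graph_bytes = ASSUMED_M * size_of::<u32>() * len_with_deleted;

    // 2. Embeddings: one f32 per dimension per element.
    let embedding_bytes = len_with_deleted * size_of::<f32>() * dimensionality;

    // 3. Double the total to approximate the higher HNSW levels.
    let total_bytes = 2 * (graph_bytes + embedding_bytes);

    // 4. Convert to MB, never returning a weight below 1.
    let as_mb = total_bytes / BYTES_PER_MB;
    as_mb.max(1)
}
```

Under these assumptions, `estimate_weight_mb(1_000_000, 768)` evaluates to 5981, while an empty index or one too small to reach 1 MB evaluates to 1.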

[ENH]: More precise calculation of hnsw cache item weight

a336b28

sanketkedia requested review from HammadB and Sicheng-Pan May 28, 2025 21:22

sanketkedia marked this pull request as ready for review May 28, 2025 21:22

Sicheng-Pan approved these changes May 28, 2025

HammadB reviewed May 28, 2025

rust/index/src/hnsw_provider.rs (outdated, resolved)

Review comment

17d70ee
