Fix memory overflow caused by cache behavior #2015

Merged 1 commit into opensearch-project:main on Sep 4, 2024

Conversation

kotwanikunal
Member

Description

  • Adds a force-evict mechanism that runs before an entry is added, to prevent native memory overflow within the cache
  • This solution works in tandem with jemalloc enabled. With the default malloc, memory free-up operations can be delayed, leading to similar overflow scenarios
  • The solution has been tested and benchmarked; results are described in the Testing section below

Problem

  • The problem stems from the cache's behavior: entries might not be cleaned up as soon as they become evictable
  • This delay between a new entry and cleanup can, in our case, cause memory to overflow, leading to a node failure
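The overflow window described above can be sketched in a few lines. This is a hypothetical, simplified cache (not the plugin's actual implementation): it inserts the new entry first and only then runs a lazy maintenance pass, so measured usage can briefly exceed the limit between the two steps.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: a size-bounded cache whose eviction runs lazily,
// after the new entry has already been inserted. Between insertion and the
// cleanup pass, total usage can exceed the configured limit.
public final class LazyEvictionCache {
    private final long limitBytes;
    private long usedBytes = 0;
    // key -> entry size, in insertion order (oldest first)
    private final Map<String, Long> entries = new LinkedHashMap<>();

    public LazyEvictionCache(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Inserts first, evicts later; returns the peak usage observed before cleanup. */
    public long put(String key, long sizeBytes) {
        entries.put(key, sizeBytes);
        usedBytes += sizeBytes;
        long peak = usedBytes;   // this is the overflow window
        cleanUp();               // lazy maintenance pass
        return peak;
    }

    private void cleanUp() {
        Iterator<Map.Entry<String, Long>> it = entries.entrySet().iterator();
        while (usedBytes > limitBytes && it.hasNext()) {
            usedBytes -= it.next().getValue();
            it.remove();
        }
    }

    public long usedBytes() {
        return usedBytes;
    }
}
```

With heap memory the window is usually harmless; with large native allocations (graph files), the peak itself can exceed physical memory and take the node down before cleanup runs.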

Solution

  • This solution maintains an additional recency list within the cache manager
  • On every get call:
    • on a cache hit, the key is marked as recently used and the value is returned
    • on a cache miss, the key is added to the recency list, the item is loaded into the cache, and the value is returned
    • on a cache miss with a potential overflow, the eviction process kicks off before the item is loaded into the cache
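The get path above can be sketched as follows. This is a minimal illustration under assumed names (`RecencyCache`, `loader`, sizes passed explicitly) — not the plugin's actual API; the key point is that forced eviction happens before the load, closing the overflow window.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the described get path: a recency list tracks key
// order, and a forced eviction pass runs before loading whenever the new
// entry would push usage over the limit.
public final class RecencyCache {
    private final long limitBytes;
    private long usedBytes = 0;
    private final Map<String, Long> sizes = new HashMap<>();
    private final Map<String, Object> values = new HashMap<>();
    private final Deque<String> recency = new ArrayDeque<>(); // least recent first

    public RecencyCache(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    public Object get(String key, long sizeBytes, Function<String, Object> loader) {
        if (values.containsKey(key)) {
            // cache hit: mark the key as recent and return the value
            recency.remove(key);
            recency.addLast(key);
            return values.get(key);
        }
        // cache miss with a potential overflow: force-evict BEFORE loading
        while (usedBytes + sizeBytes > limitBytes && !recency.isEmpty()) {
            String victim = recency.pollFirst();
            usedBytes -= sizes.remove(victim);
            values.remove(victim);
        }
        // cache miss: record recency, load the item, and return it
        recency.addLast(key);
        sizes.put(key, sizeBytes);
        usedBytes += sizeBytes;
        Object value = loader.apply(key);
        values.put(key, value);
        return value;
    }

    public long usedBytes() {
        return usedBytes;
    }
}
```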

Testing

  1. Case: Enough memory available
  • Scenario: Indexed 1M 768-dimensional vectors on a 3-node cluster with 16 GB memory (8 GB heap, 8 GB for graph files); ran an OSB query workload with and without the fix.
  • TL;DR: Little deviation from the current code in terms of raw numbers. Results below.

| Index Type | Dimension | Number of Vectors | Min Throughput | Mean Throughput | Median Throughput | Max Throughput | p50 Latency | p90 Latency | p99 Latency | p100 Latency | p50 Service Time | p90 Service Time | p99 Service Time | p100 Service Time | Mean recall@k | Mean recall@1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Current Code | 768 | 1,000,000 | 2.46 | 2.68 | 2.69 | 2.74 | 358.268 | 382.036 | 414.636 | 482.332 | 358.268 | 382.036 | 414.636 | 482.332 | 0.98 | 1 |
| Code with Fix | 768 | 1,000,000 | 2.55 | 2.71 | 2.7 | 2.75 | 363.62 | 398.338 | 432.811 | 447.278 | 363.62 | 398.338 | 432.811 | 447.278 | 0.98 | 1 |
  2. Case: Heavy query load
  • Scenario: Created 3 indexes (2 FAISS, 1 NMSLIB), indexed 1M 768-dimensional vectors on a 3-node cluster with 16 GB memory (8 GB heap, 8 GB for graph files), and ran 3 parallel query workloads against all the indexes at once.
  • TL;DR: Without the fix, node crashes were observed. With the fix, there was an uptick in errors due to timeouts and latency increased, but no node crash was observed.

| Index Type | Dimension | Number of Vectors | Min Throughput | Mean Throughput | Median Throughput | Max Throughput | p50 Latency | p90 Latency | p99 Latency | p100 Latency | p50 Service Time | p90 Service Time | p99 Service Time | p100 Service Time | Error Rate (%) | Mean recall@k | Mean recall@1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Index1 with fix | 768 | 1,000,000 | 0.05 | 0.06 | 0.06 | 0.09 | 13228.9 | 19363.3 | 24602.8 | 35710.1 | 13228.9 | 19363.3 | 24602.8 | 35710.1 | 11 | 0.99 | 1 |
| Index2 with fix | 768 | 1,000,000 | 0.1 | 0.25 | 0.14 | 1.02 | 11728.5 | 15634.2 | 23228.2 | 34283.1 | 11728.5 | 15634.2 | 23228.2 | 34283.1 | 0 | 0.98 | 1 |
| Index3 with fix | 768 | 1,000,000 | 0.06 | 0.07 | 0.07 | 0.09 | 12575 | 17641.8 | 25928.4 | 27722.8 | 12575 | 17641.8 | 25928.4 | 27722.8 | 3 | 0.98 | 1 |

Related Issues

Partially resolves #1582 (Case 3)

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@navneet1v added the Enhancements and search-improvements labels Sep 3, 2024
@navneet1v navneet1v requested a review from heemin32 September 3, 2024 00:28
@kotwanikunal
Member Author

kotwanikunal commented Sep 3, 2024


Ran some tests for LinkedHashSet vs the current queue implementation. Results:

| Index Type | Dimension | Number of Vectors | Min Throughput | Mean Throughput | Median Throughput | Max Throughput | p50 Latency | p90 Latency | p99 Latency | p100 Latency | Error Rate (%) | Mean recall@k | Mean recall@1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Single Query Queue | 768 | 1,000,000 | 1.28 | 2.37 | 2.46 | 2.58 | 368.768 | 417.358 | 483.694 | 761.651 | 0 | 0.98 | 0.97 |
| Single Query Sync Set | 768 | 1,000,000 | 1.1 | 1.97 | 2.07 | 2.3 | 407.133 | 516.032 | 658.99 | 889.347 | 0 | 0.98 | 0.97 |
| Parallel Query Queue | 768 | 1,000,000 | 0.07 | 0.07 | 0.07 | 0.09 | 13465.6 | 18455.7 | 31858 | 33136.6 | 0 | 0.99 | 1 |
| Parallel Query Sync Set | 768 | 1,000,000 | 0.06 | 0.07 | 0.06 | 0.08 | 13937 | 18936.5 | 28190.5 | 28804.4 | 8 | 0.99 | 1 |
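The structural difference between the two benchmarked candidates can be illustrated with a small sketch. This is an assumed simplification of the trade-off (names like `RecencyTrackers` and `touch*` are illustrative): a `ConcurrentLinkedQueue` offers lock-free appends but tolerates duplicate entries, while a synchronized `LinkedHashSet` deduplicates but serializes every operation behind one lock.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative comparison of the two recency-tracking candidates.
public final class RecencyTrackers {
    // Candidate 1: lock-free queue. A key may appear more than once, so the
    // eviction path must tolerate (or skip) stale duplicate entries.
    public static Queue<String> queueTracker() {
        return new ConcurrentLinkedQueue<>();
    }

    // Candidate 2: insertion-ordered set behind a single lock. No duplicates,
    // but every touch contends on the same monitor.
    public static Set<String> syncSetTracker() {
        return Collections.synchronizedSet(new LinkedHashSet<>());
    }

    // Marking a key as recent: the queue just appends in O(1) without locking.
    public static void touchQueue(Queue<String> queue, String key) {
        queue.offer(key);
    }

    // The set must remove and re-add to move the key to the tail of the
    // iteration order; both operations take the shared lock.
    public static void touchSet(Set<String> set, String key) {
        set.remove(key);
        set.add(key);
    }
}
```

The lock-free append is consistent with the queue's better throughput and tail latency in the numbers above, at the cost of duplicate entries to skip during eviction.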

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Collaborator

@navneet1v navneet1v left a comment


Approving the PR. I understand that updating an element in the queue is not fully optimized, but since the solution provides more stability in memory-constrained situations, I am good with the approach.

I would suggest turning on the feature flag so that we can see its impact in the nightlies, but I will leave this up to the author.

Collaborator

@heemin32 heemin32 left a comment


Approving it, as this is behind a feature flag that is false by default. We might need a test with a higher number of segments.

@navneet1v navneet1v merged commit d4af93e into opensearch-project:main Sep 4, 2024
28 of 29 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

```shell
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-2015-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d4af93edfb45878fdb0a442f4ac07e091b5ad14b
# Push it to GitHub
git push --set-upstream origin backport/backport-2015-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x
```

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-2015-to-2.x.

kotwanikunal added a commit to kotwanikunal/k-NN that referenced this pull request Sep 4, 2024
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
(cherry picked from commit d4af93e)
@kotwanikunal
Member Author

Backport: #2032

naveentatikonda pushed a commit that referenced this pull request Sep 4, 2024
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
(cherry picked from commit d4af93e)
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 4, 2024
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
(cherry picked from commit d4af93e)
(cherry picked from commit 5858d1c)
navneet1v pushed a commit that referenced this pull request Sep 4, 2024
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
(cherry picked from commit d4af93e)
(cherry picked from commit 5858d1c)

Co-authored-by: Kunal Kotwani <kkotwani@amazon.com>
akashsha1 pushed a commit to akashsha1/k-NN that referenced this pull request Sep 16, 2024
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Signed-off-by: Akash Shankaran <akash.shankaran@intel.com>
Labels
backport 2.x, Enhancements, search-improvements, v2.17.0
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Investigate rearchitecture of the native memory circuit breaker
4 participants