Feature/scalar quantized off heap scoring #13497
…tized-off-heap-scoring
Half-byte is showing up as measurably slower with this change. Candidate:
baseline:
Full-byte is slightly faster. Candidate:
baseline:
Are you reporting indexing times or query times?
Query times, single segment, 10k docs of 1024 dims.
…tized-off-heap-scoring
Ok, I double-checked, and indeed, half-byte is way slower when reading directly from memory segments instead of reading on heap. The flamegraphs are wildly different: much more time is being spent reading from the memory segment and then comparing the vectors.
[Flamegraph image: baseline]
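For readers unfamiliar with why the half-byte case involves extra work per comparison, here is a minimal sketch of a dot product over nibble-packed vectors. The class name `HalfByteSketch` and the packing layout (element `i` in the high nibble, element `i + half` in the low nibble) are assumptions for illustration, not necessarily what this PR does, and the loop is plain scalar Java rather than the Panama Vector API.

```java
public class HalfByteSketch {
  // Pack 4-bit values (0..15) two per byte. Assumed layout: element i goes
  // in the high nibble, element i + half in the low nibble. Assumes an
  // even-length input.
  static byte[] pack(byte[] raw) {
    int half = raw.length / 2;
    byte[] packed = new byte[half];
    for (int i = 0; i < half; i++) {
      packed[i] = (byte) ((raw[i] << 4) | raw[half + i]);
    }
    return packed;
  }

  // Dot product of an unpacked query against a packed stored vector.
  // Every stored byte has to be split into two nibbles before multiplying,
  // which is extra work compared with the full-byte case.
  static int dotProduct(byte[] query, byte[] packed) {
    int sum = 0;
    for (int i = 0; i < packed.length; i++) {
      int hi = (packed[i] & 0xFF) >> 4; // element i
      int lo = packed[i] & 0x0F;        // element i + packed.length
      sum += query[i] * hi + query[packed.length + i] * lo;
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] stored = {1, 2, 3, 4};
    byte[] query = {5, 6, 7, 8};
    System.out.println(dotProduct(query, pack(stored)));
  }
}
```

The unpacking step is where a vectorized implementation has to do shifts and masks on whatever it loaded, so the cost of the load itself (heap array versus memory segment) shows up directly in the hot loop.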
@ChrisHegarty have you seen a significant performance regression with MemorySegments on JDK22? I updated my performance testing for this PR to use JDK22, and now it is WAY slower, more than 2x slower, even for full-byte. For int7, this branch is marginally faster (20%) with JDK21, but basically 2x slower on JDK22. I wonder if our off-heap scoring for …
To verify it wasn't some weird artifact in my code, I changed it slightly so that my execution path always reads the vectors on-heap and then wraps them in a MemorySegment. Now JDK22 performs the same as JDK21 and the current baseline. It's weird to me that reading from a memory segment into ByteVector objects would be 2x slower on JDK22 than on 21. Regardless, it's already much slower for the int4 case on both JDK 21 and 22.
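To make the two execution paths being compared concrete, here is a rough sketch. It is not the code in this branch: it uses a direct `java.nio.ByteBuffer` as a stand-in for the off-heap memory segment and a scalar loop instead of `ByteVector`, and the class and method names (`ReadPathSketch`, `scoreOffHeap`, `scoreViaHeapCopy`) are made up for illustration.

```java
import java.nio.ByteBuffer;

public class ReadPathSketch {
  // Plain dot product over two on-heap byte arrays.
  static int dotProduct(byte[] a, byte[] b) {
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  // Path 1: score directly against off-heap storage, one element at a time.
  static int scoreOffHeap(byte[] query, ByteBuffer stored, int offset) {
    int sum = 0;
    for (int i = 0; i < query.length; i++) {
      sum += query[i] * stored.get(offset + i);
    }
    return sum;
  }

  // Path 2: bulk-copy the stored vector into a reusable on-heap scratch
  // array first, then score on heap.
  static int scoreViaHeapCopy(byte[] query, ByteBuffer stored, int offset, byte[] scratch) {
    stored.get(offset, scratch, 0, query.length); // absolute bulk get, Java 13+
    return dotProduct(query, scratch);
  }

  public static void main(String[] args) {
    int dims = 4;
    // Two stored vectors laid out back to back in off-heap memory.
    ByteBuffer stored = ByteBuffer.allocateDirect(2 * dims);
    stored.put(new byte[] {1, 1, 1, 1, 5, 6, 7, 8});
    byte[] query = {1, 2, 3, 4};

    // Both paths must agree; only the read strategy differs.
    System.out.println(scoreOffHeap(query, stored, dims));
    System.out.println(scoreViaHeapCopy(query, stored, dims, new byte[dims]));
  }
}
```

Both paths return the same score, so any difference between them in a benchmark comes from how the stored bytes are read, which is what the JDK21 versus JDK22 comparison above is isolating.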
@benwtrent I was not aware, lemme take a look.
…tized-off-heap-scoring
@kaivalnp feel free to take my initial work here and dig in deeper. I haven't benchmarked it recently on later JVMs to figure out why I was seeing such a weird slowdown when going off heap :/
Thanks @benwtrent! I opened #14863
This adds off-heap scoring for our scalar quantization.
Opening as DRAFT as I still haven't fully tested out the performance characteristics. Opening early for discussion.