[FEATURE] [k-NN] Lucene Engine with SIMD support #1062
Comments
I have done more testing of the k-NN feature on different platforms and datasets. SIMD significantly improves both indexing and query latency across various data dimensions and sizes.
Cluster configuration: 3 leader nodes (c5.xlarge), 1 data node (r5.8xlarge; r6g.8xlarge for ARM), 16 shards
Data: 128 dimensions, 62.5M
Query latency
Time taken to force merge after indexing
Data: 768 dimensions, 10M
Query latency
|
@heemin32 Thanks for this analysis! Do you by any chance have the merge time for the second, 768-dimension dataset? Also, if you have some code that lets us replicate these benchmarks, that would be really helpful! |
For 768d, I didn't trigger a force merge manually, so I don't have the same data as for 128d. However, here are the merge-related metrics from the benchmark test itself. I think I used the https://github.com/opensearch-project/k-NN/tree/main/benchmarks/osb tool. The cluster settings are as follows.
| Category | Setting | Value |
| --- | --- | --- |
| Cluster Configuration | OS Version | 2.9 |
| Cluster Settings | Index thread qty | 1 |
| Index Settings | refresh interval | 60 |
| Data set | name | BIGANN |
| Benchmark client (1 per cluster) | Machine type | c5.4xlarge |
| Indexing workload | num_segments to force merge to | 1 |
| Search workload | queries per client | 100,000 |
Result
lucene
lucene-simd
faiss
nmslib
lucene-arm
lucene-arm-simd
|
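For anyone trying to reproduce the setup, the settings in the table above might map onto OpenSearch request bodies roughly as in this minimal sketch. Several details are assumptions rather than things stated in this thread: the field name `target_field` and index name `target_index` are hypothetical, the `hnsw` method and `l2` space type are guesses, the refresh interval of 60 is read as seconds, and "Index thread qty" is assumed to mean the `knn.algo_param.index_thread_qty` cluster setting. The 16-shard count comes from the 128-dimension test above. The code only builds the request bodies and does not contact a cluster.

```python
import json


def build_knn_index_body(dimension, shards=16, refresh_interval="60s", engine="lucene"):
    """Build a create-index body for a k-NN vector field.

    Assumptions (not confirmed in the thread): field name, hnsw method,
    l2 space type, and the refresh interval being in seconds.
    """
    return {
        "settings": {
            "index": {
                "knn": True,
                "number_of_shards": shards,
                "refresh_interval": refresh_interval,
            }
        },
        "mappings": {
            "properties": {
                "target_field": {  # hypothetical field name
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",      # assumed algorithm
                        "engine": engine,     # lucene, faiss, or nmslib
                        "space_type": "l2",  # assumed distance
                    },
                }
            }
        },
    }


def build_cluster_settings_body(index_thread_qty=1):
    """Build a cluster settings body for the k-NN index thread quantity."""
    return {"persistent": {"knn.algo_param.index_thread_qty": index_thread_qty}}


def force_merge_path(index_name, max_num_segments=1):
    """Return the force merge request path for the given index."""
    return f"/{index_name}/_forcemerge?max_num_segments={max_num_segments}"


if __name__ == "__main__":
    print(json.dumps(build_knn_index_body(128), indent=2))
    print(json.dumps(build_cluster_settings_body(), indent=2))
    print("POST " + force_merge_path("target_index"))
```

Switching `engine` between `lucene`, `faiss`, and `nmslib` would cover the engines compared in the results. Whether SIMD is active for the Lucene engine depends on the JVM and OpenSearch build, not on these request bodies.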
Thanks so much for the detailed response, this is very helpful! |
The update to JDK 21 was completed in OpenSearch 2.12. opensearch-project/OpenSearch#11003 |
Is your feature request related to a problem?
Related to opensearch-project/OpenSearch#9423