Improvements to raft-ann-bench scripts, docs, and benchmarking implementations. #1769

Merged
merged 69 commits into from
Aug 30, 2023
bd738ec
ANN-benchmarks: switch to use gbench
achirkin Aug 9, 2023
7473c62
Disable NVTX if the nvtx3 headers are missing
achirkin Aug 9, 2023
aa10d7c
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 10, 2023
bed126c
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 10, 2023
09ea7a7
Merge remote-tracking branch 'upstream/branch-23.10' into python-ann-…
divyegala Aug 11, 2023
2917886
try to run gbench executable
divyegala Aug 12, 2023
49732b1
Allow to compile ANN_BENCH without CUDA
achirkin Aug 17, 2023
76cfb40
Merge remote-tracking branch 'rapidsai/branch-23.10' into enh-google-…
achirkin Aug 17, 2023
9b588af
Fix style
achirkin Aug 17, 2023
6d6c17d
Adapt ANN benchmark python scripts
achirkin Aug 17, 2023
b89b27d
Make the default behavior to produce one executable per benchmark
achirkin Aug 17, 2023
163a40c
Fix style problems / pre-commit
achirkin Aug 17, 2023
0bb51a3
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 22, 2023
2b9f649
Merge remote-tracking branch 'rapidsai/branch-23.10' into enh-google-…
achirkin Aug 23, 2023
9728f7e
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 24, 2023
7b1bf01
Merge remote-tracking branch 'origin/branch-23.10' into enh-google-be…
cjnolet Aug 24, 2023
1daf2bf
Adding k and batch-size options to run.py
cjnolet Aug 24, 2023
4e0a53e
Merge branch 'branch-23.10' - CONFIGS ONLY - dataset_memtype follows …
achirkin Aug 24, 2023
04893c9
Add dataset_memory_type/query_memory_type as build/search parameters
achirkin Aug 24, 2023
b24fcf7
middle of merge, not building
divyegala Aug 24, 2023
30f7467
Tuning guide
cjnolet Aug 24, 2023
3e35121
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 24, 2023
f927f69
compiling, index building successful, search failing
divyegala Aug 24, 2023
404cd10
Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho…
divyegala Aug 24, 2023
0eaa7e0
Fix FAISS using a destroyed stream from previous benchmark case
achirkin Aug 25, 2023
9896963
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 25, 2023
4062d6f
Fixing issue in conf file and stubbing out parameter tuning guide
cjnolet Aug 25, 2023
7141c21
Adding CAGRA to tuning guide
cjnolet Aug 25, 2023
7c42a78
Adding ivf-flat description to tuning guide
cjnolet Aug 25, 2023
92a37a8
Updating ivf-flat and ivf-pq
cjnolet Aug 25, 2023
3982840
Adding tuning guide tables for ivf-flat and ivf-pq for faiss and raft
cjnolet Aug 25, 2023
d2bfc11
Reatio is not required
cjnolet Aug 25, 2023
80482fb
Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho…
divyegala Aug 25, 2023
82f195e
write build,search results
divyegala Aug 25, 2023
0cf1c6f
CLeaning up a couple configs
cjnolet Aug 25, 2023
617c60f
add tuning guide for cagra, modify build param
divyegala Aug 25, 2023
3948f0c
Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho…
divyegala Aug 26, 2023
74c9a1b
remove data_export, use gbench csvs to plot
divyegala Aug 26, 2023
902f9f4
fix typo in docs path for results
divyegala Aug 26, 2023
9b82f85
Merge pull request #2 from divyegala/python-ann-bench-use-gbench
cjnolet Aug 26, 2023
1198e1a
for plotting, pick up recall/qps from anywhere in the csv columns
divyegala Aug 26, 2023
24c1619
Merge remote-tracking branch 'divye/python-ann-bench-use-gbench' into…
cjnolet Aug 26, 2023
3f647c3
add output-filepath for plot.py
divyegala Aug 26, 2023
354287d
fix typo in docs
divyegala Aug 26, 2023
e0dfbab
Reverting changes to deep-100M
cjnolet Aug 26, 2023
bb3a194
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 28, 2023
7d8ee13
FAISS refinement
cjnolet Aug 28, 2023
1720e11
Merge branch 'enh-google-benchmarks' of github.com:cjnolet/raft into …
cjnolet Aug 28, 2023
49fd31d
adding build time plot
divyegala Aug 28, 2023
be3da1a
merging corey's upstream
divyegala Aug 28, 2023
b9e7771
Merge pull request #4 from divyegala/python-ann-bench-use-gbench
cjnolet Aug 28, 2023
8e5ab5d
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 29, 2023
f331a94
Merge branch 'enh-google-benchmarks' of github.com:cjnolet/raft into …
cjnolet Aug 29, 2023
e420593
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 29, 2023
913dec2
Move the 'dump_parameters' earlier in the benchmarks to have higher c…
achirkin Aug 29, 2023
8861fc8
Implementing some of the review feedback
cjnolet Aug 29, 2023
2f52b02
Bench ann
cjnolet Aug 29, 2023
c28326c
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 29, 2023
720606b
revert dlopen path logic
divyegala Aug 29, 2023
b869e9e
add pandas, minor docs update
divyegala Aug 29, 2023
9643327
add data_export.py to convert json to csv
divyegala Aug 29, 2023
d1877c7
merging corey's branch
divyegala Aug 29, 2023
ef7b7f9
update docs
divyegala Aug 29, 2023
39dd3f4
Merge pull request #5 from divyegala/python-ann-bench-use-gbench
cjnolet Aug 29, 2023
d0afbff
Apply suggestions from code review (fix the #1661 merge-related typos)
achirkin Aug 30, 2023
c719e52
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 30, 2023
78b132b
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 30, 2023
a45b646
Merge pull request #6 from achirkin/enh-google-benchmarks
cjnolet Aug 30, 2023
1df15ab
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 30, 2023
Adding ivf-flat description to tuning guide
cjnolet committed Aug 25, 2023
commit 7c42a78cb7e57532e37d75fdfd7b6f18d4d7f1e8
1 change: 1 addition & 0 deletions conda/environments/bench_ann_cuda-118_arch-x86_64.yaml
@@ -7,6 +7,7 @@ channels:
- conda-forge
- nvidia
dependencies:
- benchmark>=1.8.2
- c-compiler
- clang-tools=16.0.1
- clang=16.0.1
1 change: 1 addition & 0 deletions dependencies.yaml
@@ -168,6 +168,7 @@ dependencies:
- glog>=0.6.0
- h5py>=3.8.0
- libfaiss>=1.7.1
- benchmark>=1.8.2
- faiss-proc=*=cuda
- matplotlib
- pyyaml
4 changes: 4 additions & 0 deletions docs/source/ann_benchmarks_param_tuning.md
@@ -7,6 +7,10 @@ This guide outlines the various parameter settings that can be specified in [RAF

### IVF-Flat

IVF-flat uses an inverted-file index, which partitions the vectors into a series of clusters, or lists, and stores them in an interleaved format optimized for fast distance computation. Searching an IVF-flat index restricts the candidate vectors to those in a user-specified number of nearest clusters, called probes.

IVF-flat is a simple algorithm that stores the vectors uncompressed, so it saves no space, but it provides competitive search times even at higher levels of recall.
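
The build and search steps described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not RAFT's implementation: it uses a naive k-means, keeps vector ids in ordinary Python lists rather than an interleaved layout, and the names `build_ivf_flat`, `search_ivf_flat`, and `n_probes` are hypothetical. Only `nlists` comes from the parameter table.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, v):
    """Index of the center closest to v."""
    return min(range(len(centers)), key=lambda c: sq_dist(centers[c], v))


def assign(vectors, centers):
    """Put each vector id into the list of its nearest center."""
    lists = [[] for _ in centers]
    for i, v in enumerate(vectors):
        lists[nearest(centers, v)].append(i)
    return lists


def build_ivf_flat(vectors, nlists, n_iters=10, seed=0):
    """Partition vectors into `nlists` clusters with a few k-means rounds.

    Returns (centers, lists), where lists[c] holds the ids of the vectors
    assigned to cluster c. n_iters and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    centers = [list(v) for v in rng.sample(vectors, nlists)]
    for _ in range(n_iters):
        lists = assign(vectors, centers)
        for c, members in enumerate(lists):
            if members:  # an empty cluster keeps its previous center
                centers[c] = [
                    sum(vectors[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    # Final assignment so the lists match the final centers.
    return centers, assign(vectors, centers)


def search_ivf_flat(index, vectors, query, k, n_probes):
    """Scan only the `n_probes` clusters nearest to the query."""
    centers, lists = index
    by_distance = sorted(range(len(centers)),
                         key=lambda c: sq_dist(centers[c], query))
    candidates = [i for c in by_distance[:n_probes] for i in lists[c]]
    candidates.sort(key=lambda i: sq_dist(vectors[i], query))
    return candidates[:k]
```

Probing every list makes the search exhaustive and therefore exact; probing fewer lists scans fewer vectors at the cost of possibly missing true neighbors, which is the recall/speed trade-off the search parameters control.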

| Parameter | Type | Data Type | Description |
|-----------|----------------|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `nlists` | `build_param` | Positive Integer `>0` | Number of clusters to partition the vectors into. Larger values will put fewer points into each cluster, but this will increase index build time because more clusters need to be trained. |
2 changes: 2 additions & 0 deletions docs/source/raft_ann_benchmarks.md
@@ -158,6 +158,8 @@ options:
--algorithms ALGORITHMS
run only comma separated list of named algorithms (default: None)
--indices INDICES run only comma separated list of named indices. parameter `algorithms` is ignored (default: None)
-k, --count number of nearest neighbors to return
--batch-size number of query vectors to pass into search
-f, --force re-run algorithms even if their results already exist (default: False)
```
`configuration` and `dataset` : `configuration` is a path to a configuration file for a given dataset.
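
As a rough illustration of how the options listed in the help text could be declared, here is a minimal `argparse` sketch. It is not the actual script: the option names and help strings are taken from the help output above, while `build_parser`, `split_names`, and the default values for `-k` and `--batch-size` are assumptions made for the example.

```python
import argparse


def build_parser():
    """Declare the subset of options shown in the help text above."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument(
        "--algorithms",
        help="run only comma separated list of named algorithms",
    )
    parser.add_argument(
        "--indices",
        help="run only comma separated list of named indices. "
             "parameter `algorithms` is ignored",
    )
    # The defaults below are placeholders; the help text does not state them.
    parser.add_argument(
        "-k", "--count", type=int, default=10,
        help="number of nearest neighbors to return",
    )
    parser.add_argument(
        "--batch-size", type=int, default=1,
        help="number of query vectors to pass into search",
    )
    parser.add_argument(
        "-f", "--force", action="store_true",
        help="re-run algorithms even if their results already exist",
    )
    return parser


def split_names(value):
    """Turn 'a, b' into ['a', 'b']; return None when the option is unset."""
    if not value:
        return None
    return [name.strip() for name in value.split(",") if name.strip()]


# Example invocation with placeholder algorithm names.
args = build_parser().parse_args(
    ["-k", "100", "--batch-size", "1", "--algorithms", "algo_a, algo_b"]
)
selected = split_names(args.algorithms)
```

Note that `argparse` exposes `--batch-size` as `args.batch_size`, and `-k`/`--count` as `args.count`, because the attribute name is derived from the first long option.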