
New results #399


Merged
erikbern merged 4 commits into main from erikbern/new-results on Apr 29, 2023

Conversation

@erikbern (Owner) commented Apr 27, 2023

See new plots

@erikbern changed the title from "New resutls" to "New results" on Apr 27, 2023
@erikbern mentioned this pull request on Apr 27, 2023
@erikbern (Owner, Author)

Will add a couple more datasets to this shortly!

@thomasahle (Contributor)

Oh no, looks like fast_pq/tinyknn regressed substantially since #363 (comment) and is completely broken on mnist.
Are these new runs final? Or do we still have time to fix bugs?

@erikbern (Owner, Author)

> Oh no, looks like fast_pq/tinyknn regressed substantially since #363 (comment) and is completely broken on mnist.
> Are these new runs final? Or do we still have time to fix bugs?

If you fix it in the next few hours, then I can re-run it. But I want to tear down the r6.16xlarge instance soon (it's ~$100/day)

@erikbern mentioned this pull request on Apr 27, 2023
@thomasahle (Contributor)

Is the run actually using the latest ann-benchmarks code? It seems the string fast_pq should have been replaced in the repo

@erikbern (Owner, Author)

> Is the run actually using the latest ann-benchmarks code? It seems the string fast_pq should have been replaced in the repo

I just realized I haven't re-run install.py in the last few days. Let me wipe the fast_pq data and re-run all benchmarks (will also wipe opensearchknn).

@thomasahle (Contributor)

> Oh no, looks like fast_pq/tinyknn regressed substantially since #363 (comment) and is completely broken on mnist.
> Are these new runs final? Or do we still have time to fix bugs?

> If you fix it in the next few hours, then I can re-run it. But I want to tear down the r6.16xlarge instance soon (it's ~$100/day)

I found the problem. Just testing my fix now. Will push it asap.

@thomasahle (Contributor) commented Apr 27, 2023

It's pushed. Building the Docker image now should git clone the newest version.

@erikbern (Owner, Author)

Pushed latest results. Going to run gist-960-euclidean as well just to have a high dimensional dataset that's a bit larger. It will probably take 10h.

@WPJiang commented Apr 28, 2023

> Pushed latest results. Going to run gist-960-euclidean as well just to have a high dimensional dataset that's a bit larger. It will probably take 10h.

Hi @erikbern, there is only one line (tinyknn) in the new glove-25 result; would you please check it?

@erikbern (Owner, Author)

> Hi @erikbern, there is only one line (tinyknn) in the new glove-25 result; would you please check it?

You're right, I accidentally ran glove-25 only for tinyknn. Let me run it for all algos.

@erikbern mentioned this pull request on Apr 28, 2023
@thomasahle (Contributor)

It's strange. This is what I get when I spin up an r6i.8xlarge myself and run tinyknn on sift. Maybe I pushed the update too late, after you had already rerun install.py? Or maybe there's some other issue I can't think of...
[attached plot: sift-128-euclidean]
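
For reference, a single-algorithm run like this can be reproduced roughly as follows. This is a hedged sketch assuming the standard run.py flags for selecting a dataset and an algorithm; it is not a verbatim command from this thread.

```python
# Hedged sketch: rerun one algorithm on one dataset via the benchmark runner.
# The flags are assumed to match run.py's command-line interface.
import subprocess

subprocess.run(
    [
        "python", "run.py",
        "--dataset", "sift-128-euclidean",
        "--algorithm", "tinyknn",
    ],
    check=True,
)
```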

@erikbern (Owner, Author)

It's possible the image didn't rebuild because of Docker layer caching. I didn't check closely. I'll rerun soon though.
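
One way to rule out layer caching is to rebuild the algorithm image with caching disabled. Below is a minimal docker-py sketch of that idea; the build context, Dockerfile name, and image tag are illustrative assumptions, not the project's actual install script.

```python
# Hedged sketch: force a clean rebuild so a cached layer can't mask a newer `git clone`.
# Paths, Dockerfile name, and tag are assumptions for illustration only.
import docker

client = docker.from_env()
image, build_log = client.images.build(
    path=".",                          # assumed build context (repo root)
    dockerfile="Dockerfile.tinyknn",   # hypothetical per-algorithm Dockerfile
    tag="ann-benchmarks-tinyknn",      # hypothetical image tag
    nocache=True,                      # ignore all cached layers
)
for chunk in build_log:
    if "stream" in chunk:
        print(chunk["stream"], end="")
```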

@thomasahle (Contributor)

If the Docker container times out during querying, is the data gathered still saved? Or is saving the data part of the job being run inside the Docker container, so it gets lost? In the latter case I can try to reduce the number of query_args to prevent timeouts.
It seems to me that the build-args/groups each have their own 7200s timeout, so reducing them probably won't make a difference?

@erikbern (Owner, Author)

The process running inside the container saves the data to a directory that's mounted into the container from the outside – so if the container is killed after 2h, any data saved up to that point is kept.

Generally the 2h timeout happens during index building though, not afterwards during query processing. That step is usually pretty fast.
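
To illustrate the setup described above, here is a rough docker-py sketch; the image name, mount paths, entrypoint arguments, and the 7200s figure are assumptions for illustration, not the project's actual runner code. The key point is that results are written to a bind mount on the host, so they survive the container being killed at the timeout.

```python
# Hedged sketch: results live on a host bind mount, so a timeout/kill does not lose
# whatever was written before it. All names and paths below are illustrative.
import os
import docker

client = docker.from_env()
container = client.containers.run(
    "ann-benchmarks-tinyknn",                     # hypothetical image name
    ["--dataset", "sift-128-euclidean"],          # hypothetical entrypoint arguments
    volumes={
        os.path.abspath("results"): {"bind": "/home/app/results", "mode": "rw"},
        os.path.abspath("data"): {"bind": "/home/app/data", "mode": "rw"},
    },
    detach=True,
)
try:
    container.wait(timeout=2 * 3600)  # ~2h budget for this run
except Exception:
    container.kill()                  # partial results already under results/ are kept
```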

@WPJiang commented Apr 28, 2023

> It's possible the image didn't rebuild because of Docker layer caching. I didn't check closely. I'll rerun soon though.
I think it's possible because the result file for glove-25 already existed when you ran it, which can be checked by the file date. The following code in main.py will cause the execution of this configuration to be skipped:
```python
for query_arguments in query_argument_groups:
    fn = get_result_filename(args.dataset, args.count, definition, query_arguments, args.batch)
    if args.force or not os.path.exists(fn):
        not_yet_run.append(query_arguments)
```
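
If a stale file is the culprit, passing --force (per args.force above) makes main.py rerun the configuration anyway; alternatively the stale result files can be deleted so the check no longer skips them. A hedged sketch of the latter, assuming results are stored under a results/ directory named after the dataset:

```python
# Hedged sketch: delete stale tinyknn results for the glove-25 dataset so the
# skip check above no longer fires. The directory layout is an assumption.
import glob
import os

for fn in glob.glob("results/glove-25*/**/*tinyknn*", recursive=True):
    if os.path.isfile(fn):
        print("removing stale result:", fn)
        os.remove(fn)
```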

@thomasahle (Contributor)

> Generally the 2h timeout happens during index building though

It seems some of the last args were taking more than half an hour, so I trimmed the list down. Now none of the rounds should take more than a minute or two.
#406

@erikbern (Owner, Author)

Merging this for now, but I'm planning to run this again in just a week or two. Will also polish the graphs a bit. But I don't want perfection to get in the way of getting something updated out. Won't promote this for now.

@erikbern merged commit 3d1b2e8 into main on Apr 29, 2023
@erikbern deleted the erikbern/new-results branch on April 29, 2023 at 03:38