Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added dist functions from usearch #46

Merged
merged 10 commits into from
Aug 18, 2023

Conversation

var77
Copy link
Collaborator

@var77 var77 commented Aug 12, 2023

Description

Added new dist functions cos_dist and hamming_dist which will use usearch implementations you can refer to this PR for more information.
l2sq_dist function now will also use usearch implementation.

TODO

  • Create new update file for sql
  • Detect dist function when used with index, so we can set corresponding metric type on usearch options without creating several operators
  • Add generic implementation of dist_func to reject usage outside of index
  • Add tests to cover functionality

Issue

closes #34

@var77 var77 added the WIP label Aug 12, 2023
@var77 var77 requested a review from Ngalstyan4 August 12, 2023 05:31
@var77 var77 marked this pull request as ready for review August 12, 2023 09:13
@var77 var77 removed the WIP label Aug 12, 2023
Copy link
Contributor

@Ngalstyan4 Ngalstyan4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

test/sql/generic_distance_metric.sql Outdated Show resolved Hide resolved
src/hnsw.c Outdated Show resolved Hide resolved
@var77 var77 requested a review from Ngalstyan4 August 13, 2023 14:14
Copy link
Contributor

@Ngalstyan4 Ngalstyan4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test file with usage tests

  • Create index with each op class and add some rows
  • Use the wrong distance function in an order by and assure the index is not used (via explain)

test/sql/hnsw_insert_array.sql Show resolved Hide resolved
sql/updates/0.0.3--latest.sql Show resolved Hide resolved
@var77 var77 requested a review from Ngalstyan4 August 14, 2023 15:01
Copy link
Contributor

@Ngalstyan4 Ngalstyan4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing!

src/hnsw.c Show resolved Hide resolved
src/hnsw.h Outdated Show resolved Hide resolved
src/hnsw/options.c Outdated Show resolved Hide resolved
src/hnsw/options.c Show resolved Hide resolved
test/expected/hnsw.out Show resolved Hide resolved
test/expected/hnsw_dist_func.out Outdated Show resolved Hide resolved
test/expected/hnsw_dist_func.out Show resolved Hide resolved
Copy link
Contributor

@Ngalstyan4 Ngalstyan4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did another pass over this.

sql/lanterndb.sql Outdated Show resolved Hide resolved
sql/lanterndb.sql Outdated Show resolved Hide resolved
sql/lanterndb.sql Outdated Show resolved Hide resolved
sql/updates/0.0.3--latest.sql Outdated Show resolved Hide resolved
src/hnsw.h Show resolved Hide resolved
src/hnsw/utils.c Outdated Show resolved Hide resolved
src/hnsw/utils.c Outdated Show resolved Hide resolved
test/sql/hnsw_dist_func.sql Show resolved Hide resolved
test/sql/hnsw_dist_func.sql Show resolved Hide resolved

CREATE TABLE small_world_l2_vec (
id varchar(3),
vector vector(3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to vactor this into a separate file (e.g. pgvector_compat.sql)
We will need to then move all the other tests that require pgvector into this file so we could make pgvector dependency optional for running tests as well.

@Ngalstyan4 Ngalstyan4 merged commit e823294 into lanterndata:main Aug 18, 2023
11 checks passed
var77 added a commit that referenced this pull request Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a new distance function to LanternDB index
2 participants