-
-
Notifications
You must be signed in to change notification settings - Fork 122
feat(vectors): making vector similarity configurable #436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds configurable vector similarity methods to the helix-db vector engine. Previously, the system was limited to cosine similarity with feature flags, but now supports configurable similarity methods including cosine distance, cosine similarity, and Euclidean distance.
- Introduces a
SimilarityMethod
enum with three variants: CosineDistance, CosineSimilarity (default), and EuclideanDistance - Updates the vector distance calculation API to accept similarity method parameters
- Removes the
cosine
feature flag and related conditional compilation directives
Reviewed Changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
helix-db/src/helix_engine/traversal_core/config.rs | Adds SimilarityMethod enum and integrates it into vector configuration |
helix-db/src/helix_engine/vector_core/vector_distance.rs | Updates distance calculation to support multiple similarity methods |
helix-db/src/helix_engine/vector_core/vector_core.rs | Integrates similarity method into VectorCore constructor and usage |
helix-db/src/helix_engine/vector_core/vector.rs | Updates HVector distance_to method to accept similarity method parameter |
helix-db/Cargo.toml | Removes cosine feature flag from feature definitions |
Multiple test files | Updates test calls to use new distance calculation API with similarity method |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
let c = x - y; | ||
c * c | ||
}) | ||
.sum()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The euclidean_distance function returns the squared Euclidean distance, not the actual Euclidean distance. It should return the square root of the sum: Ok(from.iter().zip(to.iter()).map(|(x, y)| { let c = x - y; c * c }).sum::<f64>().sqrt())
.sum()) | |
.sum::<f64>() | |
.sqrt()) |
Copilot uses AI. Check for mistakes.
pub mcp: Option<bool>, | ||
pub bm25: Option<bool>, | ||
pub schema: Option<String>, | ||
pub embedding_model: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The removal of graphvis_node_label
field appears throughout multiple files but the field is still referenced in line 64. This creates an inconsistent state where the field exists in some structs but not others.
Copilot uses AI. Check for mistakes.
…fferent return configurations, wrapped in a single enum that implements different ordering based on similarity calculation method
…ods, v from id/type (#438) ## Description <!-- Provide a brief description of the changes in this PR --> ## Related Issues <!-- Link to any related issues using #issue_number --> Closes #433 #436 #435 ## Checklist when merging to main <!-- Mark items with "x" when completed --> - [ ] No compiler warnings (if applicable) - [ ] Code is formatted with `rustfmt` - [ ] No useless or dead code (if applicable) - [ ] Code is easy to understand - [ ] Doc comments are used for all functions, enums, structs, and fields (where appropriate) - [ ] All tests pass - [ ] Performance has not regressed (assuming change was not to fix a bug) - [ ] Version number has been updated in `helix-cli/Cargo.toml` and `helixdb/Cargo.toml` ## Additional Notes <!-- Add any additional information that would be helpful for reviewers -->
Description
Related Issues
Closes #
Checklist when merging to main
rustfmt
helix-cli/Cargo.toml
andhelixdb/Cargo.toml
Additional Notes