Propose vector-set API #2939

Open: wants to merge 26 commits into main

mgravell (Collaborator) commented Aug 11, 2025

as per https://redis.io/docs/latest/develop/data-types/vector-sets/

  • all methods/types start with VectorSet...
  • all core methods implemented
  • the only "unusual" one is VLINKS; the server returns this as nested data, but the nesting is not meaningful to the caller (instead being related to the server core), so I've flattened the result
  • usage is shown in VectorSetIntegrationTests; in particular, VectorSetSimilaritySearch_WithFilter is a useful overview, since the two primary APIs are VADD and VSIM
  • since vector set data can be non-trivial, all return types lean on Lease<T> rather than T[]. Inputs are ReadOnlyMemory<T>, with the exception of string data for JSON and filters: these are explicitly text, never blobs, so RedisValue seems inappropriate. I think forcing string is OK here
  • in acknowledgement that some of our files are too large, I've started splitting the VectorSet* bits out via partial files; I will follow this up with "everything else" (.Strings.cs, .Hashes.cs) in a separate PR
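
For readers unfamiliar with the underlying commands, here is a rough sketch of the wire-level shape that the VADD, VSIM (with filter) and VLINKS points above refer to. It is written in Python purely for illustration and is not the C# API proposed in this PR. The helper names (`vadd_args`, `vsim_args`, `flatten_links`) are hypothetical, the argument order follows my reading of the Redis vector-set command docs, and the flattening shown is a plain concatenation, which is an assumption about what "flattened" means here.

```python
import json
from typing import Any, List, Optional, Sequence


def vadd_args(key: str, element: str, vector: Sequence[float],
              attributes: Optional[dict] = None) -> List[str]:
    """Build VADD arguments using the VALUES form; attributes go in as JSON text."""
    args = ["VADD", key, "VALUES", str(len(vector)), *(str(v) for v in vector), element]
    if attributes is not None:
        # JSON attributes are text, never binary blobs
        args += ["SETATTR", json.dumps(attributes)]
    return args


def vsim_args(key: str, vector: Sequence[float], count: Optional[int] = None,
              filter_expr: Optional[str] = None) -> List[str]:
    """Build VSIM arguments for a similarity search with an optional text filter."""
    args = ["VSIM", key, "VALUES", str(len(vector)), *(str(v) for v in vector), "WITHSCORES"]
    if count is not None:
        args += ["COUNT", str(count)]
    if filter_expr is not None:
        args += ["FILTER", filter_expr]
    return args


def flatten_links(nested: Sequence[Sequence[Any]]) -> List[Any]:
    """Collapse a nested VLINKS-style reply into a single flat list (assumed: simple concatenation)."""
    return [item for layer in nested for item in layer]


if __name__ == "__main__":
    print(vadd_args("movies", "m1", [0.1, 0.2], {"year": 1984}))
    print(vsim_args("movies", [0.1, 0.2], count=5, filter_expr=".year > 1950"))
    print(flatten_links([["a", "b"], ["c"]]))
```

The key, element and attribute values are made up; the integration tests named above remain the authoritative usage reference.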

This PR also introduces the start of a new literal-matching API, à la FastHash. The intent is to extend this at a later date.

CI: this may depend on the vectorset module; if it fails, I'll add suitable validation. All VectorSetTests pass locally (screenshot of the local test run not preserved here).

@mgravell mgravell marked this pull request as draft August 11, 2025 13:51
@mgravell mgravell marked this pull request as ready for review August 13, 2025 15:57
mgravell (Collaborator, Author) commented Aug 14, 2025

Additional thoughts on FastHash:

  • if length <= 8, we can skip equality test
  • consider hashing first 16 (status: considered and deferred)
  • add unit test that shows interesting known values of different lengths (and different values at same length)
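
To make the first and third points concrete, here is a minimal sketch in Python (illustration only, not the C# implementation in this PR). It assumes the hash simply packs up to the first 8 bytes of the value into a 64-bit little-endian integer, zero-padded; that packing scheme and the names `fast_hash` and `matches_literal` are assumptions for the example. Under that assumption, two values of the same length (at most 8) with the same hash must be byte-for-byte identical, so the full equality test is only needed for longer values.

```python
def fast_hash(data: bytes) -> int:
    """Pack up to the first 8 bytes into a 64-bit little-endian integer (zero-padded)."""
    return int.from_bytes(data[:8].ljust(8, b"\0"), "little")


def matches_literal(candidate: bytes, literal: bytes) -> bool:
    """Compare a candidate against a known literal using length + hash first."""
    if len(candidate) != len(literal):
        return False  # length must still be checked: padding hides trailing zero bytes
    if fast_hash(candidate) != fast_hash(literal):
        return False
    if len(literal) <= 8:
        return True  # the hash covers every byte, so no equality test is needed
    return candidate == literal  # longer values: the hash only covers the prefix


if __name__ == "__main__":
    # values of different lengths, and different values at the same length
    for sample in (b"", b"a", b"ab", b"abcdefgh", b"abcdefgh1", b"abcdefgh2"):
        print(len(sample), hex(fast_hash(sample)), sample)
```

The sample values at the bottom are the kind of cases a unit test could pin down: an empty value, short values, a value of exactly 8 bytes, and two 9-byte values that share a hash but differ in the ninth byte.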
