add tag search #411

caohassl · 2022-09-05T08:11:26Z

I noticed issue #294 (Support "skipping" nodes during traversal?) and implemented support for it. This PR also adds tag search on top of that.

1. When building the index, allow a tag to be defined for each vector.
2. When the build is finished, add an inverted file to store those tags.
3. When querying, use the inverted file to skip nodes by tag (or ID).
4. Add a Python file to test this.
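
The steps above can be illustrated with a minimal sketch. It is not the code from this PR: `TagInvertedIndex` and `search_with_tag` are hypothetical names, and a brute-force scan stands in for the HNSW graph traversal so that the example stays self-contained.

```python
from collections import defaultdict
import math


class TagInvertedIndex:
    """Maps each tag to the set of vector ids that carry it (hypothetical helper)."""

    def __init__(self):
        self._ids_by_tag = defaultdict(set)

    def add(self, vector_id, tags):
        # Steps 1 and 2: record the tags of a vector in the inverted file.
        for tag in tags:
            self._ids_by_tag[tag].add(vector_id)

    def ids_for(self, tag):
        # Unknown tags map to an empty set.
        return self._ids_by_tag.get(tag, set())


def search_with_tag(vectors, index, query, tag, k):
    """Step 3: skip ids that lack `tag` while scanning (brute force, not HNSW)."""
    allowed = index.ids_for(tag)
    scored = []
    for vector_id, vec in vectors.items():
        if vector_id not in allowed:
            continue  # skipped during the scan, never enters the result set
        scored.append((math.dist(query, vec), vector_id))
    scored.sort()
    return [vector_id for _, vector_id in scored[:k]]


vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.1, 0.1), 3: (5.0, 5.0)}
index = TagInvertedIndex()
index.add(0, ["red"])
index.add(1, ["blue"])
index.add(2, ["blue", "red"])
index.add(3, ["blue"])

result = search_with_tag(vectors, index, (0.0, 0.0), "blue", k=2)
print(result)
```

Vector 0 is the closest to the query but carries only the `red` tag, so it is skipped and the two nearest `blue` vectors are returned instead.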

yurymalkov · 2022-09-05T19:23:54Z

Hi @caohassl,

Thank you for the PR! It seems to conflict (in terms of functionality) with #402, so we would need to figure out how to merge them.
It also seems that the Windows tests are failing due to dirent.h.

caohassl · 2022-09-06T15:34:11Z

Hi @yurymalkov

Thanks for the reply!
I have read PR #402. It seems that positive ID filtering (keeping the nodes we need) is already supported there.

The first scenario is that we need to discard some unnecessary nodes, not just keep the nodes we need during the scan.
The second scenario is that we also need to find the top N results that satisfy a particular tag, rather than taking the top N + 1000 and then filtering out 1000.

To support both ID filtering (positive and negative) and tag filtering, I have committed this PR (and fixed the CI error).
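
The two scenarios can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `make_id_filter`, `filtered_top_n`, and `post_filtered_top_n` are hypothetical names, and a sorted brute-force scan replaces the graph search.

```python
import math


def make_id_filter(keep=None, discard=None):
    """Build a predicate: `keep` is an allow-list, `discard` a deny-list (both optional)."""
    keep = None if keep is None else frozenset(keep)
    discard = frozenset(discard or ())

    def accept(vector_id):
        if vector_id in discard:
            return False  # negative filtering
        return keep is None or vector_id in keep  # positive filtering

    return accept


def filtered_top_n(vectors, query, n, accept):
    """Apply the predicate during the scan, so every returned id satisfies it."""
    scored = sorted(
        (math.dist(query, vec), vid) for vid, vec in vectors.items() if accept(vid)
    )
    return [vid for _, vid in scored[:n]]


def post_filtered_top_n(vectors, query, n, extra, accept):
    """Fetch n + extra unfiltered neighbours, then filter; may return fewer than n."""
    scored = sorted((math.dist(query, vec), vid) for vid, vec in vectors.items())
    return [vid for _, vid in scored[: n + extra] if accept(vid)][:n]


vectors = {i: (float(i), 0.0) for i in range(10)}
accept = make_id_filter(discard={0, 1, 2, 3})

in_scan = filtered_top_n(vectors, (0.0, 0.0), 3, accept)
post = post_filtered_top_n(vectors, (0.0, 0.0), 3, 2, accept)
print(in_scan, post)
```

With the four nearest ids discarded, filtering during the scan still yields three results, while over-fetching by a fixed margin and filtering afterwards yields only one.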

dyashuni · 2022-09-09T16:02:05Z

Hi @caohassl, why did you close the PR?

caohassl · 2022-09-10T01:13:32Z

Hi @dyashuni

I realized that heavy filtering may cause performance problems, so I closed the PR.
I will try adding a threshold to exit the scan earlier. If the performance is good, I will reopen it.
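
One possible reading of such a threshold is a budget on the number of candidates rejected by the filter. The sketch below is an assumption about what that could look like, not the author's design: `scan_with_budget` and `max_rejected` are hypothetical names, and a sorted list stands in for the candidate queue of a real graph search.

```python
import math


def scan_with_budget(vectors, query, k, accept, max_rejected):
    """Visit candidates nearest-first; stop once the filter has rejected too many."""
    ordered = sorted((math.dist(query, vec), vid) for vid, vec in vectors.items())
    results, rejected = [], 0
    for _, vid in ordered:
        if accept(vid):
            results.append(vid)
            if len(results) == k:
                break  # enough results found
        else:
            rejected += 1
            if rejected > max_rejected:
                break  # budget exhausted; results may hold fewer than k ids
    return results


vectors = {i: (float(i), 0.0) for i in range(10)}


def accept(vid):
    # Hypothetical selective filter: only ids 4 and 9 pass.
    return vid % 5 == 4


print(scan_with_budget(vectors, (0.0, 0.0), 2, accept, max_rejected=4))
print(scan_with_budget(vectors, (0.0, 0.0), 2, accept, max_rejected=100))
```

The trade-off is explicit: a small budget bounds the cost of a highly selective filter but can return fewer than `k` results.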

dyashuni · 2022-09-10T07:12:30Z

@caohassl Thank you. Yes, heavy filtering makes search slow.
I think there is no need to create a new class SearchHNSW because its code is almost the same as the code of the HierarchicalNSW class. You can use the HierarchicalNSW class with a custom implementation of FilterFunctor to avoid code duplication. If serialization is required you can add serialization/deserialization methods to your implementation of FilterFunctor as well.
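
As a rough illustration of this suggestion, a filter can be a small callable object that owns its tag data and knows how to serialize itself, so the index class stays unchanged. The sketch is a Python analogue with hypothetical names (`TagFilter`, `dumps`, `loads`); it does not reproduce hnswlib's C++ FilterFunctor interface.

```python
import json


class TagFilter:
    """Callable filter: accepts an id only if it carries the required tag."""

    def __init__(self, ids_by_tag, required_tag):
        self.ids_by_tag = {tag: set(ids) for tag, ids in ids_by_tag.items()}
        self.required_tag = required_tag

    def __call__(self, vector_id):
        return vector_id in self.ids_by_tag.get(self.required_tag, set())

    def dumps(self):
        # Sets are not JSON-serializable, so store each one as a sorted list.
        payload = {
            "required_tag": self.required_tag,
            "ids_by_tag": {t: sorted(ids) for t, ids in self.ids_by_tag.items()},
        }
        return json.dumps(payload)

    @classmethod
    def loads(cls, text):
        payload = json.loads(text)
        return cls(payload["ids_by_tag"], payload["required_tag"])


tag_filter = TagFilter({"blue": [1, 2, 3], "red": [0, 2]}, "red")
restored = TagFilter.loads(tag_filter.dumps())
print(restored(2), restored(1))
```

Because the filter carries its own state, the search code only needs to call it on each candidate id, and saving or loading the tags does not touch the index format.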

add tag search

be50839

caomr and others added 2 commits September 6, 2022 21:03

search for win

0a28953

fix ci

6fc10bc

caohassl closed this Sep 9, 2022

caohassl deleted the develop branch September 11, 2022 10:05
