Add ANNSearcher class for easier ANN usage from code #1078

Merged
merged 7 commits into from
Apr 4, 2020

Conversation

tteofili
Collaborator

@tteofili tteofili commented Apr 3, 2020

This PR adds an ANNSearcher class (analogous to SimpleSearcher) that makes it easier to run approximate nearest neighbor (ANN) search from code, e.g. from pyserini.
It also makes some minor changes to the existing ANN code so that full vectors can eventually be stored in the index, instead of requiring the text embedding model to be available at search time, as is the case today.
Finally, it makes some minor adjustments to the defaults of the ANN analyzers so that they are consistent across the codebase.
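
To make the second point concrete, here is a minimal sketch of why keeping full vectors next to their ids removes the need for an embedding model at search time: the query vector is simply looked up by id. This is not the ANNSearcher API from this PR. The class `ToyVectorSearcher` and its methods `add` and `nearestNeighbors` are hypothetical names, and the search is an exact brute-force cosine ranking over an in-memory map rather than an approximate search over a Lucene index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: this is NOT Anserini's ANNSearcher API.
public class ToyVectorSearcher {
  // id -> stored vector, standing in for full vectors kept in the index.
  private final Map<String, double[]> vectors = new LinkedHashMap<>();

  public void add(String id, double[] vector) {
    vectors.put(id, vector.clone());
  }

  // Cosine similarity; returns 0 when either vector has zero norm.
  // Assumes both vectors have the same dimensionality.
  static double cosine(double[] a, double[] b) {
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
  }

  // Exact (brute-force) top-k neighbors of a stored id, excluding the id itself.
  // The query vector comes from the store, so no embedding model is needed here.
  public List<String> nearestNeighbors(String id, int k) {
    double[] query = vectors.get(id);
    if (query == null) {
      throw new IllegalArgumentException("unknown id: " + id);
    }
    List<String> candidates = new ArrayList<>(vectors.keySet());
    candidates.remove(id);
    // Sort by descending similarity to the query vector.
    candidates.sort(Comparator.comparingDouble((String c) -> -cosine(query, vectors.get(c))));
    return candidates.subList(0, Math.min(k, candidates.size()));
  }

  public static void main(String[] args) {
    ToyVectorSearcher searcher = new ToyVectorSearcher();
    searcher.add("cat", new double[] {1.0, 0.0});
    searcher.add("kitten", new double[] {0.9, 0.1});
    searcher.add("car", new double[] {0.0, 1.0});
    System.out.println(searcher.nearestNeighbors("cat", 2)); // prints [kitten, car]
  }
}
```

An approximate searcher would replace the linear scan with an index lookup (for example over analyzer-encoded tokens) and could then use the stored vectors only to rerank the candidates it retrieves.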

@codecov

codecov bot commented Apr 3, 2020

Codecov Report

Merging #1078 into master will increase coverage by 0.34%.
The diff coverage is 88.23%.


@@             Coverage Diff              @@
##             master    #1078      +/-   ##
============================================
+ Coverage     45.00%   45.35%   +0.34%     
- Complexity      660      675      +15     
============================================
  Files           135      136       +1     
  Lines          8039     8092      +53     
  Branches       1149     1160      +11     
============================================
+ Hits           3618     3670      +52     
+ Misses         4098     4095       -3     
- Partials        323      327       +4     
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...a/io/anserini/ann/fw/FakeWordsEncoderAnalyzer.java | 100.00% <ø> (+18.18%) | 3.00 <0.00> (+1.00) |
| ...ava/io/anserini/ann/lexlsh/LexicalLshAnalyzer.java | 91.66% <ø> (ø) | 5.00 <0.00> (ø) |
| ...anserini/ann/ApproximateNearestNeighborSearch.java | 75.00% <83.33%> (ø) | 11.00 <0.00> (+4.00) |
| src/main/java/io/anserini/ann/IndexVectors.java | 79.31% <83.33%> (-0.93%) | 10.00 <0.00> (ø) |
| ...anserini/search/SimpleNearestNeighborSearcher.java | 94.44% <94.44%> (ø) | 7.00 <7.00> (?) |
| ...o/anserini/ann/ApproximateNearestNeighborEval.java | 79.27% <100.00%> (ø) | 13.00 <0.00> (ø) |
| ...java/io/anserini/ltr/feature/CountBigramPairs.java | 94.80% <0.00%> (+5.19%) | 36.00% <0.00%> (+3.00%) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fc42ce2...c3f8b67.

@tteofili tteofili requested a review from lintool April 3, 2020 18:03
@lintool
Member

lintool commented Apr 4, 2020

@tteofili bunch of comments for you!

@lintool lintool merged commit 6f4b9bf into castorini:master Apr 4, 2020
crystina-z pushed a commit to crystina-z/anserini that referenced this pull request Oct 28, 2022
sync unicoil topics in anserini