Commit

Clean up of HNSW indexing code (#2265)
+ Refactored and cleaned up HNSW indexing code
+ Cleaned up logging in test cases (less verbose)
+ Renamed args topicfield to topicField, topicreader to topicReader
+ Renamed LuceneDenseVectorDocumentGenerator to HnswDenseVectorDocumentGenerator for consistency
lintool authored Nov 21, 2023
1 parent 6369184 commit 9d34274
Showing 426 changed files with 1,895 additions and 1,913 deletions.
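The flag renames named in the commit message recur across the regression docs below. Purely as an illustration (this is not necessarily how the commit was produced), a rename of this kind could be scripted roughly as follows. `rename_flags` and the `RENAMES` table are hypothetical names introduced here; only the old and new flag spellings come from the commit message.

```python
import re

# Old flag -> new flag, as listed in the commit message.
RENAMES = {
    "-topicreader": "-topicReader",
    "-topicfield": "-topicField",
}

# Match a flag only when it is a whole whitespace-delimited token,
# so longer strings that merely contain the flag are left alone.
_FLAG = re.compile(
    r"(?<!\S)(" + "|".join(re.escape(old) for old in RENAMES) + r")(?!\S)"
)


def rename_flags(text: str) -> str:
    """Return text with each old flag replaced by its renamed form."""
    return _FLAG.sub(lambda m: RENAMES[m.group(1)], text)


if __name__ == "__main__":
    print(rename_flags("  -topicreader TsvString \\"))
```

Matching is case-sensitive, so a line that already uses `-topicReader` is left unchanged and the function can be applied to the same file more than once.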
8 changes: 4 additions & 4 deletions docs/regressions/regressions-backgroundlinking18.md
@@ -20,8 +20,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection WashingtonPostCollection \
-input /path/to/wapo.v2 \
- -index indexes/lucene-index.wapo.v2/ \
  -generator WashingtonPostGenerator \
+ -index indexes/lucene-index.wapo.v2/ \
-threads 1 -storePositions -storeDocvectors -storeRaw \
>& logs/log.wapo.v2 &
```
@@ -45,21 +45,21 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v2/ \
-topics tools/topics-and-qrels/topics.backgroundlinking18.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v2.bm25.topics.backgroundlinking18.txt \
-backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v2/ \
-topics tools/topics-and-qrels/topics.backgroundlinking18.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v2.bm25+rm3.topics.backgroundlinking18.txt \
-backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v2/ \
-topics tools/topics-and-qrels/topics.backgroundlinking18.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v2.bm25+rm3+df.topics.backgroundlinking18.txt \
-backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
```
8 changes: 4 additions & 4 deletions docs/regressions/regressions-backgroundlinking19.md
@@ -20,8 +20,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection WashingtonPostCollection \
-input /path/to/wapo.v2 \
- -index indexes/lucene-index.wapo.v2/ \
  -generator WashingtonPostGenerator \
+ -index indexes/lucene-index.wapo.v2/ \
-threads 1 -storePositions -storeDocvectors -storeRaw \
>& logs/log.wapo.v2 &
```
@@ -45,21 +45,21 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v2/ \
-topics tools/topics-and-qrels/topics.backgroundlinking19.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v2.bm25.topics.backgroundlinking19.txt \
-backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v2/ \
-topics tools/topics-and-qrels/topics.backgroundlinking19.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v2.bm25+rm3.topics.backgroundlinking19.txt \
-backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v2/ \
-topics tools/topics-and-qrels/topics.backgroundlinking19.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v2.bm25+rm3+df.topics.backgroundlinking19.txt \
-backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
```
8 changes: 4 additions & 4 deletions docs/regressions/regressions-backgroundlinking20.md
@@ -20,8 +20,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection WashingtonPostCollection \
-input /path/to/wapo.v3 \
- -index indexes/lucene-index.wapo.v3/ \
  -generator WashingtonPostGenerator \
+ -index indexes/lucene-index.wapo.v3/ \
-threads 1 -storePositions -storeDocvectors -storeRaw \
>& logs/log.wapo.v3 &
```
@@ -45,21 +45,21 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v3/ \
-topics tools/topics-and-qrels/topics.backgroundlinking20.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v3.bm25.topics.backgroundlinking20.txt \
-backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v3/ \
-topics tools/topics-and-qrels/topics.backgroundlinking20.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v3.bm25+rm3.topics.backgroundlinking20.txt \
-backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.wapo.v3/ \
-topics tools/topics-and-qrels/topics.backgroundlinking20.txt \
- -topicreader BackgroundLinking \
+ -topicReader BackgroundLinking \
-output runs/run.wapo.v3.bm25+rm3+df.topics.backgroundlinking20.txt \
-backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
```
4 changes: 2 additions & 2 deletions docs/regressions/regressions-beir-v1.0.0-arguana-flat-wp.md
@@ -21,8 +21,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirFlatCollection \
-input /path/to/beir-v1.0.0-arguana-flat-wp \
- -index indexes/lucene-index.beir-v1.0.0-arguana-flat-wp/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-arguana-flat-wp/ \
-threads 1 -storePositions -storeDocvectors -storeRaw -pretokenized \
>& logs/log.beir-v1.0.0-arguana-flat-wp &
```
@@ -39,7 +39,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-arguana-flat-wp/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.wp.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-arguana-flat-wp.bm25.topics.beir-v1.0.0-arguana.test.wp.txt \
-bm25 -removeQuery -pretokenized &
```
4 changes: 2 additions & 2 deletions docs/regressions/regressions-beir-v1.0.0-arguana-flat.md
@@ -20,8 +20,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirFlatCollection \
-input /path/to/beir-v1.0.0-arguana-flat \
- -index indexes/lucene-index.beir-v1.0.0-arguana-flat/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-arguana-flat/ \
-threads 1 -storePositions -storeDocvectors -storeRaw \
>& logs/log.beir-v1.0.0-arguana-flat &
```
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-arguana-flat/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-arguana-flat.bm25.topics.beir-v1.0.0-arguana.test.txt \
-bm25 -removeQuery -hits 1000 &
```
@@ -21,8 +21,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirMultifieldCollection \
-input /path/to/beir-v1.0.0-arguana-multifield \
- -index indexes/lucene-index.beir-v1.0.0-arguana-multifield/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-arguana-multifield/ \
-threads 1 -storePositions -storeDocvectors -storeRaw -fields title \
>& logs/log.beir-v1.0.0-arguana-multifield &
```
@@ -39,7 +39,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-arguana-multifield/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-arguana-multifield.bm25.topics.beir-v1.0.0-arguana.test.txt \
-bm25 -removeQuery -hits 1000 -fields contents=1.0 title=1.0 &
```
@@ -48,8 +48,8 @@ Sample indexing command:
target/appassembler/bin/IndexCollection \
-collection JsonVectorCollection \
-input /path/to/beir-v1.0.0-arguana-splade_distil_cocodenser_medium \
- -index indexes/lucene-index.beir-v1.0.0-arguana-splade_distil_cocodenser_medium/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-arguana-splade_distil_cocodenser_medium/ \
-threads 16 -impact -pretokenized \
>& logs/log.beir-v1.0.0-arguana-splade_distil_cocodenser_medium &
```
@@ -71,7 +71,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-arguana-splade_distil_cocodenser_medium/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.splade_distil_cocodenser_medium.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-arguana-splade_distil_cocodenser_medium.splade_distil_cocodenser_medium.topics.beir-v1.0.0-arguana.test.splade_distil_cocodenser_medium.txt \
-impact -pretokenized -removeQuery -hits 1000 &
```
@@ -24,8 +24,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection JsonVectorCollection \
-input /path/to/beir-v1.0.0-arguana-unicoil-noexp \
- -index indexes/lucene-index.beir-v1.0.0-arguana-unicoil-noexp/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-arguana-unicoil-noexp/ \
-threads 16 -impact -pretokenized \
>& logs/log.beir-v1.0.0-arguana-unicoil-noexp &
```
@@ -42,7 +42,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-arguana-unicoil-noexp/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.unicoil-noexp.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-arguana-unicoil-noexp.unicoil-noexp.topics.beir-v1.0.0-arguana.test.unicoil-noexp.txt \
-impact -pretokenized -removeQuery -hits 1000 &
```
4 changes: 2 additions & 2 deletions docs/regressions/regressions-beir-v1.0.0-bioasq-flat-wp.md
@@ -21,8 +21,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirFlatCollection \
-input /path/to/beir-v1.0.0-bioasq-flat-wp \
- -index indexes/lucene-index.beir-v1.0.0-bioasq-flat-wp/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-bioasq-flat-wp/ \
-threads 1 -storePositions -storeDocvectors -storeRaw -pretokenized \
>& logs/log.beir-v1.0.0-bioasq-flat-wp &
```
@@ -39,7 +39,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-bioasq-flat-wp/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.wp.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-bioasq-flat-wp.bm25.topics.beir-v1.0.0-bioasq.test.wp.txt \
-bm25 -removeQuery -pretokenized &
```
4 changes: 2 additions & 2 deletions docs/regressions/regressions-beir-v1.0.0-bioasq-flat.md
@@ -20,8 +20,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirFlatCollection \
-input /path/to/beir-v1.0.0-bioasq-flat \
- -index indexes/lucene-index.beir-v1.0.0-bioasq-flat/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-bioasq-flat/ \
-threads 1 -storePositions -storeDocvectors -storeRaw \
>& logs/log.beir-v1.0.0-bioasq-flat &
```
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-bioasq-flat/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-bioasq-flat.bm25.topics.beir-v1.0.0-bioasq.test.txt \
-bm25 -removeQuery -hits 1000 &
```
4 changes: 2 additions & 2 deletions docs/regressions/regressions-beir-v1.0.0-bioasq-multifield.md
@@ -21,8 +21,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirMultifieldCollection \
-input /path/to/beir-v1.0.0-bioasq-multifield \
- -index indexes/lucene-index.beir-v1.0.0-bioasq-multifield/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-bioasq-multifield/ \
-threads 1 -storePositions -storeDocvectors -storeRaw -fields title \
>& logs/log.beir-v1.0.0-bioasq-multifield &
```
@@ -39,7 +39,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-bioasq-multifield/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-bioasq-multifield.bm25.topics.beir-v1.0.0-bioasq.test.txt \
-bm25 -removeQuery -hits 1000 -fields contents=1.0 title=1.0 &
```
@@ -49,8 +49,8 @@ Sample indexing command:
target/appassembler/bin/IndexCollection \
-collection JsonVectorCollection \
-input /path/to/beir-v1.0.0-bioasq-splade_distil_cocodenser_medium \
- -index indexes/lucene-index.beir-v1.0.0-bioasq-splade_distil_cocodenser_medium/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-bioasq-splade_distil_cocodenser_medium/ \
-threads 16 -impact -pretokenized \
>& logs/log.beir-v1.0.0-bioasq-splade_distil_cocodenser_medium &
```
@@ -72,7 +72,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-bioasq-splade_distil_cocodenser_medium/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.splade_distil_cocodenser_medium.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-bioasq-splade_distil_cocodenser_medium.splade_distil_cocodenser_medium.topics.beir-v1.0.0-bioasq.test.splade_distil_cocodenser_medium.txt \
-impact -pretokenized -removeQuery -hits 1000 &
```
@@ -24,8 +24,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection JsonVectorCollection \
-input /path/to/beir-v1.0.0-bioasq-unicoil-noexp \
- -index indexes/lucene-index.beir-v1.0.0-bioasq-unicoil-noexp/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-bioasq-unicoil-noexp/ \
-threads 16 -impact -pretokenized \
>& logs/log.beir-v1.0.0-bioasq-unicoil-noexp &
```
@@ -42,7 +42,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-bioasq-unicoil-noexp/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.unicoil-noexp.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-bioasq-unicoil-noexp.unicoil-noexp.topics.beir-v1.0.0-bioasq.test.unicoil-noexp.txt \
-impact -pretokenized -removeQuery -hits 1000 &
```
@@ -21,8 +21,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirFlatCollection \
-input /path/to/beir-v1.0.0-climate-fever-flat-wp \
- -index indexes/lucene-index.beir-v1.0.0-climate-fever-flat-wp/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-climate-fever-flat-wp/ \
-threads 1 -storePositions -storeDocvectors -storeRaw -pretokenized \
>& logs/log.beir-v1.0.0-climate-fever-flat-wp &
```
@@ -39,7 +39,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-climate-fever-flat-wp/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.wp.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-climate-fever-flat-wp.bm25.topics.beir-v1.0.0-climate-fever.test.wp.txt \
-bm25 -removeQuery -pretokenized &
```
@@ -20,8 +20,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirFlatCollection \
-input /path/to/beir-v1.0.0-climate-fever-flat \
- -index indexes/lucene-index.beir-v1.0.0-climate-fever-flat/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-climate-fever-flat/ \
-threads 1 -storePositions -storeDocvectors -storeRaw \
>& logs/log.beir-v1.0.0-climate-fever-flat &
```
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-climate-fever-flat/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-climate-fever-flat.bm25.topics.beir-v1.0.0-climate-fever.test.txt \
-bm25 -removeQuery -hits 1000 &
```
@@ -21,8 +21,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection BeirMultifieldCollection \
-input /path/to/beir-v1.0.0-climate-fever-multifield \
- -index indexes/lucene-index.beir-v1.0.0-climate-fever-multifield/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-climate-fever-multifield/ \
-threads 1 -storePositions -storeDocvectors -storeRaw -fields title \
>& logs/log.beir-v1.0.0-climate-fever-multifield &
```
@@ -39,7 +39,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-climate-fever-multifield/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-climate-fever-multifield.bm25.topics.beir-v1.0.0-climate-fever.test.txt \
-bm25 -removeQuery -hits 1000 -fields contents=1.0 title=1.0 &
```
@@ -49,8 +49,8 @@ Sample indexing command:
target/appassembler/bin/IndexCollection \
-collection JsonVectorCollection \
-input /path/to/beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium \
- -index indexes/lucene-index.beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium/ \
-threads 16 -impact -pretokenized \
>& logs/log.beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium &
```
@@ -72,7 +72,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.splade_distil_cocodenser_medium.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium.splade_distil_cocodenser_medium.topics.beir-v1.0.0-climate-fever.test.splade_distil_cocodenser_medium.txt \
-impact -pretokenized -removeQuery -hits 1000 &
```
@@ -24,8 +24,8 @@ Typical indexing command:
target/appassembler/bin/IndexCollection \
-collection JsonVectorCollection \
-input /path/to/beir-v1.0.0-climate-fever-unicoil-noexp \
- -index indexes/lucene-index.beir-v1.0.0-climate-fever-unicoil-noexp/ \
  -generator DefaultLuceneDocumentGenerator \
+ -index indexes/lucene-index.beir-v1.0.0-climate-fever-unicoil-noexp/ \
-threads 16 -impact -pretokenized \
>& logs/log.beir-v1.0.0-climate-fever-unicoil-noexp &
```
@@ -42,7 +42,7 @@ After indexing has completed, you should be able to perform retrieval as follows
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.beir-v1.0.0-climate-fever-unicoil-noexp/ \
-topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.unicoil-noexp.tsv.gz \
- -topicreader TsvString \
+ -topicReader TsvString \
-output runs/run.beir-v1.0.0-climate-fever-unicoil-noexp.unicoil-noexp.topics.beir-v1.0.0-climate-fever.test.unicoil-noexp.txt \
-impact -pretokenized -removeQuery -hits 1000 &
```