diff --git a/README.md b/README.md
index 2f3061c86e..9852b4f3c2 100644
--- a/README.md
+++ b/README.md
@@ -106,19 +106,20 @@ See individual pages for details!
 
 ### Available Corpora for Download
 
-| Corpora | Size | Checksum |
-|:------------------------------------------------------------------------------------------------------------------------------------------|-------:|:-----------------------------------|
-| [Quantized BM25](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-bm25-b8.tar) | 1.2 GB | `0a623e2c97ac6b7e814bf1323a97b435` |
-| [uniCOIL (noexp)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-unicoil-noexp.tar) | 2.7 GB | `f17ddd8c7c00ff121c3c3b147d2e17d8` |
-| [uniCOIL (d2q-T5)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-unicoil.tar) | 3.4 GB | `78eef752c78c8691f7d61600ceed306f` |
-| [uniCOIL (TILDE)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-unicoil-tilde-expansion.tar) | 3.9 GB | `12a9c289d94e32fd63a7d39c9677d75c` |
-| [DeepImpact](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-deepimpact.tar) | 3.6 GB | `73843885b503af3c8b3ee62e5f5a9900` |
-| [SPLADEv2](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-distill-splade-max.tar) | 9.9 GB | `b5d126f5d9a8e1b3ef3f5cb0ba651725` |
-| [SPLADE++ CoCondenser-EnsembleDistil](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-splade-pp-ed.tar) | 4.2 GB | `e489133bdc54ee1e7c62a32aa582bc77` |
-| [SPLADE++ CoCondenser-SelfDistil](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-splade-pp-sd.tar) | 4.8 GB | `cb7e264222f2bf2221dd2c9d28190be1` |
-| [cosDPR-distil](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-cos-dpr-distil.tar) | 57 GB | `e20ffbc8b5e7f760af31298aefeaebbd` |
-| [BGE-base-en-v1.5](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-bge-base-en-v1.5.tar) | 59 GB | `353d2c9e72e858897ad479cca4ea0db1` |
-| [OpenAI-ada2](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-openai-ada2.tar) | 109 GB | `a4d843d522ff3a3af7edbee789a63402` |
+| Corpora | Size | Checksum |
+|:-------------------------------------------------------------------------------------------------------------------|-------:|:-----------------------------------|
+| [Quantized BM25](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-bm25-b8.tar) | 1.2 GB | `0a623e2c97ac6b7e814bf1323a97b435` |
+| [uniCOIL (noexp)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-unicoil-noexp.tar) | 2.7 GB | `f17ddd8c7c00ff121c3c3b147d2e17d8` |
+| [uniCOIL (d2q-T5)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-unicoil.tar) | 3.4 GB | `78eef752c78c8691f7d61600ceed306f` |
+| [uniCOIL (TILDE)](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-unicoil-tilde-expansion.tar) | 3.9 GB | `12a9c289d94e32fd63a7d39c9677d75c` |
+| [DeepImpact](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-deepimpact.tar) | 3.6 GB | `73843885b503af3c8b3ee62e5f5a9900` |
+| [SPLADEv2](https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/data/msmarco-passage-distill-splade-max.tar) | 9.9 GB | `b5d126f5d9a8e1b3ef3f5cb0ba651725` |
+| [SPLADE++ CoCondenser-EnsembleDistil](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-splade-pp-ed.tar) | 4.2 GB | `e489133bdc54ee1e7c62a32aa582bc77` |
+| [SPLADE++ CoCondenser-SelfDistil](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-splade-pp-sd.tar) | 4.8 GB | `cb7e264222f2bf2221dd2c9d28190be1` |
+| [cosDPR-distil](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-cos-dpr-distil.tar) | 57 GB | `e20ffbc8b5e7f760af31298aefeaebbd` |
+| [BGE-base-en-v1.5](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-bge-base-en-v1.5.tar) | 59 GB | `353d2c9e72e858897ad479cca4ea0db1` |
+| [OpenAI-ada2](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-openai-ada2.tar) | 109 GB | `a4d843d522ff3a3af7edbee789a63402` |
+| [Cohere embed-english-v3.0](https://rgw.cs.uwaterloo.ca/pyserini/data/msmarco-passage-cohere-embed-english-v3.tar) | 38 GB | `6b7d9795806891b227378f6c290464a9` |
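
Reviewer note: the Checksum column in the table above holds 32-hex-digit values, which look like MD5 digests, though the diff does not say so explicitly. Under that assumption, a downloaded tarball could be checked roughly as follows. The helper names `md5_of_file` and `matches_checksum` are hypothetical and not part of Anserini.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with the value listed in the table."""
    # Assumption: the table's Checksum column is an MD5 hex digest.
    return md5_of_file(path) == expected_hex.strip().lower()


# Example (assumes the tarball was downloaded to the working directory):
# matches_checksum(
#     "msmarco-passage-cohere-embed-english-v3.tar",
#     "6b7d9795806891b227378f6c290464a9",
# )
```

Reading in chunks matters here because the larger corpora run to tens of gigabytes and would not fit in memory in a single read.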
diff --git a/docs/regressions-log.md b/docs/regressions-log.md
index 92da0ba744..89dc3303e5 100644
--- a/docs/regressions-log.md
+++ b/docs/regressions-log.md
@@ -3,6 +3,21 @@
 The following change log details commits to regression tests that alter effectiveness and the addition of new regression tests.
 This documentation is useful for figuring why results may have changed over time.
 
+### February 14, 2024
+
++ commit [`ce8c2a`](https://github.com/castorini/anserini/commit/ce8c2a901f1104cf0205ac7abdcf36776824c09c) (2024/02/14)
++ commit [`8d4a7f`](https://github.com/castorini/anserini/commit/8d4a7f1a5c85a653b9ec245041e94dc4a62f68ce) (2024/02/14)
++ commit [`9a5bb6`](https://github.com/castorini/anserini/commit/9a5bb6050833daecaa541f045db6b9d64171fd07) (2024/02/12)
++ commit [`57d262`](https://github.com/castorini/anserini/commit/57d2627e57d17121465bd535c8f654877142b9cc) (2024/02/11)
++ commit [`f86a65`](https://github.com/castorini/anserini/commit/f86a65f43eb15d88b7a003a1edf541d9d60c3056) (2024/02/09)
++ commit [`f2e2ac`](https://github.com/castorini/anserini/commit/f2e2ac35a5408d644ad402b75df602b7d929731d) (2024/01/25)
+
+New regressions:
+
++ SPLADE++ ED w/ ONNX for BEIR
++ BGE with original and quantized HNSW indexes for BEIR (only pre-encoded queries)
++ Cohere embed-english-v3 for MS MARCO passage dev (but not DL19 or DL20)
+
 ### January 21, 2024
 
 + commit [`ca20dc`](https://github.com/castorini/anserini/commit/ca20dcda6ecf12930092f26416bad715baa861f2) (1/21/2024)