Update TREC-COVID docs to note recent regression changes (#2202)

castorini · Sep 21, 2023 · 3407954
1 parent 444eacc
commit 3407954

Showing 2 changed files with 36 additions and 4 deletions.
diff --git a/docs/experiments-covid-doc2query.md b/docs/experiments-covid-doc2query.md
@@ -1,13 +1,26 @@
 # TREC-COVID doc2query Baselines
 
-**Important Note (Lucene 8 to Lucene 9 Upgrade):**
+**Important reproducibility notes:**
 Anserini was upgraded to Lucene 9.3 at commit [`272565`](https://github.com/castorini/anserini/commit/27256551e958f39495b04e89ef55de9d27f33414) (8/2/2022).
 This upgrade created backward compatibility issues (see [#1952](https://github.com/castorini/anserini/issues/1952)), which means that the runs described on this page cannot be _exactly_ reproduced with Lucene 9 code running on Lucene 8 indexes (since we need to disable consistent tie-breaking).
 
-Thus, this page is no longer being maintained.
+Following the Lucene upgrade, this page is no longer being maintained.
 For reproducibility purposes, however, runs with Lucene 8 (at v0.14.4) and Lucene 9 (at [`5480dc`](https://github.com/castorini/anserini/commit/5480dc88d0bfdd2cb67ef0ca4271223ed13c1ea5)) are captured and stored [here](../src/main/python/trec-covid/logs).
 There are only minor differences in effectiveness between the two sets of runs.
 
+In September 2023, the regression code was refactored such that the following commands run successfully (commits [`88935f`](https://github.com/castorini/anserini/commit/88935fc9431dbb81d55883547c185c4d1f44bf36) and [`444eac`](https://github.com/castorini/anserini/commit/444eacc20e18edec472ad1a673b90f57dc60266d)):
+
+```bash
+python src/main/python/trec-covid/download_doc2query_indexes.py --date 2020-07-16 &
+python src/main/python/trec-covid/download_doc2query_indexes.py --date 2020-06-19 &
+
+nohup python src/main/python/trec-covid/generate_round5_doc2query_baselines.py >& logs/log.trec-covid.round5-docTTTTTquery &
+nohup python src/main/python/trec-covid/generate_round4_doc2query_baselines.py >& logs/log.trec-covid.round4-docTTTTTquery &
+```
+
+Specifically, the effectiveness of the runs generated by the scripts matches the scores encoded in the scripts.
+However, the scores vary (in most cases, only slightly) from the scores reported below.
+
 ---
 
 This document describes various doc2query baselines for the [TREC-COVID Challenge](https://ir.nist.gov/covidSubmit/), which uses the [COVID-19 Open Research Dataset (CORD-19)](https://pages.semanticscholar.org/coronavirus-research) from the [Allen Institute for AI](https://allenai.org/).

diff --git a/docs/experiments-covid.md b/docs/experiments-covid.md
@@ -1,13 +1,32 @@
 # TREC-COVID Baselines
 
-**Important Note (Lucene 8 to Lucene 9 Upgrade):**
+**Important reproducibility notes:**
 Anserini was upgraded to Lucene 9.3 at commit [`272565`](https://github.com/castorini/anserini/commit/27256551e958f39495b04e89ef55de9d27f33414) (8/2/2022).
 This upgrade created backward compatibility issues (see [#1952](https://github.com/castorini/anserini/issues/1952)), which means that the runs described on this page cannot be _exactly_ reproduced with Lucene 9 code running on Lucene 8 indexes (since we need to disable consistent tie-breaking).
 
-Thus, this page is no longer being maintained.
+Following the Lucene upgrade, this page is no longer being maintained.
 For reproducibility purposes, however, runs with Lucene 8 (at v0.14.4) and Lucene 9 (at [`5480dc`](https://github.com/castorini/anserini/commit/5480dc88d0bfdd2cb67ef0ca4271223ed13c1ea5)) are captured and stored [here](../src/main/python/trec-covid/logs).
 There are only minor differences in effectiveness between the two sets of runs.
 
+In September 2023, the regression code was refactored such that the following commands run successfully (commits [`88935f`](https://github.com/castorini/anserini/commit/88935fc9431dbb81d55883547c185c4d1f44bf36) and [`444eac`](https://github.com/castorini/anserini/commit/444eacc20e18edec472ad1a673b90f57dc60266d)):
+
+```bash
+python src/main/python/trec-covid/download_indexes.py --date 2020-07-16 &
+python src/main/python/trec-covid/download_indexes.py --date 2020-06-19 &
+python src/main/python/trec-covid/download_indexes.py --date 2020-05-19 &
+python src/main/python/trec-covid/download_indexes.py --date 2020-05-01 &
+python src/main/python/trec-covid/download_indexes.py --date 2020-04-10 &
+
+nohup python src/main/python/trec-covid/generate_round5_baselines.py >& logs/log.trec-covid.round5 &
+nohup python src/main/python/trec-covid/generate_round4_baselines.py >& logs/log.trec-covid.round4 &
+nohup python src/main/python/trec-covid/generate_round3_baselines.py >& logs/log.trec-covid.round3 &
+nohup python src/main/python/trec-covid/generate_round2_baselines.py >& logs/log.trec-covid.round2 &
+nohup python src/main/python/trec-covid/generate_round1_baselines.py >& logs/log.trec-covid.round1 &
+```
+
+Specifically, the effectiveness of the runs generated by the scripts matches the scores encoded in the scripts.
+However, the scores vary (in most cases, only slightly) from the scores reported below.
+
 ---
 
 This document describes various baselines for the [TREC-COVID Challenge](https://ir.nist.gov/covidSubmit/), which uses the [COVID-19 Open Research Dataset (CORD-19)](https://pages.semanticscholar.org/coronavirus-research) from the [Allen Institute for AI](https://allenai.org/).
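
The commands added to `docs/experiments-covid.md` launch each download and each baseline script as a background job. For readers who would rather run them one after another, the following is a minimal sketch of a sequential driver. The script paths, the `--date` flag, and the dates come from the diff above; the helper names (`download_commands`, `baseline_commands`, `run_all`) and the dry-run default are illustrative assumptions and are not part of Anserini.

```python
import subprocess

# Paths and dates are copied from the commands in the diff above.
SCRIPT_DIR = "src/main/python/trec-covid"
DATES = ["2020-07-16", "2020-06-19", "2020-05-19", "2020-05-01", "2020-04-10"]
ROUNDS = [5, 4, 3, 2, 1]


def download_commands(dates=DATES):
    """Build one download_indexes.py invocation per index date (hypothetical helper)."""
    return [["python", f"{SCRIPT_DIR}/download_indexes.py", "--date", d] for d in dates]


def baseline_commands(rounds=ROUNDS):
    """Build one generate_roundN_baselines.py invocation per round (hypothetical helper)."""
    return [["python", f"{SCRIPT_DIR}/generate_round{r}_baselines.py"] for r in rounds]


def run_all(dry_run=True):
    """Run all downloads, then all baseline scripts, strictly in sequence.

    With dry_run=True the commands are only printed, so the sketch can be
    inspected without touching the network or any index.
    """
    for cmd in download_commands() + baseline_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` from the repository root would execute the same scripts as the documented commands, but without `nohup`, backgrounding, or the log redirection, so output goes to the terminal.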