Add to onboarding reproduction logs #2621
Merged
Operating System and Setup:
I am using a MacBook Pro M1 with 16 GB of RAM, running macOS. The environment I used includes OpenJDK 21.0.4 and Maven 3.9.9.
Description of Issues and Solution:
While following the guide with the latest commit from the repository, I ran into errors during the build. Even when the build succeeded, running bin/run.sh to build the Lucene inverted index from the JSON collection threw an exception.
The error message I encountered was:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
at java.base/sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:340)
at java.base/sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:282)
...
On investigation, I found that the run.sh script references a file named anserini-0.38.1-SNAPSHOT-fatjar.jar, which is a development snapshot rather than a stable release.
Solution:
To resolve the issue, I manually downloaded the stable release of the fatjar:
wget https://repo1.maven.org/maven2/io/anserini/anserini/0.38.0/anserini-0.38.0-fatjar.jar
This allowed me to successfully build the inverted index.
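The snapshot check and the download can be sketched as two small shell helpers. The function names is_snapshot_jar and fatjar_url are hypothetical and not part of Anserini. The URL pattern is generalized from the single 0.38.0 URL used here, so its validity for other versions is an assumption.

```shell
# Hypothetical helper: succeeds if a jar file name looks like a development snapshot.
is_snapshot_jar() {
  case "$1" in
    *-SNAPSHOT*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: build the Maven Central URL for a released fatjar.
# The pattern is generalized from the 0.38.0 URL; other versions are assumed to follow it.
fatjar_url() {
  version="$1"
  echo "https://repo1.maven.org/maven2/io/anserini/anserini/${version}/anserini-${version}-fatjar.jar"
}
```

For example, `is_snapshot_jar anserini-0.38.1-SNAPSHOT-fatjar.jar && wget "$(fatjar_url 0.38.0)"` would fetch the release jar only when the referenced jar is a snapshot.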
However, when I ran the retrieval process and executed:
cut -f 1 runs/run.msmarco-passage.dev.small.tsv | uniq | wc
I noticed that only 5,000 queries were listed, instead of the expected 6,980.
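A slightly more defensive version of this check might look like the sketch below. It uses sort -u instead of uniq, since uniq only collapses adjacent duplicates and would overcount if the run file were not grouped by query. The function names count_qids and check_qids are hypothetical, and the sketch assumes the query ID is the first tab-separated column, as in the command above.

```shell
# Count distinct query IDs in the first tab-separated column of a run file.
count_qids() {
  cut -f 1 "$1" | sort -u | wc -l | tr -d '[:space:]'
}

# Compare the distinct query count against an expected value.
# Prints a message and returns non-zero on mismatch.
check_qids() {
  run_file="$1"
  expected="$2"
  actual="$(count_qids "$run_file")"
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual distinct query IDs"
  else
    echo "MISMATCH: found $actual distinct query IDs, expected $expected" >&2
    return 1
  fi
}
```

Running `check_qids runs/run.msmarco-passage.dev.small.tsv 6980` after retrieval would flag a shortfall like the one described here.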
To resolve this, I switched the entire repository to the stable commit for v0.38.0 (the same version as the stable fatjar). With that version, everything worked as expected, and I was able to reproduce the correct results.
Suggestions:
Overall, the guide was very helpful, but I would suggest: