This repository is a working example for the SISAP 2025 Indexing Challenge https://sisap-challenges.github.io/, using Julia and GitHub Actions, as specified in the task descriptions. It is based on the previous year's example.
The code for the evaluation stage is in the eval branch https://github.com/sisap-challenges/sisap25-example-julia/tree/eval.
It requires a working installation of Julia (v1.10.8 works best). Note that the v1.11 series has performance regressions for our similarity search package.
You can download Julia from https://julialang.org/downloads/. You also need git tools and internet access for cloning, installing package dependencies, and downloading datasets.
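Since the recommended version matters for performance, a quick check of the installed Julia can save time. The following is a minimal sketch, assuming `julia` is on your `PATH`; the helper name `is_julia_1_10` is hypothetical and not part of this repository.

```shell
# Sketch: warn when the installed Julia is not in the 1.10 series.
# `is_julia_1_10` is a hypothetical helper, not part of the repository.
is_julia_1_10() {
    # Expects the line printed by `julia --version`, e.g. "julia version 1.10.8".
    case "$1" in
        "julia version 1.10."*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v julia >/dev/null 2>&1; then
    version_line="$(julia --version)"
    if is_julia_1_10 "$version_line"; then
        echo "OK: $version_line"
    else
        echo "Warning: $version_line (the 1.10 series is recommended)" >&2
    fi
else
    echo "julia not found in PATH" >&2
fi
```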
The steps are the following:
- Clone the example repository
- Fetch the datasets
- Instantiate the project
- Run
- Evaluation
The full set of instructions is listed in the GitHub Actions workflow:
https://github.com/sisap-challenges/sisap25-example-julia/blob/main/.github/workflows/ci.yml
Note that you will need to adjust your scripts to use the correct hyperparameters for each benchmark you use, in particular pubmed23 and gooaq, which will be used in the testing stage.
git clone https://github.com/sisap-challenges/sisap25-example-julia
cd sisap25-example-julia
You can clone our dataset repository or fetch the datasets one by one. For instance, if you want to test this example you can download the smallest dataset, CCNEWS, even with reduced precision (fp16) and with a shortened gold standard (the main files contain
Run the prepare-data.sh script to download the necessary CCNEWS data. Note that we expect to use this small dataset for your GitHub Actions.
For your participation and experimentation you may want to clone the entire dataset repository https://huggingface.co/datasets/sadit/SISAP2025/ as follows:
git clone https://huggingface.co/datasets/sadit/SISAP2025
The current example looks for a data directory, so you can rename the local repository or create a symbolic link:
ln -s SISAP2025 data
Julia requires preparing the working directory through an instantiation, i.e., installing dependencies, as follows:
JULIA_PROJECT=. JULIA_NUM_THREADS=8 julia -e 'using Pkg; Pkg.instantiate()'
You need internet access for this step.
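Because the example looks for a data directory, it can help to confirm that the directory is in place before the run step that follows. This is a small sketch under the assumption that the dataset was cloned into `SISAP2025`; the function name `ensure_data_dir` is hypothetical.

```shell
# Sketch: make sure a `data` entry exists, linking it to the cloned dataset if needed.
# `ensure_data_dir` is a hypothetical helper, not part of the repository.
ensure_data_dir() {
    src="$1"            # directory holding the cloned dataset repository
    link="${2:-data}"   # name the example looks for (defaults to "data")
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "$link already exists"
    elif [ -d "$src" ]; then
        ln -s "$src" "$link" && echo "linked $link -> $src"
    else
        echo "dataset directory $src not found" >&2
        return 1
    fi
}

# Usage (run from the repository root):
#   ensure_data_dir SISAP2025 data
```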
A similar procedure is needed to run; note that Julia may compile many packages in the first run, so please be patient.
JULIA_PROJECT=. JULIA_NUM_THREADS=8 julia -L sisap2025.jl -e 'main_task1(); main_task2()'
You should adjust the number of threads to match your hardware; you can also try adding optimization flags to julia, e.g., julia -O3 -Cnative.
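One way to pick the thread count is to read the number of online CPUs from the system. The sketch below assumes `getconf _NPROCESSORS_ONLN` is available (it is on common Linux and macOS systems) and falls back to 1 otherwise; `detect_threads` is a hypothetical helper name, and the final run command is left commented out.

```shell
# Sketch: derive a value for JULIA_NUM_THREADS from the number of online CPUs.
# `detect_threads` is a hypothetical helper, not part of the repository.
detect_threads() {
    n="$(getconf _NPROCESSORS_ONLN 2>/dev/null)"
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer: fall back to one thread
    esac
    echo "$n"
}

threads="$(detect_threads)"
echo "Would run with JULIA_NUM_THREADS=$threads"
# JULIA_PROJECT=. JULIA_NUM_THREADS="$threads" julia -O3 -Cnative -L sisap2025.jl -e 'main_task1(); main_task2()'
```

Using every available core is not always the fastest choice, so treat the detected value as a starting point for your own measurements.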
JULIA_PROJECT=. julia -L eval.jl -e 'eval_task1()'
JULIA_PROJECT=. julia -L eval.jl -e 'eval_task2()'
Two result files will be created: result-task1.csv and result-task2.csv.
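After the evaluation, a quick sanity check that both result files were written can catch a failed run early. This is only a sketch: it reports whether each file exists and how many lines it has, and makes no assumption about the columns inside. The helper name `check_result` is hypothetical.

```shell
# Sketch: report whether a result file exists and how many lines it contains.
# `check_result` is a hypothetical helper, not part of the repository.
check_result() {
    file="$1"
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    lines="$(wc -l < "$file" | tr -d ' ')"
    echo "$file: $lines lines"
}

for f in result-task1.csv result-task2.csv; do
    check_result "$f" || true   # keep going so both files are reported
done
```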
You can fork this repository and polish it to create your solution. Please also take care of the ci workflow (see below).
You can monitor your runs in the "Actions" tab of the GitHub panel; for instance, you can see some runs of this repository: https://github.com/sisap-challenges/sisap25-example-julia/actions