We assume you have followed the instructions in the main Readme file to create the StopsGB dataset from scratch. If so, you only need to run the following script (warning: it will take several hours):
python apply_to_all_stations.py
This runs the whole linking pipeline (finding candidates with DeezyMatch, retrieving features for each candidate, and outputting the final dataset). The resulting dataset (stored as station-to-station/processed/resolution/StopsGB.tsv) is also available on the British Library research repository.
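
Once the script has finished, a quick sanity check is to look at the first few rows of the output file. The snippet below is a minimal sketch using only the Python standard library. The helper name `preview_tsv` is hypothetical and not part of this repository, and the sketch assumes the file is UTF-8 encoded and that its first line is a header row, neither of which is confirmed here.

```python
import csv
from pathlib import Path

# Path taken from the text above; adjust it if your checkout lives elsewhere.
STOPSGB_PATH = Path("station-to-station/processed/resolution/StopsGB.tsv")


def preview_tsv(path, n_rows=5):
    """Return the first line (assumed header) and up to n_rows following rows.

    Hypothetical helper for illustration; not part of the repository.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        # Assumption: the first line of the file holds the column names.
        header = next(reader, [])
        # zip stops once n_rows rows have been read, so the rest is not loaded.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


if __name__ == "__main__":
    if STOPSGB_PATH.exists():
        header, rows = preview_tsv(STOPSGB_PATH)
        print(header)
        for row in rows:
            print(row)
    else:
        print(f"{STOPSGB_PATH} not found; run apply_to_all_stations.py first.")
```

Reading only a handful of rows keeps the check fast regardless of the size of the file.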