Scripts to fetch RIBs from Route Views and RIPE RIS and merge their information into a single prefix-to-AS mapping. Allows creation of both current and historical mappings by specifying the corresponding timestamp.
The mapping is conservative by default. The following prefixes are ignored:
- Prefixes with origin AS sets encoded in the RIB
- Prefixes for which peers disagree on the origin (multi-origin prefixes)
- Prefixes that are not globally reachable according to Python's ipaddress module, which is based on the IANA Special-Purpose Address Registries (IPv4; IPv6).
In addition, the merging step can require that a prefix be seen by a minimum number or ratio of collectors before it is included in the final radix tree.
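The global-reachability filter from the list above can be illustrated with the standard library alone. This is a minimal sketch, not the repository's code: the helper name `is_globally_reachable` is hypothetical, and it assumes the check is applied to the prefix as a whole via the `is_global` property of `ipaddress` network objects.

```python
import ipaddress


def is_globally_reachable(prefix: str) -> bool:
    """Return True if ipaddress considers the whole prefix globally reachable.

    Hypothetical helper for illustration; the scripts may apply the
    check differently.
    """
    # strict=False tolerates prefixes with host bits set, e.g. "10.0.0.1/8".
    network = ipaddress.ip_network(prefix, strict=False)
    return network.is_global


if __name__ == "__main__":
    # One public prefix, one private (RFC 1918) prefix, one documentation prefix.
    for prefix in ("8.8.8.0/24", "10.0.0.0/8", "2001:db8::/32"):
        print(prefix, is_globally_reachable(prefix))
```

Prefixes for which the helper returns False would be left out of the mapping.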
You can either install the dependencies or use Docker.
Install the required dependencies.
pip install -r requirements.txt
bgpkit-parser is required to read the RIB files.
The Docker services come in two flavors:
- ribexplorer-mount: Mount the folders in this directory in the container. Enables direct file access.
- ribexplorer-volume: Folders are mounted from Docker volumes instead.
In both cases the merged folder from this directory is mounted in the container.
All scripts can be either called directly or via docker compose. The Docker syntax
starts with a command (see below) and all following parameters are passed to the Python script.
Build an index of all currently available RIS and Route Views collectors.
# Direct
python3 ./build-index.py
# Docker
docker compose run --rm ribexplorer-mount index
Use the index file to download RIBs for all available collectors for the specified timestamp.
# Direct
python3 ./fetch-snapshots.py YYYY-mm-ddTHH:MM
# Docker
docker compose run --rm ribexplorer-mount fetch YYYY-mm-ddTHH:MM
Notes:
- By default, the script fetches data with four threads in parallel. Use -n to adjust the number of threads.
- If no file that matches the exact timestamp is found, the next-closest is used, up to a certain threshold. The default maximum difference is 24 hours, but can be changed by adjusting max_timestamp_difference in fetchers/__init__.py.
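The fallback to the next-closest file can be sketched as follows. This is an illustration under assumptions, not the code in fetchers/__init__.py: the function name `pick_closest` is hypothetical, "closest" is taken to mean the smallest absolute time difference in either direction, and a file exactly at the threshold is assumed to be accepted.

```python
from datetime import datetime, timedelta


def pick_closest(requested, available, max_difference=timedelta(hours=24)):
    """Return the available timestamp closest to `requested`.

    Returns None if nothing is available or if even the closest
    candidate is farther away than `max_difference`.
    """
    if not available:
        return None
    closest = min(available, key=lambda ts: abs(ts - requested))
    if abs(closest - requested) > max_difference:
        return None
    return closest


if __name__ == "__main__":
    requested = datetime(2024, 1, 1, 0, 0)
    candidates = [datetime(2023, 12, 31, 22, 0), datetime(2024, 1, 1, 8, 0)]
    print(pick_closest(requested, candidates))
```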
Transform the downloaded RIBs to radix trees.
# Direct
python3 ./transform-snapshots.py YYYY-mm-ddTHH:MM
# Docker
docker compose run --rm ribexplorer-mount transform YYYY-mm-ddTHH:MM
Notes:
- Like above, the number of parallel threads (for computation this time) can be adjusted with the -n parameter.
- The timestamp difference threshold can be adjusted with the --max-timestamp-difference parameter to set the maximum difference in hours.
- During the transformation some sanitization is applied as well. Prefixes with origin AS sets are ignored and singleton sets of the form {ASXXXX} are resolved. In addition, if the peers of a collector disagree about the origin for a prefix, it is also ignored. There are no AS sets in the produced radix trees.
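The sanitization rules in the last note can be sketched with plain Python data structures. This is a rough illustration, not the transform script itself: the function names are hypothetical, origins are assumed to arrive as strings such as "64496", "{AS64496}", or "{64496,64497}", and a prefix is assumed to be dropped as soon as any peer reports a multi-member AS set for it. A dict stands in for the radix tree.

```python
def normalize_origin(origin: str):
    """Return the origin ASN as a string, or None for a multi-member AS set."""
    origin = origin.strip()
    if origin.startswith("{") and origin.endswith("}"):
        members = [m.strip() for m in origin[1:-1].split(",") if m.strip()]
        if len(members) != 1:
            return None  # real AS set: cannot be resolved to one origin
        origin = members[0]  # singleton set: resolve to its only member
    return origin.removeprefix("AS")  # requires Python 3.9+


def collector_origins(routes):
    """Map prefix -> {'as': asn} from (prefix, origin) pairs of one collector.

    Prefixes with an AS-set origin or with disagreeing peers are left out.
    """
    seen = {}
    for prefix, raw_origin in routes:
        seen.setdefault(prefix, set()).add(normalize_origin(raw_origin))
    return {
        prefix: {"as": next(iter(origins))}
        for prefix, origins in seen.items()
        if len(origins) == 1 and None not in origins
    }
```

With this sketch, a prefix announced as "64496" by one peer and "{64496}" by another is kept, since both resolve to the same origin.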
Merge the radix trees into a single file.
# Direct
python3 ./create-merged-rtree.py YYYY-mm-ddTHH:MM output.pickle.bz2
# Docker
docker compose run --rm ribexplorer-mount create YYYY-mm-ddTHH:MM output.pickle.bz2
Notes:
- The output file is created in the /merged folder by default. Use the --output-dir parameter to change this location (does not work with Docker).
- If collectors disagree about the origin for a prefix, that prefix is ignored.
- A minimum number or ratio of collectors can be specified using the --min-collector-ratio or --min-collector-count parameters. If a prefix is seen by fewer collectors, it is ignored.
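The merge rules in the notes above can be sketched as follows. This is a simplified illustration, not the logic of create-merged-rtree.py: the function name `merge_collectors` and its parameters are hypothetical, plain dicts stand in for the radix trees, the ratio is assumed to be relative to the number of collectors being merged, and both thresholds are assumed to apply together when both are given.

```python
def merge_collectors(per_collector, min_count=0, min_ratio=0.0):
    """Merge {collector: {prefix: asn}} into {prefix: merged_entry}.

    A prefix is dropped if collectors disagree on its origin or if it
    is seen by fewer collectors than the thresholds require.
    """
    origins = {}  # prefix -> set of origin ASNs reported by collectors
    seen_by = {}  # prefix -> list of collectors that see the prefix
    for collector, mapping in per_collector.items():
        for prefix, asn in mapping.items():
            origins.setdefault(prefix, set()).add(asn)
            seen_by.setdefault(prefix, []).append(collector)

    total = len(per_collector)
    merged = {}
    for prefix, asns in origins.items():
        if len(asns) != 1:
            continue  # collectors disagree about the origin
        collectors = seen_by[prefix]
        if len(collectors) < min_count:
            continue
        if total and len(collectors) / total < min_ratio:
            continue
        merged[prefix] = {
            "as": next(iter(asns)),
            "seen_by_collectors": tuple(sorted(collectors)),
        }
    return merged
```

Calling `merge_collectors(trees, min_count=2)` on three per-collector mappings would keep only the prefixes that at least two collectors see with the same origin.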
The transformed (per RIB) radix trees follow our usual structure:
{'as': str(asn)}
The merged radix tree includes additional information about the collectors that see each prefix:
{
'as': str(asn),
'seen_by_collectors': tuple(str(collector), ...)
}