This tool was developed as part of a bachelor thesis to automatically recognize cookie notices on websites.
- Chromium (or Google Chrome) browser
- Python3
- pipenv
$ pipenv install
First, start the browser in automation mode using the debugging port 9222.
For Mac users:
$ ./run-chromium.sh
On a Linux server, create a display first and then start the browser:
$ sudo apt install xvfb
$ sudo Xvfb :10 -ac -screen 0 1400x950x24 &
$ DISPLAY=:10 ./run-chromium.sh
Afterwards, stop the display again:
$ jobs -l
$ sudo kill PID_OF_JOB
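Before running the scan, it can help to confirm that the browser is actually listening on port 9222. The following is a minimal sketch, not part of this tool. It assumes that run-chromium.sh starts the browser with Chromium's standard remote debugging interface, which serves a JSON document at /json/version. The function names `version_url` and `browser_info` are hypothetical helpers made up for this example.

```python
import http.client
import json
import urllib.error
import urllib.request


def version_url(port=9222, host="localhost"):
    """Build the URL of the DevTools version endpoint (assumed to be enabled)."""
    return f"http://{host}:{port}/json/version"


def browser_info(port=9222, host="localhost", timeout=3.0):
    """Return the browser's version info as a dict, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(version_url(port, host), timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON answer: treat as "not running".
        return None


if __name__ == "__main__":
    info = browser_info()
    if info is None:
        print("No browser found on port 9222; start it with ./run-chromium.sh first.")
    else:
        print("Connected to", info.get("Browser", "unknown browser"))
```

If the check reports no browser, the scan script will most likely not be able to connect either.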
Then, run the script scan.py:
$ pipenv run python scan.py
The script scan.py has multiple options, including a help option:
$ pipenv run python scan.py --help
usage: scan.py [-h] [--dataset [DATASET]] [--start [START_RANK]]
               [--end [END_RANK]] [--results [RESULTS_DIRECTORY]] [--click]

Scans a list of domains, identifies cookie notices and evaluates them.

optional arguments:
  -h, --help            show this help message and exit
  --dataset [DATASET]   the set of domains to scan: `1` for the top 2000
                        domains, `2` for domains in file `resources/sampled-
                        domains.txt`
  --start [START_RANK]  the rank to start the scanning from, including the
                        given rank (default: 1)
  --end [END_RANK]      the rank to end the scanning at, including the given
                        rank, -1 if the dataset should be scanned to the end
                        (default: -1)
  --results [RESULTS_DIRECTORY]
                        the directory to store the the results in (default:
                        `results`)
  --click               whether links and buttons in the detected cookie
                        notices should be clicked and analyzed or not
                        (default: false)
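Because --start and --end are both inclusive, a long scan can be split into consecutive rank ranges that neither overlap nor leave gaps. The sketch below only builds the command lines for such batches and prints them; it does not run anything. The helper `batch_commands` and the batch size are hypothetical and not part of scan.py, and whether several batches can safely share one results directory is an assumption that should be checked before relying on it.

```python
def batch_commands(first_rank, last_rank, batch_size, dataset=1, results="results"):
    """Build one scan.py command per inclusive rank range [start, end]."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    commands = []
    start = first_rank
    while start <= last_rank:
        # --end is inclusive, so a batch of N ranks ends at start + N - 1.
        end = min(start + batch_size - 1, last_rank)
        commands.append([
            "pipenv", "run", "python", "scan.py",
            "--dataset", str(dataset),
            "--start", str(start),
            "--end", str(end),
            "--results", results,
        ])
        start = end + 1
    return commands


if __name__ == "__main__":
    # Example: split ranks 1 to 250 of dataset 1 into batches of 100.
    for command in batch_commands(1, 250, 100):
        print("$ " + " ".join(command))
```

Each printed line can be run on its own, which makes it easier to resume a scan after an interruption by starting again from the first unfinished range.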