Set up Anaconda environment:
conda env create -f environment.yml
Activate environment:
conda activate text-forensics
- Experiment results are stored in the results/ directory
- paper-analysis.ipynb reproduces the figures/tables from the paper using these stored results
- Contents of the results/ folder can be reproduced by downloading the datasets into the datasets/ directory, preprocessing the Amazon dataset (others are already formatted), and then running the experiments.py code using the commands in run_experiments.sh
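To illustrate what preprocessing a review dataset into authorship verification (AV) pairs can look like, here is a minimal sketch. It is not the code in process_amazon.py: the function name `make_av_pairs`, the `(author_id, text)` record layout, and the one-to-one balance of same-author and different-author pairs are all assumptions made for this example.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_av_pairs(records, seed=0):
    """Build (text_a, text_b, label) triples from (author_id, text) records.

    Hypothetical helper for illustration only.
    label 1 = same author, label 0 = different authors.
    """
    rng = random.Random(seed)

    # Group texts by author.
    by_author = defaultdict(list)
    for author, text in records:
        by_author[author].append(text)

    authors = sorted(by_author)
    pairs = []
    for author in authors:
        others = [a for a in authors if a != author]
        for text_a, text_b in combinations(by_author[author], 2):
            # Same-author pair.
            pairs.append((text_a, text_b, 1))
            # One matching different-author pair, if another author exists.
            if others:
                other = rng.choice(others)
                pairs.append((text_a, rng.choice(by_author[other]), 0))
    return pairs


if __name__ == "__main__":
    demo = [("a1", "x1"), ("a1", "x2"), ("a2", "y1"), ("a2", "y2")]
    for pair in make_av_pairs(demo):
        print(pair)
```

Fixing the seed keeps the sampled different-author pairs reproducible across runs, which matters when stored results are meant to be regenerated.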
- datasets/: this directory contains the downloaded datasets
  - DarkReddit Zenodo link (must request access from original authors Manolache et al. 2023)
  - SilkRoad Zenodo link (must request access from original authors Manolache et al. 2023)
  - Agora Zenodo link (must request access from original authors Manolache et al. 2023)
  - Amazon DropBox link (publicly accessible, original authors Halvani et al. 2017, link retrieved from Ishihara 2021)
  - the Amazon review dataset preprocessed into pair format
- preprocessing/:
  - process_amazon.py: Python code for preprocessing the Amazon dataset into AV pairs
- models/:
  - embeddings.py: implements the pre-trained authorship embedding models
  - models.py: defines the SLR classes and implements training/test methods
- utils/lexical.py: implements the lexical/character features used in the manual SLR
- experiments.py: Python code for running the experiments
- base_logger.py: sets up logging for status updates on experiments
- run_experiments.sh: script with command line arguments for running experiments
- results/: stores experiment results
- paper-analysis.ipynb: notebook for reproducing results from the paper
- figs/: stores figures from analysis
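
As a rough idea of what a character-level feature can look like, the sketch below computes character n-gram relative frequencies and an L1 distance between two texts. It is an assumption-laden illustration, not the feature set in utils/lexical.py: `char_ngram_profile`, `profile_distance`, and the choice of trigrams are hypothetical.

```python
from collections import Counter


def char_ngram_profile(text, n=3):
    """Return the relative frequency of each character n-gram in `text`.

    Hypothetical helper; texts shorter than n yield an empty profile.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}


def profile_distance(p, q):
    """L1 distance between two profiles (0.0 means identical profiles)."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))


if __name__ == "__main__":
    a = char_ngram_profile("the cat sat on the mat")
    b = char_ngram_profile("the dog sat on the log")
    print(profile_distance(a, b))
```

A distance like this could serve as one possible score for a text pair; which features and scores the manual SLR actually uses is defined in the repository code.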