Implementation of the paper "Extracting Training Data from Large Language Models" (Carlini et al., 2020)
- (Optional) Change the model type and hyperparameters in `config.yaml` (a sketch of an assumed layout follows the list below)
- Text sampling from the victim language model
  - Run `python inference.py` for single-GPU generation from the victim language model (a minimal generation sketch follows this list)
  - Run `python parallel_inference.py` for faster multi-GPU generation from the victim language model
- Run `python rerank.py` to retrieve possibly memorized text-sequence candidates (a hedged reranking sketch follows this list)
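For reference, here is a minimal sketch of how `config.yaml` might be read; the key names and default values below are assumptions for illustration, not the repository's actual schema.

```python
# Minimal config-loading sketch; all key names are hypothetical.
import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

model_name = config.get("model_name", "gpt2-large")  # hypothetical key
num_samples = config.get("num_samples", 10000)       # hypothetical key
top_k = config.get("top_k", 40)                      # hypothetical key
```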
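As a rough picture of what the generation step does, the following is a minimal single-GPU sampling sketch using Hugging Face `transformers`. The prompt, sampling settings, and output handling are assumptions, not the actual contents of `inference.py`.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-large")
model = GPT2LMHeadModel.from_pretrained("gpt2-large").to(device).eval()

# Sample several continuations from a short prefix with top-k sampling,
# roughly following the paper's sampling setup (exact values assumed).
inputs = tokenizer("The", return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        do_sample=True,
        top_k=40,
        max_length=256,
        num_return_sequences=8,
        pad_token_id=tokenizer.eos_token_id,
    )
samples = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
```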
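Reranking scores each sample by how likely it is to be memorized. One metric from Carlini et al. compares the victim model's perplexity against the sample's zlib compression entropy; the sketch below assumes that metric and is not necessarily what `rerank.py` implements.

```python
import math
import zlib

import torch

def perplexity(model, tokenizer, text, device="cuda"):
    # Exponentiated average negative log-likelihood under the victim model.
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return math.exp(loss.item())

def zlib_score(model, tokenizer, text):
    # Low model perplexity relative to zlib entropy hints at memorization.
    zlib_entropy = len(zlib.compress(text.encode("utf-8")))
    return math.log(perplexity(model, tokenizer, text)) / zlib_entropy

# Rank candidates: a smaller score is more suspicious under this heuristic.
# candidates = sorted(samples, key=lambda t: zlib_score(model, tokenizer, t))
```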
- Prevents oversampling during prefix selection
- Speeds up inference with parallel multi-GPU usage (gpt2-large only)
- Frees GPU VRAM after each task finishes (see the sketch after this list)
- Filters out low-quality repeated generations with a repetition penalty and an n-gram repetition restriction (see the sketch after this list)
- Supports the T5 encoder-decoder as the victim model
- Speeds up reranking with parallel multi-GPU usage
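For the repeated-generation filter, `repetition_penalty` and `no_repeat_ngram_size` are standard arguments of `model.generate` in Hugging Face `transformers`. Continuing the generation sketch above, the specific values here are illustrative assumptions.

```python
# Discourage degenerate repetition during sampling (values assumed).
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_k=40,
    max_length=256,
    repetition_penalty=1.2,   # down-weights tokens that already appeared
    no_repeat_ngram_size=3,   # forbids repeating any 3-gram verbatim
    pad_token_id=tokenizer.eos_token_id,
)
```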
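Freeing VRAM between tasks in PyTorch usually comes down to dropping GPU references and emptying the CUDA caching allocator; a minimal sketch (the helper name is hypothetical):

```python
import gc

import torch

def release_model(model):
    # Hypothetical helper: move weights off the GPU, drop the local
    # reference, collect garbage, then return cached CUDA blocks
    # to the driver so the next task starts with free VRAM.
    model.to("cpu")
    del model
    gc.collect()
    torch.cuda.empty_cache()
```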