Commit b824c1a

fix typo
1 parent 1187a93 commit b824c1a

1 file changed, with 1 addition and 1 deletion

pii/README.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ We provide code to detect Names, Emails, IP addresses, Passwords API/SSH keys in
 For the **NER** model based approach (e.g [StarPII](https://huggingface.co/bigcode/starpii)), please go to the `ner` folder.
 
 We provide the code used for training a PII NER model to detect : Names, Emails, Keys, Passwords & IP addresses (more details in our paper: [StarCoder: May The Source Be With You](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view)). You will also find the code (and `slurm` scripts) used for running PII Inference on [StarCoderData](https://huggingface.co/datasets/bigcode/starcoderdata), we were able to detect PII in 800GB of text in 800 GPU-hours on A100 80GB. To replace secrets we used teh following tokens:
-<NAME>, <EMAIL>, <KEY>, <PASSWORD>
+`<NAME>, <EMAIL>, <KEY>, <PASSWORD>`
 To mask IP addresses, we randomly selected an IP address from 5~synthetic, private, non-internet-facing IP addresses of the same type.
 
 ## Regex approach
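
For readers of this hunk, the replacement scheme the README describes (fixed tokens for names, emails, keys and passwords, and a random private address of the same type for IPs) could look roughly like the sketch below. It is not the repository's code: the `redact` and `mask_ip` helpers, the `(start, end, tag)` span format, the `IP_ADDRESS` tag name, and the concrete addresses in the two pools are all assumptions made for illustration.

```python
import ipaddress
import random

# Placeholder tokens quoted in the README hunk above.
TOKENS = {"NAME": "<NAME>", "EMAIL": "<EMAIL>", "KEY": "<KEY>", "PASSWORD": "<PASSWORD>"}

# Hypothetical pools: the README only says 5 synthetic private addresses of
# each type were used; these specific values are illustrative assumptions.
IPV4_POOL = ["172.16.31.10", "172.16.58.3", "192.168.127.12", "192.168.3.11", "10.0.0.5"]
IPV6_POOL = ["fd00::1", "fd00::2", "fd00::3", "fd00::4", "fd00::5"]


def mask_ip(value, rng):
    """Pick a random private address with the same IP version as `value`."""
    version = ipaddress.ip_address(value).version
    return rng.choice(IPV4_POOL if version == 4 else IPV6_POOL)


def redact(text, entities, rng=None):
    """Replace (start, end, tag) spans in `text`; spans must not overlap."""
    rng = rng or random.Random()
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, tag in sorted(entities, reverse=True):
        if tag == "IP_ADDRESS":  # assumed tag name for IP spans
            replacement = mask_ip(text[start:end], rng)
        else:
            replacement = TOKENS[tag]
        text = text[:start] + replacement + text[end:]
    return text
```

Calling `redact("mail bob@x.io from 8.8.8.8", [(5, 13, "EMAIL"), (19, 26, "IP_ADDRESS")])` would replace the email with `<EMAIL>` and the public address with one of the five IPv4 entries in the assumed pool.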
