Commit

fix typo
loubnabnl authored Aug 24, 2023
1 parent 1187a93 commit b824c1a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion pii/README.md
@@ -5,7 +5,7 @@ We provide code to detect Names, Emails, IP addresses, Passwords API/SSH keys in
For the **NER** model based approach (e.g [StarPII](https://huggingface.co/bigcode/starpii)), please go to the `ner` folder.

We provide the code used for training a PII NER model to detect : Names, Emails, Keys, Passwords & IP addresses (more details in our paper: [StarCoder: May The Source Be With You](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view)). You will also find the code (and `slurm` scripts) used for running PII Inference on [StarCoderData](https://huggingface.co/datasets/bigcode/starcoderdata), we were able to detect PII in 800GB of text in 800 GPU-hours on A100 80GB. To replace secrets we used the following tokens:
-<NAME>, <EMAIL>, <KEY>, <PASSWORD>
+`<NAME>, <EMAIL>, <KEY>, <PASSWORD>`
To mask IP addresses, we randomly selected an IP address from 5 synthetic, private, non-internet-facing IP addresses of the same type.
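
The replacement scheme described above could be sketched roughly as follows. This is an illustrative sketch, not the repository's code: the tag names (including `IP_ADDRESS`), the `redact` and `replacement_for` helpers, the `(start, end, tag)` span format, and the addresses in the two pools are all assumptions made for the example. "Same type" is interpreted here as IPv4 versus IPv6.

```python
import ipaddress
import random

# Hypothetical pools of private, non-internet-facing addresses.
# These values are illustrative, not the ones used by the authors.
IPV4_POOL = ["192.168.0.10", "192.168.1.20", "10.0.0.5", "10.1.2.3", "172.16.0.7"]
IPV6_POOL = ["fd00::1", "fd00::2", "fd12:3456:789a::1", "fd00:abcd::5", "fdff::9"]

# Fixed placeholder tokens for the non-IP entity types (tag names are assumed).
TOKENS = {"NAME": "<NAME>", "EMAIL": "<EMAIL>", "KEY": "<KEY>", "PASSWORD": "<PASSWORD>"}


def replacement_for(tag, value, rng):
    """Return the text that should replace one detected entity."""
    if tag == "IP_ADDRESS":
        # Keep the address family: IPv4 stays IPv4, IPv6 stays IPv6.
        version = ipaddress.ip_address(value).version
        return rng.choice(IPV4_POOL if version == 4 else IPV6_POOL)
    return TOKENS[tag]


def redact(text, entities, rng=None):
    """Replace each (start, end, tag) span in text with its placeholder."""
    rng = rng or random.Random()
    # Work from the last span to the first so earlier offsets stay valid.
    for start, end, tag in sorted(entities, reverse=True):
        text = text[:start] + replacement_for(tag, text[start:end], rng) + text[end:]
    return text
```

In this sketch, the spans would come from whatever detector is in use (the NER model or the regexes); `redact` only handles the substitution step.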

## Regex approach