Commit

add readme
loubnabnl committed Aug 24, 2023
1 parent 1f1b896 commit 5cd79f9
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions pii/ner/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# Fine-tuning Bigcode-Encoder on an NER task for PII detection

To train on the full `bigcode/pii-full-ds` dataset, run the following command, replacing `number_of_gpus` with the number of GPUs on your machine:
```bash
python -m torch.distributed.launch \
--nproc_per_node number_of_gpus train.py \
--dataset_name bigcode/pii-full-ds \
--debug \
--learning_rate 2e-5 \
--train_batch_size 8 \
--bf16 \
--add_not_curated
```
Note that we use a global batch size of 64 (`--train_batch_size 8` on each of 8 GPUs). To train on the curated dataset only, remove the `--add_not_curated` flag.
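
If you train on a different number of GPUs, the batch-size arithmetic above can be sketched as follows. This is a minimal illustration and not part of `train.py`: the helper names are hypothetical, and it assumes the global batch size is simply the per-GPU batch size times the number of GPUs, with no gradient accumulation.

```python
def global_batch_size(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Samples per optimizer step across all GPUs (assumes no gradient accumulation)."""
    if per_gpu_batch_size <= 0 or num_gpus <= 0:
        raise ValueError("batch size and GPU count must be positive")
    return per_gpu_batch_size * num_gpus


def per_gpu_batch_size_for(target_global: int, num_gpus: int) -> int:
    """Per-GPU batch size needed to reach a target global batch size."""
    if num_gpus <= 0 or target_global <= 0:
        raise ValueError("target batch size and GPU count must be positive")
    if target_global % num_gpus != 0:
        raise ValueError("target global batch size must be divisible by the GPU count")
    return target_global // num_gpus


# The setup described above: 8 samples per GPU on 8 GPUs.
print(global_batch_size(8, 8))        # 64
# Keeping the same global batch size on 4 GPUs.
print(per_gpu_batch_size_for(64, 4))  # 16
```

For example, to keep a global batch size of 64 on 4 GPUs under these assumptions, you would pass `--train_batch_size 16`.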
