diff --git a/README.md b/README.md
index ecdbd1d..9d6462a 100644
--- a/README.md
+++ b/README.md
@@ -7,11 +7,10 @@ necessary used for model training.
 
 - `language_selection`: notebooks and file with language to file extensions mapping used to build the Stack v1.1.
 - `pii`: code for running PII detection and anonymization on code datasets.
+- `decontamination`: script to remove files that match test samples from code generation benchmarks.
 - `preprocessing`: code for filtering code datasets based on:
   - line length and percentage of alphanumeric characters (basic filter)
-  - number of stars.
-  - comments to code ratio.
-  - tokenizer fertility
+  - number of stars, comments-to-code ratio, and tokenizer fertility
   - Additionnal filters used for StarCoder Training:
     - basic-filter with parameters that depend on the file's extension.
     - filter to remove XML files
@@ -20,6 +19,5 @@ necessary used for model training.
     - code to generate full-content with meta (repo-name, filename, num stars) for training
   - Filters for GitHub Issues
   - Filters for Git Commits
-  - Script to convert Jupyter notebooks to scripts
-  - Scripts to convert Jupyter notebooks to structured markdown-code-output triplets
-- `decontamination`: script to remove files that match test-samples from code generation benchmarks.
+  - Code to convert Jupyter notebooks to scripts
+  - Code to convert Jupyter notebooks to structured markdown-code-output triplets