From cf44319add46ce605b694023ed215fc7302e9709 Mon Sep 17 00:00:00 2001
From: Shaheen Nabi <84982228+shaheennabi@users.noreply.github.com>
Date: Sun, 1 Dec 2024 13:03:35 -0800
Subject: [PATCH] Update README.md

---
 README.md | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index c50ed59..095239b 100644
--- a/README.md
+++ b/README.md
@@ -202,11 +202,13 @@ Remember: For this project **Pipeline** is going to be seprated in two different
 
 Loading  Model
 
-- **`max_seq_length`**: Specifies the maximum token length for inputs, here set to 2048 tokens.
-- **`dtype`**: Auto-detects the optimal data type for model weights, usually `float32` or `float16`.
-- **`load_in_4bit`**: Enables 4-bit quantization, reducing memory usage while preserving model performance.
-- **model_name** = unsloth/Llama-3.2-3B-Instruct, which will be used for fine-tuning and is sourced from Unsloth.
-*We are getting **quantized_model** and **tokenizer** by passing these params into **FastLanguageModel.from_pretrained**.*
+- **`max_seq_length`**: Specifies the maximum token length for inputs, set to 2048 tokens in this case.
+- **`dtype`**: Auto-detects the optimal data type for model weights, typically `float32` or `float16`.
+- **`load_in_4bit`**: Enables 4-bit quantization, reducing memory usage while maintaining model performance.
+- **`model_name`**: `unsloth/Llama-3.2-3B-Instruct`, which will be used for fine-tuning and is sourced from Unsloth.
+
+*We obtain the **quantized_model** and **tokenizer** by passing these parameters into **FastLanguageModel.from_pretrained**.*
+
 
 
 
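For context, a minimal sketch of the loading step the updated bullets describe, using Unsloth's `FastLanguageModel.from_pretrained`. The 2048-token limit, 4-bit flag, and model name mirror the README text; passing `dtype=None` is how Unsloth is typically asked to auto-detect the weight dtype.

```python
# Minimal sketch of the loading step described in the README bullets.
from unsloth import FastLanguageModel

max_seq_length = 2048   # maximum token length for inputs
dtype = None            # None lets Unsloth auto-detect the optimal dtype
load_in_4bit = True     # 4-bit quantization to reduce memory usage

# Returns the (quantized) model and its tokenizer.
quantized_model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)
```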