Merge branch 'development' of https://github.com/shaheennabi/Producti…
shaheennabi committed Dec 1, 2024
2 parents 96d4ecc + fa547c3 commit 96ca6b1
# 🎋🌿 **Production-Ready Instruction Fine-Tuning of Meta LLaMA 3.2 3B Instruct Project** 🌿🎉

## **Problem Statement**
---
To achieve this, we are leveraging the **Hugging Face dataset** `charanhu/kannada-instruct-dataset-390k`.

### **My Role as a Developer** 🎋

As a developer, I am responsible for delivering an instruction fine-tuned **LLaMA 3.2 3B** model that aligns with the defined **Key Performance Indicator (KPI)** objectives and ensures exceptional performance for Kannada-speaking users.

- I will **instruct fine-tune** the model using the high-quality **Kannada dataset** from **Hugging Face** (`charanhu/kannada-instruct-dataset-390k`).

- To address the constraints of **limited GPU resources**, I will implement **QLoRA-based 4-bit precision quantization** using **Unsloth**, which involves:
- First **quantizing the model** to 4-bit precision to reduce computational overhead.
- Adding **LoRA (Low-Rank Adaptation) layers** to fine-tune the model efficiently within **Google Colab**, ensuring optimal resource utilization without compromising performance.

Remember: for this project, the **Pipeline** is separated into two different pipelines.
![Finetuning Pipeline](https://github.com/user-attachments/assets/bc09764b-b5a1-4614-b872-cc6d9cd88bdc)


*Note: The fine-tuning code is fully modular, but I used **Google Colab** for training. If you have a high-end machine, make sure you execute the **pipeline** in its modular fashion.*

## Fine-tuning Pipeline 💥
---
### Installing the Required Libraries
* Unsloth can be troublesome to install, so execute these code cells one by one, in the sequence below, to avoid dependency problems.

```bash
# Run this first (cell 1)
!python -m pip install --upgrade pip
!pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!pip install xformers[torch2] # Install xformers built for PyTorch 2.x
!pip install "unsloth[colab] @ git+https://github.com/unslothai/unsloth.git"
!pip install "git+https://github.com/huggingface/transformers.git"
!pip install trl
!pip install boto3
```

---
```bash
# Run this cell (cell 2)
!pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 # Upgrade PyTorch to a compatible version
!pip install xformers # Install xformers after upgrading PyTorch
```

```bash
# cell 3
!pip uninstall torch torchvision torchaudio -y # Uninstall existing PyTorch, torchvision, and torchaudio
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 # Install PyTorch, torchvision, and torchaudio with CUDA 11.8
```

```bash
# cell 4
!pip uninstall xformers -y
!pip install xformers[torch2] # Install xformers built for PyTorch 2.x
```
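Once the four cells above have run, a quick sanity check (not part of the original notebook, just a suggested extra) can confirm that the CUDA 11.8 build of PyTorch is active and that xformers imports cleanly before moving on:

```python
import torch
print(torch.__version__, torch.version.cuda)   # expect a cu118 build of PyTorch
print(torch.cuda.is_available())               # should be True on a Colab GPU runtime

import xformers
print(xformers.__version__)                    # confirms xformers installed against this PyTorch
```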

### Importing Necessary Libraries

<img width="656" alt="Importing Necessary Libraries" src="https://github.com/user-attachments/assets/dfb4fdee-0513-4202-b5d1-167e15689354">

### Loading the Model

<img width="640" alt="Loading Model" src="https://github.com/user-attachments/assets/89013450-1bb1-4a29-9ad4-2a620004064e">

### Applying LoRA Layers

<img width="620" alt="Applying Lora" src="https://github.com/user-attachments/assets/062a2115-d24d-4ede-9c83-2fc9665cdaa1">

### Data Preparation

<img width="920" alt="Dataset Preparation" src="https://github.com/user-attachments/assets/869f6569-df05-455f-bd7e-ba71dc036593">

### Data Formatting (what the model expects for instruction tuning)

<img width="920" alt="Prompt Formatting" src="https://github.com/user-attachments/assets/58f7c5cf-945a-43d7-a9cf-670eee3261e6">


### Training Configurations

<img width="614" alt="Training Configuration" src="https://github.com/user-attachments/assets/956acc04-ac6f-497b-9c12-9cc33b70301b">


### Model Training

<img width="856" alt="Model Training" src="https://github.com/user-attachments/assets/075ee343-8412-4ad4-bb4b-dd569663c4fd">

### Inference

<img width="713" alt="Inference 1" src="https://github.com/user-attachments/assets/189c2d17-9026-4cb3-bdfb-95435b075fae">

<img width="901" alt="Inference 2" src="https://github.com/user-attachments/assets/ea31462b-9e1c-4575-9120-5390cfbc23e2">

### Saving the Model & Tokenizer

<img width="453" alt="Saving the model and tokenizer" src="https://github.com/user-attachments/assets/f6eb0858-f51e-452d-a65b-83945537e487">

### Merging the Base Model & Fine-tuned LoRA Layers

<img width="557" alt="Merge base model and finetuned layers" src="https://github.com/user-attachments/assets/15d66a2b-dfb9-471c-8fe0-9b13640d45e4">


### Pushing the Model & Tokenizer to an S3 Bucket


<img width="399" alt="Pushing to s3 1" src="https://github.com/user-attachments/assets/06948b95-59a6-4ad5-b530-90e075cc88f9">


<img width="527" alt="Pushing to s3 2" src="https://github.com/user-attachments/assets/2d944deb-b2f1-475a-834e-d462bb08fffb">

<img width="505" alt="Pushing to s3 3" src="https://github.com/user-attachments/assets/7fd11f13-57f2-43b0-b3e9-918e89b91b12">


---


## Ok, so now let's talk about the Deployment/Inference Pipeline 🚀

*This is a diagram of how the pipeline will look:*
