Update README.md
shaheennabi authored Dec 3, 2024
1 parent b3d7ca8 commit f47b4c9
Showing 1 changed file with 1 addition and 1 deletion.
README.md (1 addition & 1 deletion)
@@ -149,7 +149,7 @@ Remember: For this project the **Pipeline** is going to be separated in two different
*Note: The fine-tuning code will be entirely modular, but I used **Google Colab** for training; if you have a high-end machine, make sure you execute the **pipeline** in a modular fashion.*

## Fine-tuning Pipeline 💥
-**Note:** The fine-tuning pipeline code is modularized in the `src/finetuning` folder of this repository. If you have access to **high-performance resources** like AWS SageMaker or high-end GPUs, you can execute the modularized files in sequence: start with the **Trainer** to fine-tune the model, then proceed to **Inference** for generating predictions, followed by the **Merge Models** file to combine the fine-tuned model with the base model, and finally, use the **Push to S3** script to upload the final model and tokenizer to your S3 bucket. However, if you lack access to higher-end GPUs or a cloud budget, I recommend using **Google Colab's free tier**. In this case, skip the modularized part and directly execute the provided Jupyter Notebook to fine-tune the model, then upload the `model` and `tokenizer` directly to S3 from the Colab notebook. **Caution:** The modularized pipeline has not been tested thoroughly because I do not have access to high-end compute resources. If you encounter issues while running the pipeline, please raise an issue in the repository, and I will address it immediately.
+**Note:** The fine-tuning pipeline code is modularized in the `src/finetuning` folder of this repository. If you have access to **high-performance resources** like AWS SageMaker or high-end GPUs, you can execute the modularized files in sequence: start with the **Trainer** to fine-tune the model, then proceed to **Inference** for generating predictions, followed by the **Merge Models** file to combine the fine-tuned model with the base model, and finally, use the **Push to S3** script to upload the final model and tokenizer to your S3 bucket. However, if you lack access to higher-end GPUs or a cloud budget, I recommend using **Google Colab's free tier**. In this case, skip the modularized part and directly execute the provided Jupyter Notebook inside `notebooks/` to fine-tune the model, then upload the `model` and `tokenizer` directly to S3 from the Colab notebook. **Caution:** The modularized pipeline has not been tested thoroughly because I do not have access to high-end compute resources. If you encounter issues while running the pipeline, please raise an issue in the repository, and I will address it immediately.
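
The execution order described above can be driven by a short script. The following is a minimal sketch only: the script filenames under `src/finetuning` are assumptions inferred from the step names (Trainer, Inference, Merge Models, Push to S3) and may not match the repository exactly.

```python
# Minimal sketch of the modularized fine-tuning pipeline order.
# The filenames below are assumptions based on the step descriptions;
# check src/finetuning for the actual script names before running.
import subprocess

STEPS = [
    "src/finetuning/trainer.py",       # 1. fine-tune the model
    "src/finetuning/inference.py",     # 2. generate predictions with the fine-tuned model
    "src/finetuning/merge_models.py",  # 3. merge the fine-tuned model with the base model
    "src/finetuning/push_to_s3.py",    # 4. upload the final model and tokenizer to your S3 bucket
]

for script in STEPS:
    # check=True stops the pipeline immediately if any step fails
    subprocess.run(["python", script], check=True)
```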

---
### Installing the required libraries
