Llama LOL

An experiment in making a funny LLM

Blog: https://philliphaeusler.com/posts/llama_lol/

Train

For this you will need a large GPU with more than 20 GB of VRAM.

python3 train.py
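
Before starting a training run, it can help to confirm that the GPU meets the VRAM requirement above. The following is a minimal sketch, not part of this repository: it assumes an NVIDIA GPU with the nvidia-smi tool on the PATH, and it treats "20 GB" as 20 GiB. The function names are hypothetical.

```python
import subprocess

# Assumption: "more than 20 GB" is read as more than 20 GiB (20 * 1024 MiB).
MIN_VRAM_MIB = 20 * 1024


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one total-memory value (MiB) per GPU, skipping non-numeric lines."""
    values = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for total memory per GPU; return [] if it cannot be run."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_vram_mib(result.stdout)


def has_enough_vram(vram_mib: list[int], minimum: int = MIN_VRAM_MIB) -> bool:
    """True if at least one GPU has more than the minimum amount of memory."""
    return any(v > minimum for v in vram_mib)


if __name__ == "__main__":
    gpus = query_vram_mib()
    print("GPU memory (MiB):", gpus)
    print("Enough VRAM for training:", has_enough_vram(gpus))
```

If the check reports too little memory, train.py may still start but is likely to fail with an out-of-memory error.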

To get new jokes, run

python3 sample.py

You can scrape more data from YouTube with python yt.py

Just add additional videos to the Python script.

Download and build whisper.cpp for higher-quality transcription

git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
bash ./models/download-ggml-model.sh large
make

Copy the transcript .txt files into /data
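
The copy step can also be scripted. This is a minimal sketch under assumptions: the transcripts are .txt files sitting in a single source folder, and "/data" refers to the data directory of this repository. The function name copy_transcripts and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def copy_transcripts(src_dir: str, data_dir: str = "data") -> list[str]:
    """Copy every .txt file in src_dir into data_dir; return the copied names."""
    src, dst = Path(src_dir), Path(data_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if missing
    copied = []
    for txt in sorted(src.glob("*.txt")):   # only transcripts, not audio files
        shutil.copy2(txt, dst / txt.name)
        copied.append(txt.name)
    return copied
```

For example, copy_transcripts("whisper.cpp/samples") would copy any .txt files from that folder into data; the source path here is only an illustration.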

Clean up the data
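
The README does not say what the cleanup involves. As one possible starting point, the sketch below assumes the transcripts contain non-speech annotations such as [Music] or (applause), irregular whitespace, and repeated adjacent lines, which are common in speech-to-text output. The function name clean_transcript is hypothetical, and the rules should be adjusted to the actual data.

```python
import re

# Assumption: bracketed or parenthesized spans are non-speech annotations.
ANNOTATION = re.compile(r"\[[^\]]*\]|\([^)]*\)")
WHITESPACE = re.compile(r"\s+")


def clean_transcript(text: str) -> str:
    """Drop annotations, collapse whitespace, and skip empty or repeated lines."""
    cleaned = []
    for line in text.splitlines():
        line = WHITESPACE.sub(" ", ANNOTATION.sub(" ", line)).strip()
        # Keep the line only if it has content and differs from the previous one.
        if line and (not cleaned or line != cleaned[-1]):
            cleaned.append(line)
    return "\n".join(cleaned)
```

Note that this also removes genuine parenthetical remarks, so it may be too aggressive for some transcripts.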
