Installation

Set up a Python environment and run pip install -r requirements.txt.

Usage

Translation and intervention

python3 main_rejection_experiment.py --model_name meta-llama/Llama-2-7b-hf

Results are saved in the out_iclr folder.

Plotting

python plot_rejection_experiment.py --model_name meta-llama/Llama-2-7b-hf

Plots the probability assigned to the target word when translating the source word, for each pair of languages.
Plots the same probability when we intervene and project out the unembedding vectors for the correct word in the latent language.
Plots the same probability when we intervene and project out the unembedding vectors for a random word in the latent language, and the correct word in the target language.
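
To make "project out" concrete, here is a minimal pure-Python sketch. It assumes the intervention is an orthogonal projection of a hidden state onto the complement of the span of the chosen unembedding vectors. The function names (dot, orthonormal_basis, project_out) and the toy numbers are hypothetical and do not come from this repository, whose scripts may implement the step differently.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal vectors spanning the same subspace as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        # Subtract the components along the basis vectors found so far.
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        # Skip vectors that are (numerically) linearly dependent on earlier ones.
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def project_out(hidden, vectors):
    """Remove from `hidden` its component in the span of `vectors`."""
    result = list(hidden)
    for b in orthonormal_basis(vectors):
        coeff = dot(result, b)
        result = [ri - coeff * bi for ri, bi in zip(result, b)]
    return result


# Toy 3-dimensional example with made-up numbers (not real model weights).
hidden = [2.0, 1.0, 3.0]
# Hypothetical unembedding rows for the tokens of one latent-language word.
unembed_latent = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

cleaned = project_out(hidden, unembed_latent)

# After the projection, `cleaned` is orthogonal to every removed vector, so a
# dot-product readout against those unembedding rows gives (near) zero.
for u in unembed_latent:
    print(abs(dot(cleaned, u)) < 1e-9)
```

Orthonormalizing first matters when a word spans several tokens: subtracting the component along each raw unembedding vector in turn would not, in general, remove the whole subspace unless those vectors happen to be orthogonal.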

Acknowledgements

The starting point of this repo was Nina Rimsky's Llama-2 wrapper.

Citation

@article{wendler2024llamas,
  title={Do Llamas Work in English? On the Latent Language of Multilingual Transformers},
  author={Wendler, Chris and Veselovsky, Veniamin and Monea, Giovanni and West, Robert},
  journal={arXiv preprint arXiv:2402.10588},
  year={2024}
}

davidquarel/llm-latent-language
