This repository contains the official implementation of our paper:
EmbedLLM: Learning Compact Representations of Large Language Models (ICLR 2025 Spotlight)
By Richard Zhuang, Tianhao Wu, Zhaojin Wen, Andrew Li, Jiantao Jiao, and Kannan Ramchandran

An illustration of the EmbedLLM pipeline: an embedder network is pretrained on sample question-answer pairs from a pool of LLMs, mapping each model to a vector embedding. Downstream applications such as model routing are then handled by training an additional linear layer on top of these embeddings.
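As a rough illustration of that second stage, the sketch below shows how a linear head on top of frozen model embeddings could score candidate LLMs for routing. The shapes, tensor names, and dot-product scoring rule are assumptions for illustration only, not the repository's actual interface.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: M candidate LLMs, each with a d-dimensional embedding
# learned in the first stage; questions are represented by q-dimensional vectors.
M, d, q = 112, 232, 768

model_embeddings = torch.randn(M, d)   # frozen output of the embedder (placeholder)
question_embedding = torch.randn(q)    # embedding of an incoming question (placeholder)

class Router(nn.Module):
    """Scores every LLM for a given question from the two embeddings."""
    def __init__(self, model_dim, question_dim):
        super().__init__()
        # A single linear layer projects the question into the model-embedding space.
        self.proj = nn.Linear(question_dim, model_dim)

    def forward(self, model_emb, question_emb):
        # Dot product between the projected question and every model embedding
        # gives one routing score per candidate LLM.
        return model_emb @ self.proj(question_emb)

router = Router(d, q)
scores = router(model_embeddings, question_embedding)  # shape: (M,)
best_model = scores.argmax().item()                    # route to the highest-scoring LLM
```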
- Clone this repository:

```bash
git clone https://github.com/yourusername/EmbedLLM.git
cd EmbedLLM
```
- Create and activate the conda environment (remember to change the `prefix` path in `environment.yml` to the path of your conda environment):

```bash
conda env create -f environment.yml
conda activate embedllm
```
The dataset used in our experiments is available on HuggingFace: https://huggingface.co/datasets/RZ412/EmbedLLM
We provide a script to download a subset of the correctness data used to train our models:

```bash
cd data_preprocessing
python download_data.py
```
For the full dataset (including training sets of various sizes and trained model embeddings), please refer to the HuggingFace page above.
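If you would rather fetch everything programmatically, a minimal sketch using `huggingface_hub` is shown below; the exact file layout of the dataset repository is not described here, so check the dataset card before relying on specific paths.

```python
from huggingface_hub import snapshot_download

# Download the full EmbedLLM dataset repository (correctness data, training
# sets of various sizes, and trained model embeddings) to the local cache.
local_dir = snapshot_download(repo_id="RZ412/EmbedLLM", repo_type="dataset")
print(f"Dataset files downloaded to: {local_dir}")
```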
To transform the benchmark questions into embeddings:

```bash
cd data_preprocessing
python get_question_embedding_tensor.py
```
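The exact encoder used by `get_question_embedding_tensor.py` is not documented here; as a hedged sketch of the general idea, one could embed the questions with a sentence-transformer and save the result as a single tensor. The encoder choice and output path below are placeholders.

```python
import torch
from sentence_transformers import SentenceTransformer

# Placeholder questions; in practice these come from the benchmark data.
questions = [
    "What is the capital of France?",
    "Solve for x: 2x + 3 = 11.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")               # placeholder encoder choice
embeddings = encoder.encode(questions, convert_to_tensor=True)  # shape: (num_questions, dim)

torch.save(embeddings, "question_embeddings.pt")                # illustrative output path
```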
To train a KNN model and evaluate its performance on correctness forecasting (a simplified sketch of the forecasting setup follows the argument list below):

```bash
cd algorithm
python knn.py
```
Key arguments:
- `--input-format`: Choose between 'tensor' or 'csv' input format (default: 'csv')
- `--num-neighbors`: Number of neighbors for KNN (default: 131)
- `--save-tensors`: Save processed CSV data as tensors for faster future loading
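For intuition, one way to forecast correctness with KNN is: given a model and an unseen question, find the k training questions closest to it in embedding space and vote over that model's known correctness on them. The snippet below is a toy sketch with random data and made-up shapes, not the logic of `knn.py` itself.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy data: 500 training questions with 64-dim embeddings, and one model's
# observed correctness (1 = answered correctly, 0 = incorrect) on each.
train_question_emb = rng.normal(size=(500, 64))
train_correctness = rng.integers(0, 2, size=500)

# Fit a KNN classifier for this model; 131 neighbors mirrors knn.py's default.
knn = KNeighborsClassifier(n_neighbors=131)
knn.fit(train_question_emb, train_correctness)

# Forecast the probability of a correct answer on new questions.
test_question_emb = rng.normal(size=(10, 64))
predicted_prob_correct = knn.predict_proba(test_question_emb)[:, 1]
print(predicted_prob_correct)
```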
To train our Matrix Factorization model and evaluate its performance (a simplified sketch of the factorization follows the argument list below):

```bash
cd algorithm
python mf.py
```
Key arguments:
- `--embedding-dim`: Dimension of model embeddings (default: 232)
- `--alpha`: Noise level for regularization (default: 0.05)
- `--batch-size`: Training batch size (default: 2048)
- `--num-epochs`: Number of training epochs (default: 50)
- `--eval-mode`: Evaluation mode, 'correctness' or 'router' (default: 'correctness')
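At a high level, matrix factorization here learns one embedding per model and one per question such that their interaction predicts whether the model answers the question correctly. The sketch below uses random toy data, a plain dot-product interaction, and BCE loss; the actual `mf.py` may differ in the interaction term, the `--alpha` noise regularization, and data handling.

```python
import torch
import torch.nn as nn

class MatrixFactorization(nn.Module):
    """Predicts P(model answers question correctly) from learned embeddings."""
    def __init__(self, num_models, num_questions, embedding_dim=232):
        super().__init__()
        self.model_emb = nn.Embedding(num_models, embedding_dim)
        self.question_emb = nn.Embedding(num_questions, embedding_dim)

    def forward(self, model_ids, question_ids):
        # Dot product of the two embeddings gives a correctness logit.
        return (self.model_emb(model_ids) * self.question_emb(question_ids)).sum(dim=-1)

# Toy training loop on random (model, question, correct?) triples.
num_models, num_questions = 112, 1000
mf = MatrixFactorization(num_models, num_questions, embedding_dim=232)
optimizer = torch.optim.Adam(mf.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

model_ids = torch.randint(0, num_models, (2048,))
question_ids = torch.randint(0, num_questions, (2048,))
labels = torch.randint(0, 2, (2048,)).float()

for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(mf(model_ids, question_ids), labels)
    loss.backward()
    optimizer.step()
```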
If you find this code useful for your research, please cite our paper:
```bibtex
@inproceedings{
  zhuang2025embedllm,
  title={Embed{LLM}: Learning Compact Representations of Large Language Models},
  author={Richard Zhuang and Tianhao Wu and Zhaojin Wen and Andrew Li and Jiantao Jiao and Kannan Ramchandran},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=Fs9EabmQrJ}
}
```