EmbedLLM: Learning Compact Representations of Large Language Models

This repository contains the official implementation of our paper:

EmbedLLM: Learning Compact Representations of Large Language Models (ICLR 2025 Spotlight)

By Richard Zhuang, Tianhao Wu, Zhaojin Wen, Andrew Li, Jiantao Jiao, and Kannan Ramchandran

An illustration of the EmbedLLM pipeline. An embedder network is pretrained on sample question-answer pairs from a pool of LLMs, mapping each model to a vector embedding. Downstream applications such as model routing are handled by training an additional linear layer on top of these embeddings.

Installation

  1. Clone this repository:
git clone https://github.com/richardzhuang0412/EmbedLLM.git
cd EmbedLLM
  2. Create and activate the conda environment (remember to change the prefix path in environment.yml to the path of your conda installation):
conda env create -f environment.yml
conda activate embedllm

Dataset

The dataset used in our experiments is available on HuggingFace: https://huggingface.co/datasets/RZ412/EmbedLLM

We provide a script that downloads a subset of the correctness data used to train our model:

cd data_preprocessing
python download_data.py

For the full dataset (including training sets of various sizes and the trained model embeddings), please refer to the HuggingFace page above.
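Alternatively, the data can be pulled directly with the HuggingFace datasets library. A minimal sketch (the available configurations and splits are not guaranteed here; inspect the dataset card first):

# Sketch: load the EmbedLLM data straight from HuggingFace.
from datasets import load_dataset

dataset = load_dataset("RZ412/EmbedLLM")
print(dataset)  # lists the available splits and their columns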

To transform the benchmark questions into embeddings:

cd data_preprocessing
python get_question_embedding_tensor.py
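Conceptually, this step encodes each benchmark question with a text encoder and saves the results as a tensor. A minimal sketch of the idea (the encoder name and file paths below are illustrative assumptions; get_question_embedding_tensor.py is the authoritative implementation):

# Illustrative sketch: encode benchmark questions into an embedding tensor.
# The actual encoder and paths used by the script may differ.
import torch
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder
questions = ["What is the capital of France?"]     # benchmark questions
embeddings = encoder.encode(questions, convert_to_tensor=True)
torch.save(embeddings, "question_embeddings.pt")   # assumed output path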

Usage

KNN Model

To train a KNN model and evaluate its performance on correctness forecasting:

cd algorithm
python knn.py

Key arguments:

  • --input-format: Choose between 'tensor' and 'csv' input formats (default: 'csv')
  • --num-neighbors: Number of neighbors for KNN (default: 131)
  • --save-tensors: Save processed CSV data as tensors for faster future loading
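On one plausible reading, the KNN baseline forecasts a model's correctness on a test question from its recorded correctness on the k nearest training questions in question-embedding space. A minimal scikit-learn sketch under that assumption (the data below is synthetic; knn.py is the reference implementation):

# Sketch: per-model correctness forecasting with KNN over question embeddings.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 32))        # stand-in question embeddings
y_train = rng.integers(0, 2, size=1000)      # stand-in 0/1 correctness of one model
X_test = rng.normal(size=(10, 32))

knn = KNeighborsClassifier(n_neighbors=131)  # matches the --num-neighbors default
knn.fit(X_train, y_train)
pred = knn.predict(X_test)                   # forecasted correctness per test question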

Matrix Factorization Model

To train our Matrix Factorization model and evaluate its performance:

cd algorithm
python mf.py

Key arguments:

  • --embedding-dim: Dimension of model embeddings (default: 232)
  • --alpha: Noise level for regularization (default: 0.05)
  • --batch-size: Training batch size (default: 2048)
  • --num-epochs: Number of training epochs (default: 50)
  • --eval-mode: Evaluation mode, either 'correctness' or 'router' (default: 'correctness')
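At its core, the matrix factorization model learns one embedding per LLM, projects question embeddings into the same space, and scores (model, question) pairs with a dot product trained against binary correctness labels. A minimal PyTorch sketch under those assumptions (the noise regularization shown here is a simplification; mf.py is the reference implementation):

# Sketch: matrix-factorization-style correctness model.
import torch
import torch.nn as nn

class MFModel(nn.Module):
    def __init__(self, num_models, question_dim, embedding_dim=232, alpha=0.05):
        super().__init__()
        self.model_emb = nn.Embedding(num_models, embedding_dim)  # one embedding per LLM
        self.proj = nn.Linear(question_dim, embedding_dim)        # question -> shared space
        self.alpha = alpha                                        # noise level (assumed usage)

    def forward(self, model_ids, question_emb):
        m = self.model_emb(model_ids)
        if self.training:
            m = m + self.alpha * torch.randn_like(m)  # noisy embeddings as a regularizer
        q = self.proj(question_emb)
        return (m * q).sum(dim=-1)  # logit that the model answers the question correctly

# Toy usage with random stand-in data:
model = MFModel(num_models=100, question_dim=384)
logits = model(torch.randint(0, 100, (8,)), torch.randn(8, 384))
loss = nn.BCEWithLogitsLoss()(logits, torch.randint(0, 2, (8,)).float())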

Citation

If you find this code useful for your research, please cite our paper:

@inproceedings{zhuang2025embedllm,
  title={Embed{LLM}: Learning Compact Representations of Large Language Models},
  author={Richard Zhuang and Tianhao Wu and Zhaojin Wen and Andrew Li and Jiantao Jiao and Kannan Ramchandran},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=Fs9EabmQrJ}
}
