IST Austria Distributed Algorithms and Systems Lab

54 repositories

EvoPress
Public
Python • 1 • 17 • 0 • 0 • Updated Feb 13, 2025
QuEST
Public
Work in progress.
Jupyter Notebook • MIT License • 1 • 31 • 1 • 0 • Updated Feb 13, 2025
PanzaMail
Public
Python • Apache License 2.0 • 15 • 278 • 5 • 5 • Updated Feb 10, 2025
ScalableMNN
Public
Official Repository for "Scalable Mechanistic Neural Networks" (ICLR 2025)
MIT License • 0 • 0 • 0 • 0 • Updated Feb 6, 2025
gemm-fp8
Public
High Performance FP8 GEMM Kernels for SM89 and later GPUs.
Cuda • MIT License • 0 • 3 • 0 • 0 • Updated Jan 24, 2025
GridSearcher
Public
GridSearcher simplifies running grid searches for machine learning projects in Python, emphasizing parallel execution and GPU scheduling without dependencies on SLURM or other workload managers.
Python • Apache License 2.0 • 0 • 2 • 0 • 0 • Updated Jan 23, 2025
gemm-int8
Public
High Performance Int8 GEMM Kernels for SM80 and later GPUs.
Python • MIT License • 0 • 3 • 0 • 0 • Updated Jan 15, 2025
MicroAdam
Public
This repository contains code for the MicroAdam paper.
Python • Apache License 2.0 • 4 • 16 • 1 • 0 • Updated Dec 14, 2024
llm-foundry
Public
LLM training code for Databricks foundation models
Python • Apache License 2.0 • 543 • 0 • 0 • 1 • Updated Nov 27, 2024
marlin_artifact
Public
Python • 0 • 0 • 0 • 0 • Updated Nov 25, 2024
LDAdam-anonymous
Public
0 • 0 • 0 • 0 • Updated Nov 20, 2024
LDAdam
Public
LDAdam - Adaptive Optimization from Low-Dimensional Gradient Statistics
Python • Apache License 2.0 • 0 • 6 • 0 • 0 • Updated Nov 6, 2024
torch_cgx
Public
PyTorch distributed backend extension with compression support
C++ • GNU Affero General Public License v3.0 • 0 • 16 • 4 • 0 • Updated Oct 17, 2024
ISTA-DASLab-Optimizers
Public
Python • Apache License 2.0 • 0 • 7 • 0 • 0 • Updated Sep 5, 2024
Sparse-Marlin
Public
Boosting 4-bit inference kernels with 2:4 Sparsity
Cuda • Apache License 2.0 • 4 • 64 • 1 • 0 • Updated Sep 4, 2024
marlin
Public
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Topics: kernel, quantization, 4bit, llm
Python • Apache License 2.0 • 57 • 705 • 26 • 5 • Updated Sep 4, 2024
sparsegpt
Public
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
Python • Apache License 2.0 • 99 • 763 • 15 • 1 • Updated Aug 20, 2024
peft-rosa
Public
A fork of the PEFT library, supporting Robust Adaptation (RoSA)
Python • Apache License 2.0 • 3 • 13 • 1 • 0 • Updated Aug 16, 2024
AutoGPTQRoSA
Public
Python • MIT License • 0 • 0 • 0 • 0 • Updated Jun 27, 2024
spops
Public
C++ • Apache License 2.0 • 0 • 7 • 2 • 0 • Updated Jun 20, 2024
Mathador-LM
Public
Code for the EMNLP 2024 paper "Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on LLMs".
Python • Apache License 2.0 • 0 • 8 • 1 • 0 • Updated Jun 18, 2024
SPADE
Public
Code for "SPADE: Sparsity Guided Debugging for Deep Neural Networks".
Jupyter Notebook • 1 • 1 • 1 • 0 • Updated May 25, 2024
QUIK
Public
Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference (EMNLP 2024).
C++ • Apache License 2.0 • 14 • 175 • 5 • 1 • Updated Apr 16, 2024
FastOBQ-
Public
GPTQ with finetuning
0 • 0 • 0 • 0 • Updated Mar 27, 2024
gptq
Public
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Python • Apache License 2.0 • 164 • 2k • 23 • 1 • Updated Mar 27, 2024
RoSA
Public
Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)
Python • Apache License 2.0 • 3 • 38 • 1 • 0 • Updated Feb 13, 2024
SparseFinetuning
Public
Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry
Python • Apache License 2.0 • 6 • 40 • 3 • 0 • Updated Jan 15, 2024
CAP
Public
Repository for Correlation Aware Prune (NeurIPS 2023) source and experimental code
Python • Apache License 2.0 • 1 • 5 • 1 • 0 • Updated Nov 29, 2023
qmoe
Public
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
Python • Apache License 2.0 • 22 • 265 • 3 • 0 • Updated Nov 3, 2023
ZipLM
Public
Code for the NeurIPS 2023 paper: "ZipLM: Inference-Aware Structured Pruning of Language Models".
0 • 2 • 1 • 0 • Updated Oct 20, 2023