Pull requests: ggml-org/llama.cpp
Tmp
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
server
#12070
opened Feb 25, 2025 by orca-zhang
Cache based tokenization for the server input prompts
examples
server
#12067
opened Feb 25, 2025 by vnicolici
ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot
ggml
changes relating to the ggml tensor library for machine learning
#12064
opened Feb 25, 2025 by Vithulep
PR: Refine ggml-qnn backend(QNN, Qualcomm Neural Network,aka Qualcomm AI Engine Direct) for latest ggml,whisper.cpp,llama.cpp
ggml
changes relating to the ggml tensor library for machine learning
script
Script related
testing
Everything test related
#12049
opened Feb 24, 2025 by zhouwg
1 task done
Add GGML_HIP_ROCWMMA_FATTN to enable rocWMMA for FlashAttention
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#12032
opened Feb 22, 2025 by hjc4869
server webui easy config selection
demo
Demonstrate some concept or idea, not intended to be merged
examples
server
#12031
opened Feb 22, 2025 by poulphunter
CUDA: compress-mode size
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#12029
opened Feb 22, 2025 by Green-Sky
vulkan: matmul dequantization improvements
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12015
opened Feb 21, 2025 by netrunnereve
llama : add xcframework build script
devops
improvements to build systems and github actions
examples
#11996
opened Feb 21, 2025 by danbev
deepseek r1 series debug log warning fix and chat template support
testing
Everything test related
#11994
opened Feb 21, 2025 by swordow
CANN: Fix build error with GCC 13
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#11990
opened Feb 21, 2025 by hipudding
HIP: workaround runtime bug in hipGraph support
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#11964
opened Feb 19, 2025 by IMbackK
ggml-cpu: add arm64 CPU feature check for OpenBSD, FreeBSD
ggml
changes relating to the ggml tensor library for machine learning
#11939
opened Feb 18, 2025 by brad0
rpc: check op supporting
ggml
changes relating to the ggml tensor library for machine learning
#11923
opened Feb 17, 2025 by thxCode
Update ggml-backend.cpp
ggml
changes relating to the ggml tensor library for machine learning
#11916
opened Feb 17, 2025 by hackhy
Refactor gguf scripts to improve metadata handling
python
python script changes
#11909
opened Feb 16, 2025 by CISC
sampling: add Top-nσ sampler to llama-server
examples
server
#11896
opened Feb 15, 2025 by CasualAutopsy
Overlap CUDA graph building and processing to minimize GPU idle time and improve tokens per seconds performance.
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#11867
opened Feb 14, 2025 by aendk