Pull requests: ggml-org/llama.cpp
[WebGPU] Check batch_compute_passes before sending passes when not doing GPU profiling
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#23457
opened May 21, 2026 by
nikhilJain17
Contributor
hexagon: apply repl optimization in flash attn softmax as #22993
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
#23455
opened May 21, 2026 by
njsyw1997
Contributor
server: expose prompt token counts in /slots endpoint
examples
server
#23454
opened May 21, 2026 by
ScrewTSW
Generalize Adreno MoE kernels on size M
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#23449
opened May 20, 2026 by
shawngu-quic
Contributor
app: re-inject subcommand when router spawns children under unified binary
examples
server
#23442
opened May 20, 2026 by
ServeurpersoCom
Contributor
Hip fattn expf approx
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#23441
opened May 20, 2026 by
a-huk
MoE disk offloading for Metal
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#23440
opened May 20, 2026 by
kisasexypantera94
ggml/cpu: skip zero-scale blocks in TQ1_0 and TQ2_0 vec_dot kernels
ggml
changes relating to the ggml tensor library for machine learning
#23439
opened May 20, 2026 by
eriirfos-eng
json-schema-to-grammar: expand PCRE shorthands in pattern strings
testing
Everything test related
#23436
opened May 20, 2026 by
iOptimizeThings
doc: fix spec mtp typo
documentation
Improvements or additions to documentation
#23435
opened May 20, 2026 by
ruixiang63
mtp: use inp_out_ids for skipping logit computation
model
Model specific
#23433
opened May 20, 2026 by
am17an
Contributor
ui: simplify network error handling
examples
server/ui
#23431
opened May 20, 2026 by
socram8888
Contributor
removed unnecessary mmproj download when users pass --no-mmproj
#23425
opened May 20, 2026 by
ryan-mangeno
Contributor
ggml : add GGML_OP_COL2IM_1D (CPU + CUDA)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#23424
opened May 20, 2026 by
ServeurpersoCom
Contributor
vulkan: add Flash Attention support for BFloat16 KV cache.
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#23420
opened May 20, 2026 by
0cc4m
Contributor
ggml-zendnn : add Q8_0 quantization support
AMD ZenDNN
Issues related to the AMD ZenDNN backend
ggml
changes relating to the ggml tensor library for machine learning
#23414
opened May 20, 2026 by
z-sachin
Contributor
metal : optimize concat kernel and fix set kernel threads
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
vocab : add Carbon-3B (HybridDNATokenizer) support
python
python script changes
#23410
opened May 20, 2026 by
kashif
fix: skip main_gpu validation when no gpus are available
#23405
opened May 20, 2026 by
Dev-iL
Update documentation with Granite 4.0/4.1
documentation
Improvements or additions to documentation
#23404
opened May 20, 2026 by
jesus-talavera-ibm
Contributor
ui: Improve Git Hooks for UI development
examples
server/ui
#23403
opened May 20, 2026 by
allozaur
Contributor
ggml-cpu:Optimized risc-v cpu nvfp4
ggml
changes relating to the ggml tensor library for machine learning
#23402
opened May 20, 2026 by
ixgbe
Contributor