Issues: ggerganov/llama.cpp

examples : add configuration presets
#10932 opened Dec 21, 2024 by ggerganov
Open 3
changelog : libllama API
#9289 opened Sep 3, 2024 by ggerganov
Open 1
changelog : llama-server REST API
#9291 opened Sep 3, 2024 by ggerganov
Open 11

Issues list

llama/ggml: add LLM training support
Labels: examples, ggml, Nvidia GPU, Review Complexity : High, testing
#10544 opened Nov 27, 2024 by JohannesGaessler

ggml: skip excess iteration for pair whose vars same element when i2 == i1
Labels: ggml, Review Complexity : High
#9177 opened Aug 25, 2024 by GermanAizek
2 of 4 tasks

ggml : make GeLU faster and more accurate on CPU
Labels: ggml, Review Complexity : High
#8878 opened Aug 5, 2024 by jart

Add support for loongarch backend in sgemm.cpp
Labels: Review Complexity : High
#8726 opened Jul 27, 2024 by Tianzhengshuyuan

llama : support Jamba hybrid Transformer-Mamba models
Labels: android, embeddings, enhancement, examples, ggml, model, need feedback, python, refactoring, Review Complexity : High, server
#7531 opened May 25, 2024 by compilade (Draft)
7 of 17 tasks

Introduce Q8_0 and Q4_0 with Bf16 delta values
Labels: examples, ggml, python, Review Complexity : High, Tensor Encoding Scheme (https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes)
#7497 opened May 23, 2024 by Srihari-mcw

Add token healing to main and server
Labels: enhancement, examples, help wanted, need feedback, Review Complexity : High, server
#7187 opened May 9, 2024 by mare5x

Fix flash attention for ROCm
Labels: enhancement, Review Complexity : High
#7011 opened Apr 30, 2024 by jdecourval (Draft)

support MiniCPM-V-2
Labels: demo, enhancement, examples, python, Review Complexity : High
#6919 opened Apr 26, 2024 by Achazwl

cuda : use amd wave sharing intrinsics for warp_reduce functions
Labels: performance, Review Complexity : High
#6522 opened Apr 7, 2024 by Engininja2

Adding Support for Custom Qwen2moe Architectures with mergekit-qwen2
Labels: model, Review Complexity : High
#6453 opened Apr 3, 2024 by DisOOM (Draft)

Smooth Sampling / Quadratic Sampling support
Labels: generation quality, performance, Review Complexity : High
#6445 opened Apr 2, 2024 by kalomaze

Xeon Phi (Knights Corner) Support.
Labels: enhancement, ggml, performance, Review Complexity : High
#6440 opened Apr 2, 2024 by julialongtin

Fix IQ1_S quantization
Labels: bugfix, Review Complexity : High
#6287 opened Mar 24, 2024 by CISC (Draft)

llama : compute BERT graph with F16 K, V
Labels: demo, Review Complexity : High
#5891 opened Mar 5, 2024 by ggerganov

llama : switch to floating-point token positions
Labels: demo, refactoring, Review Complexity : High
#5679 opened Feb 23, 2024 by ggerganov (Draft)

P-Step Truncation Sampling
Labels: generation quality, need feedback, refactoring, Review Complexity : High
#5675 opened Feb 23, 2024 by p-e-w

[RFC] common, server : add top-a sampler
Labels: enhancement, generation quality, Review Complexity : High
#5612 opened Feb 20, 2024 by Artefact2

cuda: 1.2x faster dequantization kernel
Labels: performance, Review Complexity : High
#2809 opened Aug 26, 2023 by li-plus

Q4_0 scale selection using RMSE
Labels: enhancement, Less than 4 bits, research 🔬, Review Complexity : High
#835 opened Apr 7, 2023 by sw (Draft)