Issues: ggerganov/llama.cpp
Add Intel Advanced Matrix Extensions (AMX) support to ggml
build
Compilation issues
ggml
changes relating to the ggml tensor library for machine learning
performance
Speed related topics
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7707
opened Jun 3, 2024 by
mingfeima
Introduce ggml_syncthreads()
performance
Speed related topics
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7455
opened May 22, 2024 by
jart
sched : support async weight copy
performance
Speed related topics
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
The MLX Challenge
performance
Speed related topics
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
cuda : use amd wave sharing intrinsics for warp_reduce functions
performance
Speed related topics
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
#6522
opened Apr 7, 2024 by
Engininja2
Smooth Sampling / Quadratic Sampling support
generation quality
Quality of model output
performance
Speed related topics
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
#6445
opened Apr 2, 2024 by
kalomaze
Xeon Phi (Knights Corner) Support.
enhancement
New feature or request
ggml
changes relating to the ggml tensor library for machine learning
performance
Speed related topics
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
#6440
opened Apr 2, 2024 by
julialongtin
Fuse matrix multiplication + SiLU
performance
Speed related topics
refactoring
Refactoring
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#5413
opened Feb 8, 2024 by
JohannesGaessler
Draft
llama : speed-up grammar sampling
performance
Speed related topics
refactoring
Refactoring
#4218
opened Nov 25, 2023 by
ggerganov
metal : compile-time kernel args and params
performance
Speed related topics
research 🔬
#4085
opened Nov 15, 2023 by
ggerganov
metal: template for mat-vec multiplication kernels
performance
Speed related topics
#2891
opened Aug 30, 2023 by
lshzh-ww
cuda: 1.2x faster dequantization kernel
performance
Speed related topics
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
#2809
opened Aug 26, 2023 by
li-plus
Support CoreML like whisper.cpp?
help wanted
Extra attention is needed
macos
Issues specific to macOS
performance
Speed related topics
#1714
opened Jun 6, 2023 by
realcarlos