Issues: ggerganov/llama.cpp
metal : increase GPU duty-cycle during inference
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
help wanted
Extra attention is needed
performance
Speed related topics
#9507
opened Sep 16, 2024 by
ggerganov
Refactor: Add more typechecking to GGUFWriter.add_key_value
help wanted
Extra attention is needed
refactoring
Refactoring
#9095
opened Aug 19, 2024 by
mofosyne
Improve cvector-generator
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
#8724
opened Jul 27, 2024 by
ngxson
Feature Request: Installable package via winget
enhancement
New feature or request
help wanted
Extra attention is needed
#8188
opened Jun 28, 2024 by
ngxson
4 tasks done
ggml : add WebGPU backend
help wanted
Extra attention is needed
research 🔬
#7773
opened Jun 5, 2024 by
ggerganov
ggml : add DirectML backend
help wanted
Extra attention is needed
research 🔬
#7772
opened Jun 5, 2024 by
ggerganov
Refactor: Existing examples refactoring opportunities
help wanted
Extra attention is needed
refactoring
Refactoring
#7559
opened May 27, 2024 by
mofosyne
3 tasks
Fix self extend on the server.
examples
help wanted
Extra attention is needed
Review Complexity : Low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
server
#7239
opened May 12, 2024 by
Maximilian-Winter
Draft
Add token healing to main and server
enhancement
New feature or request
examples
help wanted
Extra attention is needed
need feedback
Testing and feedback with results are needed
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
server
#7187
opened May 9, 2024 by
mare5x
Should we add an autolabeler for PR?
devops
improvements to build systems and github actions
enhancement
New feature or request
help wanted
Extra attention is needed
#7174
opened May 9, 2024 by
mofosyne
ggml : add GPU support for Mamba models
enhancement
New feature or request
help wanted
Extra attention is needed
Nvidia GPU
Issues specific to Nvidia GPUs
#6758
opened Apr 19, 2024 by
ggerganov
Inference not running when using a tokenizer with word model
bug
Something isn't working
help wanted
Extra attention is needed
#6717
opened Apr 17, 2024 by
AI-Guru
support --hf-token param in addition of --hf-repo
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
#6613
opened Apr 11, 2024 by
phymbert
server: process prompt fairly across slots
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
server/webui
#6607
opened Apr 11, 2024 by
phymbert
Server: Add prompt processing progress endpoint?
enhancement
New feature or request
help wanted
Extra attention is needed
server/webui
#6586
opened Apr 10, 2024 by
stduhpf
kubernetes example
enhancement
New feature or request
help wanted
Extra attention is needed
kubernetes
Helm & Kubernetes
server/webui
#6546
opened Apr 8, 2024 by
phymbert
common: download from URL, improve parallel download progress status
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
split
GGUF split model sharding
#6537
opened Apr 8, 2024 by
phymbert
Question: How to generate an MPS gputrace
help wanted
Extra attention is needed
high priority
Very important issue
#6506
opened Apr 5, 2024 by
tomsanbear
Feature Request: Task Cancellation on Client Disconnection
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
server/webui
#6421
opened Apr 1, 2024 by
redlion0929
4 tasks done
server: doc: document the --defrag-thold option
documentation
Improvements or additions to documentation
enhancement
New feature or request
help wanted
Extra attention is needed
server/webui
#6293
opened Mar 25, 2024 by
phymbert
server: exit failure if --embedding is set with an incoherent --ubatch-size
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
server/webui
#6263
opened Mar 23, 2024 by
phymbert
split: include the option in ./convert.py and quantize
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
split
GGUF split model sharding
#6260
opened Mar 23, 2024 by
phymbert
split: allow --split-max-size option
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
split
GGUF split model sharding
#6259
opened Mar 23, 2024 by
phymbert
llava-cli: improve llava-cli and the API for using LLaVA
enhancement
New feature or request
good first issue
Good for newcomers
help wanted
Extra attention is needed
llava
LLaVa and multimodal
#6027
opened Mar 12, 2024 by
phymbert
WIP: Add model merge example
demo
Demonstrate some concept or idea, not intended to be merged
help wanted
Extra attention is needed