Issues: ggerganov/llama.cpp

changelog : libllama API
#9289 opened Sep 3, 2024 by ggerganov
Open 1
changelog : llama-server REST API
#9291 opened Sep 3, 2024 by ggerganov
Open 2

Issues list

metal : increase GPU duty-cycle during inference [labels: Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)), help wanted, performance]
#9507 opened Sep 16, 2024 by ggerganov
Refactor: Add more typechecking to GGUFWriter.add_key_value [labels: help wanted, refactoring]
#9095 opened Aug 19, 2024 by mofosyne
Improve cvector-generator [labels: enhancement, good first issue, help wanted]
#8724 opened Jul 27, 2024 by ngxson
Feature Request: Installable package via winget [labels: enhancement, help wanted]
#8188 opened Jun 28, 2024 by ngxson
4 tasks done
ggml : add WebGPU backend [labels: help wanted, research 🔬]
#7773 opened Jun 5, 2024 by ggerganov
ggml : add DirectML backend [labels: help wanted, research 🔬]
#7772 opened Jun 5, 2024 by ggerganov
Refactor: Existing examples refactoring opportunities [labels: help wanted, refactoring]
#7559 opened May 27, 2024 by mofosyne
3 tasks
Fix self extend on the server. [labels: examples, help wanted, Review Complexity : Low, server]
#7239 opened May 12, 2024 by Maximilian-Winter Draft
Add token healing to main and server [labels: enhancement, examples, help wanted, need feedback, Review Complexity : High, server]
#7187 opened May 9, 2024 by mare5x
Should we add an autolabeler for PR? [labels: devops, enhancement, help wanted]
#7174 opened May 9, 2024 by mofosyne
ggml : add GPU support for Mamba models [labels: enhancement, help wanted, Nvidia GPU]
#6758 opened Apr 19, 2024 by ggerganov
Inference not running when using a tokenizer with word model [labels: bug, help wanted]
#6717 opened Apr 17, 2024 by AI-Guru
support --hf-token param in addition to --hf-repo [labels: enhancement, good first issue, help wanted]
#6613 opened Apr 11, 2024 by phymbert
server: process prompt fairly across slots [labels: enhancement, good first issue, help wanted, server/webui]
#6607 opened Apr 11, 2024 by phymbert
Server: Add prompt processing progress endpoint? [labels: enhancement, help wanted, server/webui]
#6586 opened Apr 10, 2024 by stduhpf
kubernetes example [labels: enhancement, help wanted, kubernetes, server/webui]
#6546 opened Apr 8, 2024 by phymbert
common: download from URL, improve parallel download progress status [labels: enhancement, good first issue, help wanted, split]
#6537 opened Apr 8, 2024 by phymbert
Question: How to generate an MPS gputrace [labels: help wanted, high priority]
#6506 opened Apr 5, 2024 by tomsanbear
Feature Request: Task Cancellation on Client Disconnection [labels: enhancement, good first issue, help wanted, server/webui]
#6421 opened Apr 1, 2024 by redlion0929
4 tasks done
server: doc: document the --defrag-thold option [labels: documentation, enhancement, help wanted, server/webui]
#6293 opened Mar 25, 2024 by phymbert
split: include the option in ./convert.py and quantize [labels: enhancement, good first issue, help wanted, split]
#6260 opened Mar 23, 2024 by phymbert
split: allow --split-max-size option [labels: enhancement, good first issue, help wanted, split]
#6259 opened Mar 23, 2024 by phymbert
llava-cli: improve llava-cli and the API for using LLaVA [labels: enhancement, good first issue, help wanted, llava]
#6027 opened Mar 12, 2024 by phymbert
WIP: Add model merge example [labels: demo, help wanted]
#5741 opened Feb 26, 2024 by ngxson Draft