Issues: abetlen/llama-cpp-python
Open enhancement issues
Feature request: ability to tokenize a list of strings _or_ keep the tokenizer warm
enhancement · #1763 · opened Sep 25, 2024 by lsorber
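A minimal sketch of the one-string-at-a-time workaround this issue wants to improve on, assuming a placeholder model path; `Llama.tokenize` takes bytes, and each call crosses into llama.cpp separately, which is the overhead a batch API would amortize:

```python
from llama_cpp import Llama

# vocab_only=True loads only the tokenizer vocabulary, not the weights
llm = Llama(model_path="model.gguf", vocab_only=True)

def tokenize_many(texts: list[str]) -> list[list[int]]:
    # One round-trip per string today; the issue asks for a single batched call.
    return [llm.tokenize(t.encode("utf-8"), add_bos=True) for t in texts]

print(tokenize_many(["hello", "world"]))
```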
Add support for T5 (encoder-decoder) models at API level and server
enhancement · #1681 · opened Aug 14, 2024 by fabiomatricardi
Feat: Add support for Llama 3.1 function calling
enhancement · #1618 · opened Jul 24, 2024 by qnixsynapse
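For context, a hedged illustration of the OpenAI-style `tools` interface this request targets; llama-cpp-python already accepts `tools` on `create_chat_completion` via existing handlers such as `chat_format="chatml-function-calling"`, but the model path and tool schema below are placeholders, not a confirmed Llama 3.1 handler:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3.1-8b-instruct.gguf",   # placeholder
    chat_format="chatml-function-calling",     # existing generic handler
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
print(resp["choices"][0]["message"])
```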
Add support for cross-encoders
enhancement · #1611 · opened Jul 20, 2024 by perpendicularai
Pull from Ollama repo functionality
enhancement · #1607 · opened Jul 18, 2024 by ericcurtin
How to log raw token generation?
enhancement · #1546 · opened Jun 21, 2024 by sisi399
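One way to observe raw tokens today is the low-level `Llama.generate` iterator, which yields token ids one at a time; a sketch with a placeholder model path (note `generate` is an open-ended iterator, so the caller must break on EOS or a length cap):

```python
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder

prompt_tokens = llm.tokenize(b"Once upon a time")
count = 0
for token_id in llm.generate(prompt_tokens, temp=0.8):
    # Log the raw id alongside its decoded text piece
    piece = llm.detokenize([token_id]).decode("utf-8", errors="replace")
    print(f"token {token_id} -> {piece!r}")
    count += 1
    if token_id == llm.token_eos() or count >= 32:
        break
```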
Multi-arch support for pre-built CPU wheel
enhancement · #1506 · opened Jun 5, 2024 by abetlen
Improve pre-built wheel CI times by only building llama.cpp once per platform
enhancement · #1505 · opened Jun 5, 2024 by abetlen
Include usage key in create_completion when streaming
enhancement · #1498 · opened May 30, 2024 by zhudotexe
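A sketch of the client-side workaround this forces today: re-tokenizing each streamed chunk to approximate usage (only approximate, since chunk boundaries need not align with token boundaries); the model path is a placeholder:

```python
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder

prompt = "Q: Name the planets. A:"
prompt_tokens = len(llm.tokenize(prompt.encode("utf-8")))

completion_tokens = 0
for chunk in llm.create_completion(prompt, max_tokens=64, stream=True):
    text = chunk["choices"][0]["text"]
    # Approximate count; a server-side "usage" key in the final chunk would be exact
    completion_tokens += len(llm.tokenize(text.encode("utf-8"), add_bos=False))
    print(text, end="")

print(f"\nusage ~ prompt={prompt_tokens}, completion={completion_tokens}")
```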
Please add response_format to create_completion
enhancement · #1478 · opened May 23, 2024 by dtkettler
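For reference, `response_format` is already accepted by `create_chat_completion`; the issue asks for the same on `create_completion`. A minimal sketch of the existing chat-side usage, with placeholder model path and chat format:

```python
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", chat_format="chatml")  # placeholders

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "List three fruits as JSON."}],
    response_format={"type": "json_object"},  # constrains output to valid JSON
)
print(resp["choices"][0]["message"]["content"])
```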
Add support for auto setting n_gpu_layers from gguf and available VRAM size
enhancement · #1456 · opened May 14, 2024 by abetlen
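The common workaround today is `n_gpu_layers=-1`, which offloads every layer regardless of available VRAM; the issue asks for an automatically computed safe value instead. A minimal sketch (placeholder path):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",  # placeholder
    n_gpu_layers=-1,          # -1 = offload all layers; may OOM on small GPUs
)
```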
Add Nous Hermes 2 Pro function calling ChatHandler
enhancement · #1429 · opened May 5, 2024 by stygmate
Is loading of control vectors supported?
enhancement · #1363 · opened Apr 19, 2024 by edwinRNDR
[REQUEST] Accept raw token IDs in `stop` parameter
enhancement · #1360 · opened Apr 18, 2024 by ddh0
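A hedged sketch of the interim workaround: detokenize the desired ids into text pieces for the existing string-based `stop` parameter (only approximate, since the same text can be produced by different token sequences); the model path and token id are placeholders:

```python
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder

stop_ids = [13]  # hypothetical: a newline token id in this model's vocab
stop_strings = [
    llm.detokenize([i]).decode("utf-8", errors="replace") for i in stop_ids
]

out = llm.create_completion("One two three", max_tokens=32, stop=stop_strings)
print(out["choices"][0]["text"])
```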
Hermes 2 Pro Full Chat Format Support
enhancement · #1339 · opened Apr 9, 2024 by abetlen
Allow any format for X-Request-Id
enhancement · #1337 · opened Apr 9, 2024 by ging-dev
Models with multiple chat templates
enhancement · #1336 · opened Apr 8, 2024 by CISC
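For background, the gguf convention currently carries a single template under the `tokenizer.chat_template` metadata key; a minimal sketch of reading it via the `Llama.metadata` dict (placeholder path), which is what multi-template support would generalize:

```python
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # placeholder
template = llm.metadata.get("tokenizer.chat_template")
print(template)
```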
Add command-r support like llama.cpp has
enhancement · #1279 · opened Mar 16, 2024 by rombodawg
Does this lib support contrastive search decoding?
enhancement · #1253 · opened Mar 5, 2024 by congson1293
llama_cpp.server save chat log
enhancement · #1224 · opened Feb 26, 2024 by riverzhou
[Implement Optimization] Skip Inference for Predefined Tokens in Response Formatting
enhancement · #1203 · opened Feb 21, 2024 by Garstig
Improve installation process
enhancement, help wanted · #1178 · opened Feb 12, 2024 by abetlen
Have you thought about adding quantum cache or 8 bit cache?
enhancement · #1161 · opened Feb 5, 2024 by Ph0rk0z
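A hedged sketch of the quantized ("quantum") KV cache this asks about: recent llama-cpp-python builds expose `type_k`/`type_v` on the constructor, mirroring llama.cpp's cache-type options; the `GGML_TYPE_Q8_0` constant and the flash-attention requirement for a quantized V cache are assumptions about the current bindings, and the path is a placeholder:

```python
import llama_cpp
from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",           # placeholder
    flash_attn=True,                   # assumption: quantized V cache needs flash attention
    type_k=llama_cpp.GGML_TYPE_Q8_0,   # 8-bit keys (assumed constant name)
    type_v=llama_cpp.GGML_TYPE_Q8_0,   # 8-bit values
)
```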