koboldcpp-1.70.1
mom: we have ChatGPT at home edition
- Updated Kobold Lite:
- Introducing Corpo Mode: a new beginner-friendly UI theme that closely emulates the ChatGPT look and feel, providing a clean, simple and minimalistic interface. It has a limited feature set compared to the other UI themes, but should feel very familiar and intuitive for new users. Now available for instruct mode!
- Settings Menu Rework: The settings menu has also been completely overhauled into 4 distinct panels, and should feel a lot less cramped now, especially on desktop.
- Sampler Presets and Instruct Presets have been updated and modernized.
- Added support for importing character cards from aicharactercards.com
- Added a copy button for code blocks
- Added support for dedicated System Tag and System Prompt (you are still encouraged to use the Memory feature instead)
- Improved accessibility, keyboard tab navigation and screen reader support
- NEW: Official releases now provide Windows binaries with AVX1 CUDA support included; download `koboldcpp_oldcpu.exe`.
- NEW: DRY dynamic N-gram anti-repetition sampler support has been added (credits @pi6am); see the API sketch below the changelog.
- Added `--unpack`, a new self-extraction feature that allows KoboldCpp binary releases to be unpacked into an empty directory, allowing easy modification and access to the files and contents embedded inside the PyInstaller. It can also be used from the GUI launcher.
- Fixed a Vulkan regression affecting Q4_K_S Mistral models when offloading to GPU (thanks @0cc4m).
- Experimental support for the OpenAI tools and function calling API (credits @teddybear082); a hedged request example appears after the changelog.
- Added a workaround for Deepseek crashing due to unicode decoding issues.
- `--chatcompletionsadapter` can now select one of the included pre-bundled templates by filename, e.g. `Llama-3.json`; the pre-bundled templates have also been updated for correctness (thanks @xzuyn).
- Default `--contextsize` is finally increased to 4096, and the default Chat Completions API output length is also increased.
- Merged fixes and improvements from upstream, including multiple Gemma fixes.
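
The DRY sampler is exposed through the KoboldAI-compatible generate endpoint. Here is a minimal Python sketch; the `dry_*` parameter names and example values are assumptions based on the DRY sampler's usual knobs, so verify them against your build's `--help` output or API docs.

```python
import requests

# Minimal sketch: enable DRY anti-repetition on the KoboldAI-compatible API.
# The dry_* names below are assumed from the DRY sampler's standard knobs;
# confirm against your KoboldCpp build before relying on them.
payload = {
    "prompt": "Once upon a time",
    "max_length": 120,
    "dry_multiplier": 0.8,        # 0 disables DRY; > 0 scales the penalty
    "dry_base": 1.75,             # growth base for longer repeated n-grams
    "dry_allowed_length": 2,      # repeats up to this length go unpenalized
    "dry_sequence_breakers": ["\n", ":", "\"", "*"],  # reset n-gram matching
}
resp = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=120)
print(resp.json()["results"][0]["text"])
```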
1.70.1: Fixed a bug with `--unpack` not including the .py files and the oldcpu binary missing some options, and swapped the cu11 Linux binary to not use AVX2 for best compatibility. The cu12 Linux binary still uses AVX2 for maximum performance.
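
Since tool support is experimental, the sketch below assumes only the standard OpenAI Chat Completions request shape on the `/v1/chat/completions` endpoint; the `get_weather` tool is purely illustrative.

```python
import requests

# Hedged sketch of the experimental OpenAI-compatible function calling API.
# The request follows the OpenAI Chat Completions spec; the response handling
# is an assumption, since support is experimental.
payload = {
    "model": "koboldcpp",  # informational; the loaded model is used
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}
resp = requests.post("http://localhost:5001/v1/chat/completions", json=payload, timeout=120)
message = resp.json()["choices"][0]["message"]
print(message.get("tool_calls") or message.get("content"))
```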
To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster).
If you're using Linux, select the appropriate Linux binary file instead (not exe).
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Once loaded, you can connect like this (or use the full KoboldAI client):
http://localhost:5001
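
For a quick smoke test from Python (assuming the default port 5001), the KoboldAI-compatible API exposes simple read-only endpoints:

```python
import requests

# Connectivity check against a running KoboldCpp instance.
base = "http://localhost:5001"
print(requests.get(f"{base}/api/v1/model", timeout=10).json())      # loaded model
print(requests.get(f"{base}/api/extra/version", timeout=10).json()) # backend version
```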
For more information, be sure to run the program from the command line with the `--help` flag.