koboldcpp-1.58
- Added a toggle for row split mode with CUDA multigpu. Split mode is changed to layer split by default. If using the command line, add `rowsplit` to `--usecublas` to enable row split mode (see the example launch command after this list). With the GUI launcher, it's a checkbox toggle.
- Multiple bugfixes: fixed the benchmark command, fixed SSL streaming issues, fixed some SSE formatting with OAI endpoints.
- Made context shifting more forgiving when determining eligibility.
- Upgraded CLBlast to the latest version, which should result in a modest prompt processing speedup when using CL.
- Various improvements and bugfixes merged from upstream.
- Updated Kobold Lite with many improvements and new features:
  - New: Integrated 'AI Vision' for images. This uses AI Horde or a local A1111 endpoint to perform image interrogation, allowing the AI to recognize and interpret uploaded or generated images. It should provide an option for multimodality similar to LLaVA, although not as precise. Click on any image and you can enable it within Lite. This functionality is not provided by KCPP itself.
  - New: Importing characters from Pygmalion.Chat is now supported in Lite; select it from Scenarios.
  - Added an option to run Lite in the background. It plays dynamically generated silent audio, which should prevent the browser tab from hibernating.
  - Fixed the printable view, persisted streaming text on error, and fixed instruct timestamps.
  - Added an "Auto" option for idle responses.
  - Allowed importing images into the story from local disk.
  - Multiple minor formatting and bug fixes.
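As a rough example of the new row split toggle, a CUDA multi-GPU launch from the command line might look like the sketch below; the model filename and layer count are placeholders, not values from these notes.

```
# Example only: enable CuBLAS with row split mode (model name and gpulayers value are placeholders)
koboldcpp.exe --usecublas rowsplit --gpulayers 99 --model mymodel.gguf
```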
To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork here.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Then, once loaded, you can connect like this (or use the full KoboldAI client): http://localhost:5001
For more information, be sure to run the program from the command line with the `--help` flag.
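If you prefer to connect programmatically rather than through the browser, a minimal sketch is to send a request to the KoboldAI-compatible API on the same port; the prompt and parameter values below are placeholders.

```
# Placeholder prompt/values: request a completion from the running server
curl http://localhost:5001/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, my name is", "max_length": 50}'
```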