
Releases: Genta-Technology/Kolosal

v0.1.7

15 Mar 18:38
978f73c

What's Changed

  • Fixed the installer to perform a clean install of the application, avoiding odd bugs caused by stale files from earlier installs
  • Fixed application and server crashes on large prompts
  • Added control over the maximum number of tokens processed per iteration frame (see the sketch after this list)
  • Fixed chat names not being allowed to contain certain symbols
  • Allowed renaming chats to duplicate names
  • Fixed a crash when pasting long text into the system prompt
  • Added an acrylic background
  • Refactored the AI model configuration
  • Added a downloaded-models section
  • Sorted the model list alphabetically
  • Added search to the model manager
  • Gemma 3 support!
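
The per-frame token cap above is essentially a budget on how much decoding work is done before the UI loop yields back to rendering. Below is a minimal C++ sketch of that idea; `FrameBudget`, `run_decode_slice`, and the default cap of 32 are illustrative assumptions, not Kolosal's actual API.

```cpp
#include <functional>

// Minimal sketch: each UI frame, decode at most `max_tokens_per_frame`
// tokens so the render loop stays responsive. Names and the default cap
// are assumptions, not Kolosal's real API.
struct FrameBudget {
    int max_tokens_per_frame = 32;  // user-configurable cap (assumed default)
};

// `decode_one` decodes a single token and returns false once generation
// has finished. Call this once per frame from the UI loop.
bool run_decode_slice(const FrameBudget& budget,
                      const std::function<bool()>& decode_one) {
    for (int i = 0; i < budget.max_tokens_per_frame; ++i) {
        if (!decode_one())
            return false;           // generation finished
    }
    return true;                    // budget spent; resume next frame
}

int main() {
    FrameBudget budget;
    budget.max_tokens_per_frame = 8;
    int remaining = 20;             // pretend 20 tokens are left to generate
    while (run_decode_slice(budget, [&] { return --remaining > 0; })) {
        // ... render one UI frame here ...
    }
}
```

A lower cap keeps the interface smooth on slower hardware at the cost of slightly lower throughput; a higher cap does the opposite.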

Full Changelog: v0.1.6...v0.1.7

v0.1.6

08 Mar 18:08
678b4ca
  • Introduced Kolosal AI Server, an easily managed server within the Kolosal AI application
  • Added Phi-4 and Phi-4-mini models
  • Added a continuous batching mechanism for decoding (see the sketch after this list)
  • Added a KV cache management mechanism for batch decoding
  • Added model loading settings within the server tab
  • Added a tab management system
  • Added automatic title generation for each chat history
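
For context on the continuous batching and KV cache management items, here is a minimal C++ sketch of the general technique: finished sequences release their KV cache slot on every decode step, and waiting requests are admitted immediately instead of waiting for the whole batch to drain. The types and names are illustrative, not Kolosal's actual implementation.

```cpp
#include <deque>
#include <vector>

// Hypothetical continuous-batching scheduler (not Kolosal's real code).
struct Sequence {
    int kv_slot = -1;     // index into the shared KV cache
    int remaining = 0;    // tokens still to generate
};

struct Scheduler {
    std::vector<Sequence> active;
    std::deque<Sequence> waiting;
    std::deque<int> free_slots;   // available KV cache slots

    explicit Scheduler(int max_slots) {
        for (int i = 0; i < max_slots; ++i) free_slots.push_back(i);
    }

    // One iteration of the decode loop.
    void step() {
        // Admit waiting requests while KV cache slots are free.
        while (!waiting.empty() && !free_slots.empty()) {
            Sequence s = waiting.front(); waiting.pop_front();
            s.kv_slot = free_slots.front(); free_slots.pop_front();
            active.push_back(s);
        }
        // Batched forward pass: decode one token for every active sequence.
        for (Sequence& s : active) --s.remaining;   // stand-in for real decoding
        // Retire finished sequences and recycle their KV slots.
        for (size_t i = 0; i < active.size();) {
            if (active[i].remaining <= 0) {
                free_slots.push_back(active[i].kv_slot);
                active[i] = active.back();
                active.pop_back();
            } else {
                ++i;
            }
        }
    }
};
```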

v0.1.5

13 Feb 16:35
c330cec

What's Changed

  • Context shifting with StreamingLLM (https://arxiv.org/abs/2309.17453) to enable unlimited generation (see the sketch after this list)
  • Limited the maximum context to 4096 tokens for better memory efficiency and speed
  • Added the ability to stop generation
  • Added a regenerate button
  • Redesigned the progress bar
  • Model loading is now handled asynchronously
  • Added an unload-model button
  • Huge refactor
  • Fixed code block rendering glitches
  • Setting max new tokens to 0 now results in unlimited generation with context shifting
  • Fixed an application crash when deleting a chat
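
The StreamingLLM-style context shifting referenced above keeps a few initial "attention sink" tokens plus a sliding window of the most recent tokens, evicting everything in between once the cache is full. The C++ sketch below only shows the bookkeeping of which cache positions survive; the actual cache shift is engine-specific and is not Kolosal's real code.

```cpp
#include <vector>

// StreamingLLM-style eviction sketch: keep `n_sink` sink tokens plus the
// most recent tokens that fit in a context of `n_ctx`. Not Kolosal's code.
std::vector<int> positions_to_keep(int n_cached, int n_ctx, int n_sink) {
    std::vector<int> keep;
    if (n_cached <= n_ctx) {                 // cache not full yet: keep all
        for (int i = 0; i < n_cached; ++i) keep.push_back(i);
        return keep;
    }
    for (int i = 0; i < n_sink; ++i)         // attention sinks stay forever
        keep.push_back(i);
    int recent = n_ctx - n_sink;             // budget for the sliding window
    for (int i = n_cached - recent; i < n_cached; ++i)
        keep.push_back(i);                   // most recent tokens
    return keep;
}
```

With this release's 4096-token context and, say, the paper's four sink tokens, the cache would retain positions 0–3 plus the most recent 4092 positions, so generation can continue indefinitely.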

Full Changelog: v0.1.4.1...v0.1.5

v0.1.4

01 Feb 14:00
d7f162f
  • Added DeepSeek-R1 support
  • Added Markdown rendering
  • Added a tokens-per-second (TPS) stat
  • Added a cancel-download button
  • Added a delete-model button
  • Fixed a model duplication issue
  • Fixed an engine memory leak
  • Added a thinking UI
  • Added automatic detection of the number of threads to use (see the sketch after this list)
  • Fixed an issue with the last selected model
  • Added a fallback when model loading fails
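
One common way to pick a default thread count automatically is to query the hardware concurrency reported by the OS and leave a little headroom for the UI; the exact heuristic below is an assumption, not necessarily what Kolosal does.

```cpp
#include <algorithm>
#include <thread>

// Pick a default worker-thread count (illustrative heuristic only).
int default_thread_count() {
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) return 4;                 // detection failed; safe fallback
    return std::max(1u, hw - 1u);          // keep one core free for the UI
}
```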

v0.1.3

22 Jan 04:26
ecaa341

New feature

  • Added a persistent KV cache: Kolosal saves the model's KV cache state for each chat history, making the processing of previous chats instant (see the sketch below).
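
Conceptually, a persistent KV cache amounts to storing the serialized cache state per chat and restoring it when that chat is reopened, so earlier messages never have to be re-processed. The C++ sketch below shows that bookkeeping only; the (de)serialization hooks into the inference engine are placeholders, not Kolosal's real API.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-chat KV cache store (illustrative, not Kolosal's code).
class KvCacheStore {
public:
    // Save the current engine state blob under this chat's id.
    void save(const std::string& chat_id, std::vector<uint8_t> state) {
        states_[chat_id] = std::move(state);
    }

    // Restore a chat's state if we have one; returns false on a cache miss,
    // in which case the chat history must be re-processed from scratch.
    bool restore(const std::string& chat_id, std::vector<uint8_t>& out) const {
        auto it = states_.find(chat_id);
        if (it == states_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::unordered_map<std::string, std::vector<uint8_t>> states_;
};
```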

Bug fixes

  • Fixed model parameters not being passed to the model correctly
  • Fixed an application crash when deleting a chat
  • Fixed switching models causing generation to appear in a different chat
  • Fixed AMD GPUs not being detected
  • Fixed EOS not being detected on fine-tuned models using the ChatML format
  • Fixed a force-close issue in the chat feature
  • Fixed a performance issue on GPUs

New models

  • Qwen 2.5 Coder 0.5B–14B
  • Qwen 2.5 14B

v0.1.2

12 Jan 09:33
41aac36

What's Changed

  • Fixed GPU support; NVIDIA/AMD GPUs in your device are now detected and selected automatically (see the sketch after this list)
  • Added clear-chat and delete-chat buttons
  • Fixed the application shortcut (removed the Fn + Left Arrow key shortcut for opening Kolosal)
  • Added Qwen2.5 models, 0.5B–7B
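
On Windows, one common way to detect NVIDIA/AMD GPUs is to enumerate DXGI adapters and check the PCI vendor id; the sketch below illustrates that approach, which is not necessarily the one Kolosal uses.

```cpp
#include <dxgi.h>
#include <string>
#include <vector>
#pragma comment(lib, "dxgi.lib")   // MSVC: link against dxgi

// List NVIDIA (0x10DE) and AMD (0x1002) adapters via DXGI (illustrative).
std::vector<std::wstring> discrete_gpus() {
    std::vector<std::wstring> found;
    IDXGIFactory* factory = nullptr;
    if (FAILED(CreateDXGIFactory(__uuidof(IDXGIFactory), (void**)&factory)))
        return found;
    IDXGIAdapter* adapter = nullptr;
    for (UINT i = 0; factory->EnumAdapters(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC desc;
        if (SUCCEEDED(adapter->GetDesc(&desc))) {
            if (desc.VendorId == 0x10DE || desc.VendorId == 0x1002)
                found.emplace_back(desc.Description);
        }
        adapter->Release();
    }
    factory->Release();
    return found;
}
```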

Full Changelog: v0.1.0...v0.1.2

v0.1.1

09 Jan 06:05
b505e86

What's Changed

  • Added Windows Installer
  • Added Sahabat AI Llama 3 8B
  • Added Sahabat AI Gemma 2 9B
  • Added Gemma 2 2B
  • Added Gemma 2 9B
  • Added Llama 3.1 8B
  • Added 8-bit quantization support (see the sketch after this list)
  • Updated the quantization selection UI to use radio buttons
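
For reference, the general idea behind 8-bit quantization is block-wise symmetric rounding: each block of weights shares one scale, and every weight is stored as a signed 8-bit integer. The block layout below is an assumption for illustration and does not mirror Kolosal's on-disk format.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Illustrative block-wise symmetric int8 quantization (not Kolosal's format).
struct QuantBlock {
    float scale;                 // per-block scale
    std::vector<int8_t> q;       // quantized weights
};

QuantBlock quantize_block(const std::vector<float>& w) {
    float amax = 0.0f;
    for (float x : w) amax = std::max(amax, std::fabs(x));
    QuantBlock b;
    b.scale = amax > 0.0f ? amax / 127.0f : 1.0f;
    b.q.reserve(w.size());
    for (float x : w)
        b.q.push_back(static_cast<int8_t>(std::lround(x / b.scale)));
    return b;
}

// Dequantize back to float: x ≈ scale * q.
std::vector<float> dequantize_block(const QuantBlock& b) {
    std::vector<float> out;
    out.reserve(b.q.size());
    for (int8_t q : b.q) out.push_back(b.scale * q);
    return out;
}
```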

Full Changelog: v0.1...v0.1.1

v0.1.0

08 Jan 13:24
b505e86

Kolosal AI 0.1 marks the very first release of our groundbreaking solution for on-device Large Language Model (LLM) inference. Engineered to run smoothly on both CPUs and a variety of GPUs, Kolosal AI brings powerful AI capabilities to Windows 64-bit systems without relying on external servers or cloud dependencies.

Key Features and Highlights:

  • On-Device Inference: Harness the power of advanced LLMs locally on Windows 64-bit machines, preserving data privacy and reducing latency.
  • Broad Hardware Support: Optimize performance on most common CPUs and GPUs, making Kolosal AI accessible and efficient for a wide range of hardware configurations.
  • Easy Integration: Seamlessly incorporate Kolosal AI’s inference engine into existing applications or workflows with straightforward setup and minimal dependencies.
  • Low Latency & High Throughput: Experience near real-time responses through efficient model optimization and hardware utilization.

With Kolosal AI 0.1, developers and enthusiasts alike can begin exploring the possibilities of large-scale AI right from their desktop—no cloud required. This release is just the start of our mission to empower everyone with the latest advancements in artificial intelligence, all within a user-friendly and hardware-agnostic platform.