4 changes: 4 additions & 0 deletions README.md
@@ -1,3 +1,7 @@
# koboldcpp-experimental

KoboldCpp-experimental is a slightly extended KoboldCpp with [custom](experimental/README.md) functionality.

# koboldcpp

KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original **KoboldAI**. It's a single self-contained distributable from Concedo, that builds off llama.cpp, and adds a versatile **KoboldAI API endpoint**, additional format support, Stable Diffusion image generation, speech-to-text, backward compatibility, as well as a fancy UI with persistent stories, editing tools, save formats, memory, world info, author's note, characters, scenarios and everything KoboldAI and KoboldAI Lite have to offer.
101 changes: 101 additions & 0 deletions experimental/README.md
@@ -0,0 +1,101 @@
# KoboldCpp Experimental

## Emphasisfsm

A common problem during text generation is misplaced emphasis characters:

*looks at you "why* this is here?"

while it should be

*looks at you* "why this is here?"

Emphasisfsm solves this with a simple (and fast) grammar expressed as a deterministic finite state machine.

![Letters](emphasis-dfsm-letters.png)

Single letters are not a practical alphabet for LLMs, as a token often contains more than one letter.

Emphasisfsm uses LLM tokens as its alphabet, which makes it very fast.

![Tokens](emphasis-dfsm-tokens.png)

These are only the most obvious examples. There are more, e.g. ' "***' is a valid token for the transition from quot to star, and '*this' is valid for star->none or none->star.
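
A token-level transition can be derived by walking the token's characters through the letter-level machine. The following is a minimal sketch of that idea, not the extension's actual code: the function name `step`, the string `"none"` for the neutral state, and the use of the emphasis character itself as the state name (`"` for quot, `*` for star) are all assumptions made for illustration.

```python
from typing import Optional

NONE = "none"  # hypothetical name for the state outside any emphasis


def step(state: str, token: str, marks: str = '"*') -> Optional[str]:
    """Return the state reached after reading `token`, or None if forbidden.

    States are NONE or one of the emphasis characters in `marks`
    ('"' plays the role of quot, '*' the role of star).
    """
    for ch in token:
        if ch not in marks:
            continue            # ordinary characters never change the state
        if state == NONE:
            state = ch          # opening an emphasis
        elif state == ch:
            state = NONE        # closing the emphasis that is currently open
        else:
            return None         # a different emphasis character inside an open one
    return state


print(step('"', ' "***'))    # quot -> star
print(step(NONE, '*this'))   # none -> star
print(step('*', '*this'))    # star -> none
print(step('"', '*this'))    # forbidden while inside quot: None
```

Under these assumptions a multi-character token is simply a shortcut for several letter-level transitions, which is why one token can move the machine directly from quot to star.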

### Usage

To support a variety of GUIs, this extension shamelessly exploits the GBNF grammar string. *This is not a proper GBNF grammar; it only uses the field because it is easily editable in most GUIs.*

![KoboldCpp hack](gbnf-kob.png) ![SillyTavern hack](gbnf-st.png)


emphasisfsm "_bias_[D][_emph1_][,_emphn_]"

With an empty string, emphasisfsm is disabled. The easiest way to enable it is

emphasisfsm "-20"

which defaults to

emphasisfsm "-20 \" \" * *"

(no debug, only * and " are considered)


### How it works

The main loop is extended from:

- retrieve logits
- sample logits, select token (top_k and friends)
- output token

to

- retrieve logits
- ban forbidden emphasisfsm transitions from the current state (by setting their logits low)
- sample logits, select token (top_k and friends)
- emphasisfsm transition on the selected token
- output token
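
The extended loop above can be sketched as follows. This is an illustrative toy, not the KoboldCpp implementation: the names `step`, `apply_ban` and `generate` are hypothetical, the "model" is a hard-coded list of logits, greedy argmax stands in for real sampling, and adding the bias to a logit (rather than overwriting it) is an assumption.

```python
from typing import List, Optional, Tuple

NONE = "none"  # hypothetical name for the state outside any emphasis


def step(state: str, token: str, marks: str = '"*') -> Optional[str]:
    """State after reading `token`, or None if the transition is forbidden."""
    for ch in token:
        if ch not in marks:
            continue
        if state == NONE:
            state = ch
        elif state == ch:
            state = NONE
        else:
            return None
    return state


def apply_ban(logits: List[float], vocab: List[str], state: str,
              bias: float = -20.0) -> List[float]:
    """Push down the logits of tokens that are forbidden from `state`."""
    return [l + bias if step(state, tok) is None else l
            for l, tok in zip(logits, vocab)]


def generate(vocab: List[str], logits_per_step: List[List[float]],
             bias: float = -20.0) -> Tuple[str, str]:
    state, out = NONE, []
    for logits in logits_per_step:                         # retrieve logits
        biased = apply_ban(logits, vocab, state, bias)     # ban forbidden transitions
        idx = max(range(len(vocab)), key=biased.__getitem__)  # greedy stand-in for sampling
        state = step(state, vocab[idx])                    # emphasisfsm transition
        out.append(vocab[idx])                             # output token
    return "".join(out), state


vocab = ["*", "looks", ' "', "why", '"']
logits_per_step = [
    [5, 0, 0, 0, 0],   # "*"
    [0, 5, 0, 0, 0],   # "looks"
    [4, 0, 5, 0, 0],   # model prefers ' "', which is forbidden inside *...*
    [0, 0, 5, 0, 0],   # ' "'
    [0, 0, 0, 5, 0],   # "why"
    [3, 0, 0, 0, 2],   # model prefers "*", which is forbidden inside "..."
]
print(generate(vocab, logits_per_step))
```

In steps three and six the unbiased logits would have produced a misplaced emphasis character; with the ban applied, the toy run closes each emphasis before another one opens.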


### TODO

- detect UTF-8 letters split over more than one token (I don't plan to support them, but a warning would be nice)
- ban generation of end tokens inside an emphasis, forcing the LLM to finish its 'thought'?


### Meta-Llama-3-8B stats for default (" *) emphasisfsm

empcats_gen: ban bias: -17.500000
empcats_gen: emphasis indifferent tokens: 126802
empcats_gen: tokens for emphasis '"' '"': 1137
empcats_gen: tokens for emphasis '*' '*': 315
empcats_gen: always banned tokens: 2
empcats_gen: total tokens: 128256
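
The four counts above add up to the total vocabulary size. One way to read the "always banned" category is as tokens that have no valid transition from any state; the sketch below checks that reading for the default emphasis characters. The helper names `step` and `always_banned` are hypothetical, and the rule used by `empcats_gen` itself may differ.

```python
from typing import Optional

NONE = "none"  # hypothetical name for the state outside any emphasis


def step(state: str, token: str, marks: str = '"*') -> Optional[str]:
    """State after reading `token`, or None if the transition is forbidden."""
    for ch in token:
        if ch not in marks:
            continue
        if state == NONE:
            state = ch
        elif state == ch:
            state = NONE
        else:
            return None
    return state


def always_banned(token: str, marks: str = '"*') -> bool:
    """True if `token` is forbidden from every state (none, quot, star)."""
    states = [NONE, *marks]
    return all(step(s, token, marks) is None for s in states)


# The categories reported in the stats partition the vocabulary.
assert 126802 + 1137 + 315 + 2 == 128256

print(always_banned(' "*"'))   # mixes both characters so that no state accepts it
print(always_banned('"*'))     # still usable from quot (quot -> star)
print(always_banned('hello'))  # emphasis-indifferent token
```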

Always banned tokens are:

<pre>' "*"', ' "*"'</pre>

### Tests

emphasisfsm "-20 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 0 0"

This forces every digit to act as an emphasis (quotation) character, so an example text completion looks like:


```
Give me math vector of random numbers.Here is a 3-dimensional math vector with random numbers:


Vector:
[
3.445,
-5.117,
7.992
]
```

No other digit appears between a pair of 3s, a pair of 4s, a pair of 5s, and so on.
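
This property can be checked mechanically by treating each digit as its own emphasis character and walking the completion one character at a time. The checker below is a hypothetical helper written for this README (the name `final_state` is not part of the extension); it returns `None` as soon as a digit appears inside another digit's open pair.

```python
from typing import Optional

DIGITS = "0123456789"


def final_state(text: str, marks: str = DIGITS) -> Optional[str]:
    """Return the state after reading `text`, or None if the rule is broken."""
    state = "none"
    for ch in text:
        if ch not in marks:
            continue
        if state == "none":
            state = ch          # this digit opens a pair
        elif state == ch:
            state = "none"      # the same digit closes the pair
        else:
            return None         # a different digit inside an open pair
    return state


sample = (
    "Here is a 3-dimensional math vector with random numbers:\n\n"
    "Vector:\n[\n3.445,\n-5.117,\n7.992\n]"
)
print(final_state(sample))   # '2': every pair is respected, the last 2 is still open
print(final_state("3.45"))   # None: 4 appears before the 3 is closed
```

Note that the 3 in "3-dimensional" opens the pair that the 3 in "3.445" closes, which is why the first vector component is allowed to start with 3.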

18 changes: 18 additions & 0 deletions experimental/emphasis-dfsm-letters.dot
@@ -0,0 +1,18 @@
digraph Letters {

rankdir=LR;
node [shape = circle, pos="0,0!"];
quot [shape = circle, pos="2,1!"];
star [shape = circle, pos="2,-1!"];
none -> none [label = "not\n (\" or *)"];
none -> quot [label = "\""];
none -> star [label = "*"];

quot -> none [label = "\""];
star -> none [label = "*"];

star -> star [label = "not\n (\" or *)"]
quot -> quot [label = "not\n (\" or *)"]


}
Binary file added experimental/emphasis-dfsm-letters.png
20 changes: 20 additions & 0 deletions experimental/emphasis-dfsm-tokens.dot
@@ -0,0 +1,20 @@
digraph Tokens {

node [shape = circle, pos="0,0!"];
quot [shape = circle, pos="2,1!"];
star [shape = circle, pos="2,-1!"];
none -> none [label = "not\n (\" or *)"];
none -> quot [label = "\""];
none -> star [label = "*"];

quot -> none [label = "\""];
star -> none [label = "*"];

star -> star [label = "not\n (\" or *)"]
quot -> quot [label = "not\n (\" or *)"]


star -> quot [label = "*\""]
quot -> star [label = "\"*"]

}
Binary file added experimental/emphasis-dfsm-tokens.png