Use embeddings instead of keywords for World Info search #223
-
May be an interesting option for World Info, but Memory is always applied, so I don't see the application there. Not sure if sentence embeddings would be the best choice for representing World Info keys, given that those are named entities (usually singular nouns) and sentences are, well, sentences. Maybe sentence embedding models account for this, but if not, it'd probably be best to see if there's a model that can output confidence values for the presence of certain named entities. It miiiiiiiggghht also be possible to utilize model token encodings (if we can strip positional encodings) for zero overhead, but that's pretty out there. Whatever the case, it'd be a cool alternative to keyword matching, but it shouldn't replace it, as it'd probably cause substantial slowdowns with Dynamic WI (since we rescan the whole context after each token is generated) and on lower-end devices that use Kobold as a client for online/distributed services.
-
Hi, I want to suggest using embeddings for semantic search over World Info and possibly Memory. It should be pretty straightforward, and I would do it myself on a fork, but when I took a look at the code I realized how large it is and how long it would take me to wrap my head around it.
Here's some code showing how the embeddings can be created:
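A minimal sketch, assuming the sentence-transformers library and its all-MiniLM-L6-v2 model (any encoder with a similar interface would do; the example entries are made up):

```python
from sentence_transformers import SentenceTransformer

# Small, fast general-purpose encoder producing 384-dim vectors (assumed model choice)
model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical World Info entries
texts = [
    "Aldric is the exiled king of the northern realm.",
    "The Ember Blade is a cursed sword that burns whoever wields it.",
]

# encode() returns one fixed-size vector per input string; normalizing them now
# lets a plain dot product serve as cosine similarity later.
embeddings = model.encode(texts, normalize_embeddings=True)
print(embeddings.shape)  # (2, 384)
```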
These can be stored as .json files and loaded into a pandas DataFrame to perform cosine similarity search. `df` in this case is a pandas DataFrame with 'text' and 'embedding' columns.
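A sketch of that storage and lookup step, again assuming sentence-transformers; the file name and the `search` helper are just illustrative:

```python
import json

import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical World Info entries, embedded as in the previous snippet
texts = [
    "Aldric is the exiled king of the northern realm.",
    "The Ember Blade is a cursed sword that burns whoever wields it.",
]
embeddings = model.encode(texts, normalize_embeddings=True)

# Store the entries together with their embeddings as .json ...
with open("world_info_embeddings.json", "w") as f:
    json.dump([{"text": t, "embedding": e.tolist()} for t, e in zip(texts, embeddings)], f)

# ... and load them back into a DataFrame with 'text' and 'embedding' columns
with open("world_info_embeddings.json") as f:
    df = pd.DataFrame(json.load(f))

def search(query: str, top_k: int = 3) -> pd.DataFrame:
    """Rank World Info entries by cosine similarity to the query text."""
    q = model.encode(query, normalize_embeddings=True)
    matrix = np.vstack(df["embedding"].to_numpy())
    sims = matrix @ q  # vectors are L2-normalized, so the dot product is the cosine similarity
    return df.assign(similarity=sims).nlargest(top_k, "similarity")[["text", "similarity"]]

print(search("Who rules the north?"))
```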