Closed
Describe the bug
If the process exits while the embedding model is being downloaded or extracted, the cache may be left in a corrupted state, leading to warnings and errors such as the following:
⚠ Local embedding not supported in browser, falling back to remote embedding
"error": {
"message": "The model `BGE-small-en-v1.5` does not exist or you do not have access to it.",
"type": "invalid_request_error",
"param": null,
"code": "model_not_found"
}
The warning above is logged because of an error thrown from `FlagEmbedding.init(...)` in `embedding.ts` (this error is not logged in the console without some code changes):
Error: Tokenizer file not found at /home/ubuntu/eliza/cache/fast-bge-small-en-v1.5/tokenizer.json
at FlagEmbedding.loadTokenizer (/home/ubuntu/eliza/node_modules/fastembed/lib/cjs/fastembed.js:139:19)
at FlagEmbedding.<anonymous> (/home/ubuntu/eliza/node_modules/fastembed/lib/cjs/fastembed.js:124:36)
at Generator.next (<anonymous>)
at fulfilled (/home/ubuntu/eliza/node_modules/fastembed/lib/cjs/fastembed.js:28:58)
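For reference, the kind of code change that surfaces the underlying error could look roughly like the sketch below. It is not the actual code in `embedding.ts`: `initOrFallback` and `formatInitError` are hypothetical names, and `init` stands in for a call such as `FlagEmbedding.init(...)`.

```typescript
// Hypothetical helpers for illustration; not part of eliza or fastembed.

// Turn an unknown thrown value into a loggable string, preferring the stack.
export function formatInitError(err: unknown): string {
    if (err instanceof Error) {
        return err.stack ?? err.message;
    }
    return String(err);
}

// Run `init` (e.g. a wrapper around FlagEmbedding.init) and log the real
// cause before falling back, instead of only printing a generic warning.
export async function initOrFallback<T>(
    init: () => Promise<T>,
    fallback: () => Promise<T>,
    log: (msg: string) => void = console.error,
): Promise<T> {
    try {
        return await init();
    } catch (err) {
        log(`Local embedding init failed, falling back: ${formatInitError(err)}`);
        return fallback();
    }
}
```

With something like this in place, the "Tokenizer file not found" message would appear next to the fallback warning rather than being swallowed.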
Without manually clearing the `cache` folder in which the embedding models are stored, the error will always occur.
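One possible guard, sketched here under assumptions, is to check the model folder before initializing and delete it when it looks incomplete, so that the next start downloads it again. `clearIfIncomplete` is a hypothetical helper, and `tokenizer.json` is the only required file I can confirm from the error above; any other file names would have to be added by someone who knows the full cache layout.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper for illustration; not part of eliza or fastembed.
// Returns true if the model folder existed but was incomplete and was removed.
export function clearIfIncomplete(
    modelDir: string,
    // Assumption: only tokenizer.json is known to be required (from the error above).
    requiredFiles: string[] = ["tokenizer.json"],
): boolean {
    if (!fs.existsSync(modelDir)) {
        return false; // nothing cached yet, a normal download will follow
    }
    const missing = requiredFiles.filter(
        (name) => !fs.existsSync(path.join(modelDir, name)),
    );
    if (missing.length === 0) {
        return false; // cache looks complete, leave it alone
    }
    // Remove the whole folder so the model is fetched again from scratch.
    fs.rmSync(modelDir, { recursive: true, force: true });
    return true;
}
```

Calling `clearIfIncomplete("cache/fast-bge-small-en-v1.5")` before the local embedding is initialized would cover the case described in this issue. It only detects missing files, not files that are present but truncated.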
To Reproduce
I'm not sure how to consistently reproduce the corrupted cache during "normal" operation, but you can simulate it as follows:
- Delete `cache/fast-bge-small-en-v1.5/tokenizer.json`
- Start an agent using local embeddings
Expected behavior
`pnpm clean` should resolve any cache-related issues.
Screenshots
Additional context