Corrupt model cache #1506

Closed
@timolegros

Describe the bug

If the process exits while the embedding model is being downloaded or extracted, the cache can be left in a corrupted state, which leads to warnings and errors such as the following:

⚠ Local embedding not supported in browser, falling back to remote embedding

"error": {
    "message": "The model `BGE-small-en-v1.5` does not exist or you do not have access to it.",
    "type": "invalid_request_error",
    "param": null,
    "code": "model_not_found"
}

The warning above is logged because FlagEmbedding.init(...) in embedding.ts throws the following error, which does not appear in the console unless the code is changed to log it:

Error: Tokenizer file not found at /home/ubuntu/eliza/cache/fast-bge-small-en-v1.5/tokenizer.json
    at FlagEmbedding.loadTokenizer (/home/ubuntu/eliza/node_modules/fastembed/lib/cjs/fastembed.js:139:19)
    at FlagEmbedding.<anonymous> (/home/ubuntu/eliza/node_modules/fastembed/lib/cjs/fastembed.js:124:36)
    at Generator.next (<anonymous>)
    at fulfilled (/home/ubuntu/eliza/node_modules/fastembed/lib/cjs/fastembed.js:28:58)
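
Since the underlying error is swallowed before the fallback warning is printed, one option is to log it at the point where the fallback happens. The following is a minimal sketch only: initWithLoggedFallback and its parameters are hypothetical names, not code from embedding.ts, and the local and remote initializers are passed in as plain functions so the example makes no assumptions about the fastembed API.

```typescript
// Hypothetical helper, not taken from embedding.ts: runs the local
// initializer and, if it throws, logs the cause before falling back.
async function initWithLoggedFallback<T>(
    initLocal: () => Promise<T>,
    initRemote: () => Promise<T>,
    log: (message: string) => void = console.warn
): Promise<T> {
    try {
        return await initLocal();
    } catch (error) {
        // Surface the real reason (e.g. "Tokenizer file not found at ...")
        // instead of only reporting that a fallback took place.
        const reason = error instanceof Error ? error.message : String(error);
        log(`Local embedding init failed, falling back to remote: ${reason}`);
        return initRemote();
    }
}
```

With something like this in place, the "Tokenizer file not found" message would show up next to the fallback warning, which makes a corrupt cache much easier to spot.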

The error recurs on every start until the cache folder in which the embedding models are stored is cleared manually.
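
A possible mitigation is to check the cached model directory before initializing and remove it when it is incomplete. The sketch below is illustrative only: ensureModelCacheIntact and REQUIRED_FILES are hypothetical names, tokenizer.json is the only required file confirmed by the stack trace above, and the approach assumes the library downloads the model again when its directory is absent.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumption: files that a complete model directory must contain.
// Only tokenizer.json is confirmed by the stack trace in this issue.
const REQUIRED_FILES = ["tokenizer.json"];

/**
 * Hypothetical check to run before initializing local embeddings.
 * Returns true when the directory is absent (nothing cached yet) or complete.
 * When a required file is missing, removes the whole directory and returns
 * false, so that the next initialization starts from a clean state.
 */
function ensureModelCacheIntact(
    modelDir: string,
    requiredFiles: string[] = REQUIRED_FILES
): boolean {
    if (!fs.existsSync(modelDir)) {
        return true;
    }
    const missing = requiredFiles.filter(
        (file) => !fs.existsSync(path.join(modelDir, file))
    );
    if (missing.length === 0) {
        return true;
    }
    // Partial download or extraction: discard it rather than reuse it.
    fs.rmSync(modelDir, { recursive: true, force: true });
    return false;
}
```

For the case in this report, the function would be called with the path to cache/fast-bge-small-en-v1.5 before the local embedding model is initialized.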

To Reproduce

I'm not sure how to reproduce the corrupted cache consistently during "normal" operation, but the same failure can be triggered as follows:

  1. Delete cache/fast-bge-small-en-v1.5/tokenizer.json
  2. Start an agent using local embeddings

Expected behavior

pnpm clean should resolve any cache-related issues.

Labels

bug: Something isn't working
