Hangs on input longer than 8192 characters #4

Closed
@simonw

I tried running this in the https://github.com/simonw/hmb-map folder (to get all the node_modules READMEs):

llm embed-multi jina-readmes \
      -m jina-embeddings-v2-small-en \
      --files . '**/README.md' --store \
      --database test.db

I got this:

Embedding [####################################] 100%
Token indices sequence length is longer than the specified maximum sequence length for this model (8367 > 8192). Running this sequence through the model will result in indexing errors
/Users/simon/.cache/huggingface/modules/transformers_modules/jinaai/jina-bert-implementation/a9db86227f71a0bd7bc05e5dda0359f1e09abb0f/modeling_bert.py:774: UserWarning: Increasing alibi size from 8192 to 8367.
warnings.warn(

The process then seemed to hang. I had to press Ctrl+Z and then run kill %1 to exit it.
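
One possible guard, sketched here as an assumption and not as what the plugin currently does: clip each tokenized input to the 8192-token limit from the warning before it reaches the model. `truncate_token_ids` and `MAX_TOKENS` are hypothetical names, and the list of integers is a stand-in for real tokenizer output.

```python
MAX_TOKENS = 8192  # limit reported in the warning above


def truncate_token_ids(token_ids, max_tokens=MAX_TOKENS):
    """Return at most max_tokens ids, keeping the start of the sequence.

    Hypothetical helper: it only illustrates clipping and is not part of
    llm or of the embedding model's own code.
    """
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    return list(token_ids[:max_tokens])


# Stand-in for tokenizer output: 8367 ids, the length from the warning.
ids = list(range(8367))
clipped = truncate_token_ids(ids)
print(len(ids), "->", len(clipped))
```

Truncation drops the tail of long READMEs, so splitting the text into chunks might be a better fit if the whole file needs to be searchable.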

Labels: bug
