Bug: WARNING: The BPE pre-tokenizer was not recognized!

### What happened?

Converting `BAAI/bge-large-zh-v1.5` to GGUF with `convert_hf_to_gguf.py` fails: the script prints the warning below and then raises `NotImplementedError`.

WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:**          There are 2 possible reasons for this:
WARNING:hf-to-gguf:**          - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:**          - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:**          Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref:     https://github.com/ggerganov/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh:  8e62295832751ca1e8f92f2226f403dea30dc5165e448b5bfa05af5340c64ec7
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:

Traceback (most recent call last):
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 4430, in <module>
    main()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 4424, in main
    model_instance.write()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 434, in write
    self.prepare_metadata(vocab_only=False)
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 427, in prepare_metadata
    self.set_vocab()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 2554, in set_vocab
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 515, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/root/llama.cpp-master/convert_hf_to_gguf.py", line 671, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
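
For readers unfamiliar with the `chkhsh` value in the warning, the following is a minimal sketch of how such a check can work. It assumes the hash is a SHA-256 digest of the string form of the token IDs produced from a fixed test text. The function names, the lookup table, and the sample token IDs are hypothetical and are not taken from `convert_hf_to_gguf.py`.

```python
import hashlib


def checksum_of_tokens(token_ids):
    """Return a SHA-256 hex digest of the string form of a token-id list.

    Assumption: this mirrors how a 'chkhsh' fingerprint could be derived
    from the tokenization of a fixed test string.
    """
    return hashlib.sha256(str(token_ids).encode()).hexdigest()


def lookup_pre_tokenizer(token_ids, known):
    """Map a tokenization fingerprint to a pre-tokenizer name.

    `known` is a hypothetical dict of {hash: pre-tokenizer name}.
    An unknown hash raises NotImplementedError, like the traceback above.
    """
    chkhsh = checksum_of_tokens(token_ids)
    name = known.get(chkhsh)
    if name is None:
        raise NotImplementedError(
            f"BPE pre-tokenizer was not recognized (chkhsh: {chkhsh})"
        )
    return name


if __name__ == "__main__":
    sample_ids = [101, 2769, 102]  # made-up token ids for illustration
    table = {checksum_of_tokens(sample_ids): "example-pre"}  # hypothetical entry
    print(lookup_pre_tokenizer(sample_ids, table))
    try:
        lookup_pre_tokenizer([1, 2, 3], table)
    except NotImplementedError as err:
        print(err)
```

Under this assumption, two tokenizers that split the test text differently yield different hashes, so a model whose hash is missing from the table is rejected rather than converted with a possibly wrong pre-tokenizer.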

### Name and Version

python convert_hf_to_gguf.py /data/model/BAAI/bge-large-zh-v1.5/ --outfile text2vec-base-chinese.gguf --model-name bert-bge

### What operating system are you seeing the problem on?

_No response_

### Relevant log output

_No response_

Bug: WARNING: The BPE pre-tokenizer was not recognized! #9927
