fix: crash on bge-m3 embedding model #8883
Conversation
Signed-off-by: thxCode <thxcode0824@gmail.com>
When vocab.type is SPM, we confirm the linefeed_id by searching for the character, and fall back to special_pad_id if it is not found. The special_*_id values are usually recorded in the metadata; to ensure special_pad_id can be used correctly, we need to obtain it from the metadata first and then perform the confirmation logic above. Signed-off-by: thxCode <thxcode0824@gmail.com>
Signed-off-by: thxCode <thxcode0824@gmail.com>
Btw it's still not faster than transformers, so why use it?
```cpp
llama_vocab::id token_id;
try {
    token_id = llama_byte_to_token_impl(vocab, symbol.text[j]);
} catch (const std::exception & e) {
    // not found, use UNK token instead.
    token_id = vocab.special_unk_id;
}
```
I'm unsure about this change - if this happened, wouldn't it imply a problem with the model / tokenizer? Seems better to find and fix the root of the problem instead of hiding it
Got it. This fix was inspired by https://github.com/ggerganov/llama.cpp/blame/1e6f6554aa11fa10160a5fda689e736c3c34169f/src/llama.cpp#L5560-L5565; maybe my understanding is not correct.
@ggerganov should I close this PR if the last commit is not a reasonable change? thanks.
Crash on testing llama-server embedding with bge-m3:

```
libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found
```

This PR checks the `n_ubatch` value and prints an error on an insufficient `n_batch` value, as #6296 did. It confirms the `linefeed_id`, and uses `special_pad_id` instead if not found; the second commit is to respect the `special_pad_id` of metadata. It uses `special_unk_id` if a byte token is not found. I am unsure if there are other corner cases, please let me know.