
fix: crash on bge-m3 embedding model #8883


Open
thxCode wants to merge 3 commits into master

Conversation

@thxCode (Contributor) commented Aug 6, 2024

fixes a crash when testing llama-server embeddings with bge-m3:

  1. the first commit aligns n_ubatch to n_batch when the model is non-causal, like what embedding: adjust n_ubatch value, print error on insufficient n_batch value #6296 did (a sketch follows below).
  2. the SPM vocabulary checks whether it has the linefeed_id, and uses special_pad_id instead if it is not found. the second commit makes this fallback respect the special_pad_id recorded in the metadata.
  3. tokenization panics with the following if a char byte is not found: libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found. the third commit uses special_unk_id if the byte is not found.

I am unsure whether there are other corner cases; please let me know.
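
For the first commit, a minimal sketch of the idea; params_t, align_ubatch_for_non_causal, and causal_attn are hypothetical stand-in names, not the PR's actual code:

// stand-in for llama.cpp's gpt_params; only the two relevant fields
struct params_t {
    int n_batch  = 2048;   // logical batch size
    int n_ubatch = 512;    // physical (micro-)batch size
};

// non-causal (embedding) models have to process a whole batch in one
// micro-batch, otherwise llama_decode rejects the batch; #6296 avoided
// this in the embedding example by forcing n_ubatch = n_batch
void align_ubatch_for_non_causal(params_t & params, bool causal_attn) {
    if (!causal_attn) {
        params.n_ubatch = params.n_batch;
    }
}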

thxCode added 3 commits August 6, 2024 17:17
Signed-off-by: thxCode <thxcode0824@gmail.com>
when vocab.type is SPM, we confirm the linefeed_id by searching for the
char, and use special_pad_id instead if it is not found.

the special_*_id values are usually recorded in metadata; to ensure the
special_pad_id can be used correctly, we need to obtain it from metadata
first and then perform the confirmation logic above.

Signed-off-by: thxCode <thxcode0824@gmail.com>
Signed-off-by: thxCode <thxcode0824@gmail.com>
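
The ordering that commit message describes can be sketched as follows; vocab_t, byte_to_token, and init_spm_linefeed are hypothetical stand-ins for llama_vocab and llama_byte_to_token_impl, not the PR's actual code:

#include <stdexcept>
#include <unordered_map>

// stand-in for llama_vocab; only the fields relevant to the fallback
struct vocab_t {
    std::unordered_map<char, int> byte_tokens;  // byte -> token id
    int linefeed_id    = -1;
    int special_pad_id = -1;
};

// throws std::out_of_range when the byte has no token, the same way
// unordered_map::at does inside the real byte-to-token lookup
int byte_to_token(const vocab_t & vocab, char c) {
    return vocab.byte_tokens.at(c);
}

// must run *after* special_pad_id has been read from the GGUF metadata;
// loading the metadata first is exactly what the second commit ensures
void init_spm_linefeed(vocab_t & vocab) {
    try {
        vocab.linefeed_id = byte_to_token(vocab, '\n');
    } catch (const std::exception &) {
        // '\n' is not in the vocabulary (the bge-m3 case):
        // fall back to the pad token
        vocab.linefeed_id = vocab.special_pad_id;
    }
}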
@ExtReMLapin (Contributor) commented:

Btw it's still not faster than transformers, so why use it?

Comment on lines +282 to +288
llama_vocab::id token_id;
try {
    token_id = llama_byte_to_token_impl(vocab, symbol.text[j]);
} catch (const std::exception & e) {
    // not found, use UNK token instead
    token_id = vocab.special_unk_id;
}
Member commented:

I'm unsure about this change - if this happened, wouldn't it imply a problem with the model / tokenizer? Seems better to find and fix the root of the problem instead of hiding it

@thxCode (Contributor, Author) replied Aug 6, 2024:

got it. this fix is inspired by https://github.com/ggerganov/llama.cpp/blame/1e6f6554aa11fa10160a5fda689e736c3c34169f/src/llama.cpp#L5560-L5565; maybe my understanding there is not correct.

@thxCode (Contributor, Author) commented:

@ggerganov should I close this PR if the last commit is not a reasonable change? thanks.

@mofosyne added the Review Complexity : Low label on Aug 8, 2024