Hey folks, I'm trying to use the deepseek-coder-1.3b-base model with Bumblebee. I was delighted to find that the model, tokenizer, and generation_config all load. But when trying to run inference, I get the following error, which is a bit hard for me to debug:
repo = {:hf, "deepseek-ai/deepseek-coder-1.3b-base"}
{:ok, model_info} = Bumblebee.load_model(repo, backend: {EXLA.Backend, client: :host})
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)
serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)
prompt = "hello world"
Nx.Serving.run(serving, prompt)
** (ErlangError) Erlang error: "Could not decode field on position 1"
(tokenizers 0.4.0) Tokenizers.Native.encoding_transform(#Tokenizers.Encoding<[length: 2, ids: [31702, 1835]]>, [pad: {2, [pad_id: nil, pad_token: "</s>", direction: :left]}])
(elixir 1.15.7) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
(bumblebee 0.4.2) lib/bumblebee/utils/tokenizers.ex:51: Bumblebee.Utils.Tokenizers.apply/4
(nx 0.6.2) lib/nx.ex:4510: Nx.with_default_backend/2
(bumblebee 0.4.2) lib/bumblebee/text/generation.ex:882: anonymous fn/4 in Bumblebee.Text.Generation.generation/4
(nx 0.6.2) lib/nx/serving.ex:1704: anonymous fn/3 in Nx.Serving.handle_preprocessing/2
(telemetry 1.2.1) /Users/jonas/Library/Caches/mix/installs/elixir-1.15.7-erts-14.1.1/f67c01eefcd351fd5b5511a96e61c42d/deps/telemetry/src/telemetry.erl:321: :telemetry.span/3
#cell:776q3ifvc2hexaoavrvlcde7ehfkvusl:7: (file)
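Looking at the trace, the padding step receives pad_id: nil together with pad_token: "</s>", so my guess is that "</s>" simply isn't in the DeepSeek vocabulary, the id lookup comes back nil, and the native padding call then fails to decode it. As a possible workaround, here's a minimal, untested sketch that remaps the special tokens before building the serving (<|end▁of▁sentence|> being the right token is my assumption, not something I've verified):

# Untested sketch: swap the Llama defaults ("</s>") for a token that
# (I assume) actually exists in the DeepSeek vocabulary.
eos = "<|end▁of▁sentence|>"

special_tokens =
  tokenizer.special_tokens
  |> Map.put(:pad, eos)
  |> Map.put(:eos, eos)
  |> Map.put(:sep, eos)

tokenizer = %{tokenizer | special_tokens: special_tokens}

serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)

No idea whether generation would then terminate correctly, but it might at least get past the padding error.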
I'm using Bumblebee 0.4.2.
Here's the model spec:
spec: %Bumblebee.Text.Llama{
  architecture: :for_causal_language_modeling,
  vocab_size: 32256,
  max_positions: 16384,
  hidden_size: 2048,
  intermediate_size: 5504,
  num_blocks: 24,
  num_attention_heads: 16,
  activation: :silu,
  layer_norm_epsilon: 1.0e-6,
  initializer_scale: 0.02,
  output_hidden_states: false,
  output_attentions: false,
  num_labels: 2,
  id_to_label: %{},
  pad_token_id: 0
}
And here's the tokenizer:
%Bumblebee.Text.LlamaTokenizer{
  tokenizer: #Tokenizers.Tokenizer<[
    vocab_size: 32022,
    byte_fallback: false,
    continuing_subword_prefix: nil,
    dropout: nil,
    end_of_word_suffix: nil,
    fuse_unk: false,
    model_type: "bpe",
    unk_token: nil
  ]>,
  special_tokens: %{pad: "</s>", eos: "</s>", sep: "</s>", unk: "<unk>"},
  additional_special_tokens: []
}
It looks like the vocab size in the model spec doesn't match the tokenizer, for example: the spec says vocab_size: 32256 while the tokenizer reports 32022.
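For what it's worth, here's a minimal sketch of how the spec could be loaded separately and vocab_size overridden before loading the model; 32022 is just the value the tokenizer reports, and I'm not sure it's actually the right setting:

# Untested sketch: load the spec on its own, override vocab_size, and
# pass the modified spec back to load_model.
{:ok, spec} = Bumblebee.load_spec(repo)
spec = Bumblebee.configure(spec, vocab_size: 32022)
{:ok, model_info} = Bumblebee.load_model(repo, spec: spec, backend: {EXLA.Backend, client: :host})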
I think the tokenizer uses the correct vocabulary, because I can run this:
Bumblebee.Tokenizer.decode(tokenizer, [32015])
and it correctly returns <|fim▁hole|>, which is a DeepSeek-specific token.
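Relatedly, it should be easy to check whether the "</s>" pad token from the Llama config even maps to an id in this vocab (assuming I remember the tokenizers API correctly):

# If this returns nil, that would explain the pad_id: nil in the
# trace above.
Tokenizers.Tokenizer.token_to_id(tokenizer.tokenizer, "</s>")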
It would be amazing if this model were supported, as deepseek-coder actually seems to be pretty good at Elixir out of the box 🙇
Thank you so much for your help!