resize_token_embeddings in NLLB leading to empty outputs #32948

@bhavitvyamalik

System Info

  • transformers version: 4.42.3
  • Platform: Linux-4.18.0-513.11.1.el8_9.x86_64-x86_64-with-glibc2.28
  • Python version: 3.10.14
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.3
  • Accelerate version: 0.33.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.1+cu121 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: yes

Who can help?

@ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    additional_special_tokens=[f"code_{i}" for i in range(18)],  # code_0 ... code_17
    use_fast=True,
)
# Grow the embedding matrix to match the enlarged vocabulary.
model.resize_token_embeddings(len(tokenizer))

After resizing, generation using an official example:

article = "Şeful ONU spune că nu există o soluţie militară în Siria"
inputs = tokenizer(article, return_tensors="pt")

translated_tokens = model.generate(
    **inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"), max_length=30
)
tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]

Output is: 't t t t t t t t t t'

Expected behavior

Generation should return a sensible German translation after resizing, not a single repeated token. One interesting thing to note: if I add just 2 new tokens instead of 18, generation works fine.
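
One thing that may be worth checking (an assumption, not something confirmed in this report) is whether the freshly initialized embedding rows are what throws generation off. A minimal sketch of that experiment is to overwrite the new rows with the mean of the pre-existing rows. The helper name `mean_init_new_rows` is hypothetical, and the toy `nn.Embedding` below only stands in for the model's real embedding matrix:

```python
import torch
from torch import nn


def mean_init_new_rows(embedding: nn.Embedding, num_old: int) -> None:
    """Overwrite rows [num_old:] with the mean of rows [:num_old], in place."""
    with torch.no_grad():
        weight = embedding.weight
        if num_old < weight.shape[0]:
            # Shape (1, dim) broadcasts over every newly added row.
            weight[num_old:] = weight[:num_old].mean(dim=0, keepdim=True)


# Toy stand-in (hypothetical sizes): 5 "old" rows plus 3 "new" rows, dim 4.
toy = nn.Embedding(8, 4)
mean_init_new_rows(toy, num_old=5)
```

With the real model, the idea would be to record `model.get_input_embeddings().weight.shape[0]` before calling `resize_token_embeddings`, then pass the resized `model.get_input_embeddings()` and that old size to the helper. This assumes the output projection shares the same weight tensor; if it does not, the output embeddings would need the same treatment.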

Labels: Good Difficult Issue, Good Second Issue, bug
