[Liger] ImportError: cannot import name '_CONFIG_FOR_DOC' from 'transformers.models.gemma.modeling_gemma' #3480

Closed
@lewtun

Description

Reproduction

Due to a refactor in transformers (huggingface/transformers#33771), the latest stable releases of transformers (4.52.2) and liger-kernel (0.5.9) are currently incompatible: constructing an SFTTrainer fails with an ImportError when use_liger_kernel=True is set in SFTConfig. Command to reproduce:

python trl/scripts/sft.py \
    --model_name_or_path Qwen/Qwen2-0.5B \
    --dataset_name trl-lib/Capybara \
    --learning_rate 2.0e-5 \
    --num_train_epochs 1 \
    --packing \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --gradient_checkpointing \
    --eos_token '<|im_end|>' \
    --logging_steps 25 \
    --eval_strategy steps \
    --eval_steps 100 \
    --output_dir Qwen2-0.5B-SFT \
    --use_liger_kernel

Output:

Traceback (most recent call last):
  File "/fsx/lewis/git/hf/trl/trl/scripts/sft.py", line 149, in <module>
    main(script_args, training_args, model_args)
  File "/fsx/lewis/git/hf/trl/trl/scripts/sft.py", line 117, in main
    trainer = SFTTrainer(
              ^^^^^^^^^^^
  File "/fsx/lewis/git/hf/trl/trl/trainer/sft_trainer.py", line 385, in __init__
    super().__init__(
  File "/fsx/lewis/git/hf/trl/trl-env/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/trl/trl-env/lib/python3.11/site-packages/transformers/trainer.py", line 531, in __init__
    from liger_kernel.transformers import _apply_liger_kernel_to_instance
  File "/fsx/lewis/git/hf/trl/trl-env/lib/python3.11/site-packages/liger_kernel/transformers/__init__.py", line 97, in __getattr__
    module = importlib.import_module("liger_kernel.transformers.monkey_patch")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/admin/home/lewis/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/trl/trl-env/lib/python3.11/site-packages/liger_kernel/transformers/monkey_patch.py", line 16, in <module>
    from liger_kernel.transformers.model.gemma import lce_forward as gemma_lce_forward
  File "/fsx/lewis/git/hf/trl/trl-env/lib/python3.11/site-packages/liger_kernel/transformers/model/gemma.py", line 11, in <module>
    from transformers.models.gemma.modeling_gemma import _CONFIG_FOR_DOC
ImportError: cannot import name '_CONFIG_FOR_DOC' from 'transformers.models.gemma.modeling_gemma' (/fsx/lewis/git/hf/trl/trl-env/lib/python3.11/site-packages/transformers/models/gemma/modeling_gemma.py)
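
The failing import can be checked in isolation, without constructing a trainer. The following is a minimal sketch that assumes only the module and symbol names shown in the traceback above; `has_symbol` is a hypothetical helper written for illustration, not part of transformers or liger-kernel.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its dependencies) is missing or fails to import.
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    # Module and symbol names are copied from the traceback. Per that traceback,
    # the symbol is absent in transformers 4.52.2, so this should report False there.
    # It also reports False when transformers is not installed at all.
    print(has_symbol("transformers.models.gemma.modeling_gemma", "_CONFIG_FOR_DOC"))
```

A False result in an environment where transformers is installed suggests that liger-kernel 0.5.9 will hit the same ImportError when its monkey-patch module is loaded.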

This has been fixed on liger-kernel@main in linkedin/Liger-Kernel#712, so the current workaround is to install liger-kernel from source:

pip install git+https://github.com/linkedin/Liger-Kernel.git

Once the next version of liger-kernel is published, we should bump the liger-kernel lower bound in the trl dependencies. cc @kashif for viz
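
Until the lower bound is bumped, an environment can be checked for the affected release by hand. The sketch below assumes that any published release newer than 0.5.9 will include the fix (the issue does not name that version) and that the installed distribution is registered as `liger-kernel`. The helper names `release_tuple`, `is_newer_release`, and `liger_kernel_needs_upgrade` are hypothetical.

```python
import re
from importlib import metadata

# Last release reported as broken in this issue.
KNOWN_BAD = "0.5.9"


def release_tuple(version: str) -> tuple[int, ...]:
    """Parse the leading dotted-integer segment, e.g. '0.5.10.dev0' -> (0, 5, 10)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_newer_release(installed: str, known_bad: str = KNOWN_BAD) -> bool:
    """Return True if `installed` is strictly newer than `known_bad`."""
    return release_tuple(installed) > release_tuple(known_bad)


def liger_kernel_needs_upgrade() -> bool:
    """Return True if liger-kernel is installed at 0.5.9 or older."""
    try:
        # Distribution name is an assumption based on the pip package name.
        installed = metadata.version("liger-kernel")
    except metadata.PackageNotFoundError:
        # Not installed, so there is nothing to upgrade.
        return False
    return not is_newer_release(installed)


if __name__ == "__main__":
    print("upgrade needed:", liger_kernel_needs_upgrade())
```

This is only a heuristic: a build installed from source may report the same version number as the broken release even though it contains the fix, in which case the import check is the more reliable signal.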

System Info

  • Platform: Linux-5.15.0-1048-aws-x86_64-with-glibc2.31
  • Python version: 3.11.11
  • TRL version: 0.18.0.dev0+a528b9c
  • PyTorch version: 2.6.0
  • accelerator(s): NVIDIA H100 80GB HBM3
  • Transformers version: 4.52.2
  • Accelerate version: 1.7.0.dev0
  • Accelerate config: not found
  • Datasets version: 3.5.0
  • HF Hub version: 0.30.2
  • bitsandbytes version: 0.45.5
  • DeepSpeed version: 0.16.6
  • Diffusers version: 0.32.2
  • Liger-Kernel version: 0.5.9
  • LLM-Blender version: 0.0.2
  • OpenAI version: 1.75.0
  • PEFT version: 0.15.2
  • vLLM version: 0.8.4

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks (no screenshots; more on code blocks)
  • Any traceback provided is complete


Labels: 🏋 SFT (related to SFT), 🐛 bug (something isn't working)
