Running SD-XL with from_single_file() does not work without accelerate #4213

Closed
@float-trip

Describe the bug

I don't know whether this is specific to from_single_file(). The error occurred on a fresh install via python -m pip install git+https://github.com/huggingface/diffusers.

(sd) root@5fdf95ca5d58:/workspace/sd# python run.py
Traceback (most recent call last):
  File "/workspace/sd/run.py", line 7, in <module>
    pipe = StableDiffusionXLPipeline.from_single_file(
  File "/workspace/miniconda/envs/sd/lib/python3.10/site-packages/diffusers/loaders.py", line 1513, in from_single_file
    pipe = download_from_original_stable_diffusion_ckpt(
  File "/workspace/miniconda/envs/sd/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py", line 1526, in download_from_original_stable_diffusion_ckpt
    text_encoder = convert_ldm_clip_checkpoint(checkpoint, local_files_only=local_files_only)
  File "/workspace/miniconda/envs/sd/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py", line 802, in convert_ldm_clip_checkpoint
    text_model.load_state_dict(text_model_dict)
  File "/workspace/miniconda/envs/sd/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for CLIPTextModel:
	Unexpected key(s) in state_dict: "text_model.embeddings.position_ids".

I saw this line was where the error was occurring, installed accelerate, and the same code then worked.
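
The RuntimeError above is a strict key mismatch: the checkpoint carries "text_model.embeddings.position_ids", which the CLIPTextModel being loaded does not expect. One possible workaround when accelerate is not installed is to drop such keys before calling load_state_dict. The following is a minimal sketch under stated assumptions, not the library's fix: drop_unexpected_keys is a hypothetical helper, and plain strings stand in for tensors so the snippet runs without torch.

```python
def drop_unexpected_keys(state_dict, expected_keys):
    """Return (filtered copy, dropped key names) for a checkpoint dict.

    Hypothetical helper for illustration; not part of diffusers.
    """
    expected = set(expected_keys)
    kept = {k: v for k, v in state_dict.items() if k in expected}
    dropped = sorted(k for k in state_dict if k not in expected)
    return kept, dropped


# Stand-in data: strings instead of tensors, key names taken from the traceback.
checkpoint = {
    "text_model.embeddings.position_ids": "ids",
    "text_model.embeddings.token_embedding.weight": "weights",
}
model_keys = ["text_model.embeddings.token_embedding.weight"]

filtered, dropped = drop_unexpected_keys(checkpoint, model_keys)
print(dropped)
```

With a real model, expected_keys would be text_model.state_dict().keys() and the filtered dict would be passed to text_model.load_state_dict(). PyTorch's load_state_dict(..., strict=False) is another option that tolerates the extra key, at the cost of also hiding any other mismatches.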

Reproduction

from diffusers import StableDiffusionXLPipeline
import torch

pipe = StableDiffusionXLPipeline.from_single_file(
    "sd_xl_base_0.9.safetensors", torch_dtype=torch.float16, use_safetensors=True, variant="fp16"
)
pipe.to("cuda")

prompt = "an astronaut in the jungle"
image = pipe(prompt=prompt).images[0]

Logs

No response

System Info

  • diffusers version: 0.19.0.dev0
  • Platform: Linux-5.15.0-71-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.0.1+cu118 (True)
  • Huggingface_hub version: 0.16.4
  • Transformers version: 4.32.0.dev0
  • Accelerate version: not installed
  • xFormers version: not installed
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help?

@patrickvonplaten


Edit - two minor notes about the documentation:

The documentation currently says to install accelerate, but my understanding is that this is no longer meant to be required. It also doesn't mention that you need to pip install omegaconf.
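
Since the behaviour here depended on which optional packages were present, a quick check before running the repro script may help others compare environments. This is a small sketch using only the standard library; missing_packages is a hypothetical helper name, and the package list simply mirrors the two packages mentioned in this report.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages named in this report; adjust the list for your own setup.
optional = ["accelerate", "omegaconf"]
missing = missing_packages(optional)
if missing:
    print("Not installed: " + ", ".join(missing))
else:
    print("All listed packages are installed.")
```

This only reports whether each package is importable; it does not check versions.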

Labels: bug (Something isn't working)
