Describe the bug
- Flux with torchao int8wo quantization does not work: the run gets stuck (see logs below)
- enable_sequential_cpu_offload does not work: it raises an AttributeError (see traceback below)
Reproduction
Example taken from (merged) #10009:
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, TorchAoConfig

model_id = "black-forest-labs/Flux.1-Dev"
dtype = torch.bfloat16

# Quantize the transformer weights to int8 (weight-only) via torchao
quantization_config = TorchAoConfig("int8wo")
transformer = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=quantization_config,
    torch_dtype=dtype,
)
pipe = FluxPipeline.from_pretrained(
    model_id,
    transformer=transformer,
    torch_dtype=dtype,
)

# Without CPU offload the run gets stuck; with it, an AttributeError is raised (see logs)
# pipe.to("cuda")
# pipe.enable_sequential_cpu_offload()
pipe.vae.enable_tiling()

prompt = "A cat holding a sign that says hello world"
image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
image.save("output.png")
Logs
Stuck at this point (without cpu offload):
(venv) C:\ai1\diffuser_t2i>python FLUX_torchao.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████| 3/3 [00:00<?, ?it/s]
Loading pipeline components...: 29%|████████▊ | 2/7 [00:00<00:00, 5.36it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 6.86it/s]
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:02<00:00, 2.38it/s]

(with cpu offload):
(venv) C:\ai1\diffuser_t2i>python FLUX_torchao.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████| 3/3 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 6.98it/s]
Loading pipeline components...: 29%|████████▊ | 2/7 [00:00<00:01, 2.62it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00, 4.31it/s]
Traceback (most recent call last):
File "C:\ai1\diffuser_t2i\FLUX_torchao.py", line 21, in <module>
pipe.enable_sequential_cpu_offload()
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1179, in enable_sequential_cpu_offload
cpu_offload(model, device, offload_buffers=offload_buffers)
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\big_modeling.py", line 205, in cpu_offload
attach_align_device_hook(
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
attach_align_device_hook(
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
attach_align_device_hook(
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
attach_align_device_hook(
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 509, in attach_align_device_hook
add_hook_to_module(module, hook, append=True)
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 161, in add_hook_to_module
module = hook.init_hook(module)
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 308, in init_hook
set_module_tensor_to_device(module, name, "meta")
File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\utils\modeling.py", line 355, in set_module_tensor_to_device
new_value.layout_tensor,
AttributeError: 'AffineQuantizedTensor' object has no attribute 'layout_tensor'
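
For reference, the AttributeError comes from accelerate's set_module_tensor_to_device reading new_value.layout_tensor on a torchao AffineQuantizedTensor, an attribute the quantized weight apparently does not expose on torchao 0.7.0 (possibly renamed in newer torchao releases). A minimal probe along the following lines (a sketch assuming torchao's public quantize_ / int8_weight_only API; the layer size and the second attribute name checked are illustrative assumptions) can confirm which attribute the int8wo weight actually carries:

import torch
from torchao.quantization import int8_weight_only, quantize_

# Quantize a small linear layer with int8 weight-only quantization,
# the same scheme TorchAoConfig("int8wo") applies to the transformer.
linear = torch.nn.Linear(16, 16).to(torch.bfloat16)
quantize_(linear, int8_weight_only())

w = linear.weight
print(type(w).__name__)  # expected: AffineQuantizedTensor, per the traceback
print("has layout_tensor:", hasattr(w, "layout_tensor"))  # attribute accelerate tries to read
print("has tensor_impl:  ", hasattr(w, "tensor_impl"))    # possible newer name (assumption)

If layout_tensor is indeed missing, the mismatch likely sits between the installed accelerate (1.2.0.dev0) and torchao (0.7.0) versions rather than in the reproduction script itself.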
System Info
Windows 11
(venv) C:\ai1\diffuser_t2i>python --version
Python 3.10.11
(venv) C:\ai1\diffuser_t2i>echo %CUDA_PATH%
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6
(venv) C:\ai1\diffuser_t2i>pip list
Package Version
------------------ ------------
accelerate 1.2.0.dev0
aiofiles 23.2.1
annotated-types 0.7.0
anyio 4.7.0
bitsandbytes 0.45.0
certifi 2024.12.14
charset-normalizer 3.4.1
click 8.1.8
colorama 0.4.6
diffusers 0.33.0.dev0
einops 0.8.0
exceptiongroup 1.2.2
fastapi 0.115.6
ffmpy 0.5.0
filelock 3.16.1
fsspec 2024.12.0
gguf 0.13.0
gradio 5.9.1
gradio_client 1.5.2
h11 0.14.0
httpcore 1.0.7
httpx 0.28.1
huggingface-hub 0.25.2
idna 3.10
imageio 2.36.1
imageio-ffmpeg 0.5.1
importlib_metadata 8.5.0
Jinja2 3.1.5
markdown-it-py 3.0.0
MarkupSafe 2.1.5
mdurl 0.1.2
mpmath 1.3.0
networkx 3.4.2
ninja 1.11.1.3
numpy 2.2.1
opencv-python 4.10.0.84
orjson 3.10.13
packaging 24.2
pandas 2.2.3
pillow 11.1.0
pip 23.0.1
protobuf 5.29.2
psutil 6.1.1
pydantic 2.10.4
pydantic_core 2.27.2
pydub 0.25.1
Pygments 2.18.0
python-dateutil 2.9.0.post0
python-multipart 0.0.20
pytz 2024.2
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
rich 13.9.4
ruff 0.8.6
safehttpx 0.1.6
safetensors 0.5.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 65.5.0
shellingham 1.5.4
six 1.17.0
sniffio 1.3.1
starlette 0.41.3
sympy 1.13.1
tokenizers 0.21.0
tomlkit 0.13.2
torch 2.5.1+cu124
torchao 0.7.0
torchvision 0.20.1+cu124
tqdm 4.67.1
transformers 4.47.1
typer 0.15.1
typing_extensions 4.12.2
tzdata 2024.2
urllib3 2.3.0
uvicorn 0.34.0
websockets 14.1
wheel 0.45.1
zipp 3.21.0
Who can help?
No response