Error no file named pytorch_model.bin, model.safetensors found in directory Lightricks/LTX-Video. #10321

Closed
@nitinmukesh

Describe the bug

(venv) C:\ai1\LTX-Video>python inference.py
Traceback (most recent call last):
  File "C:\ai1\LTX-Video\inference.py", line 23, in <module>
    text_encoder = T5EncoderModel.from_pretrained(
  File "C:\ai1\LTX-Video\venv\lib\site-packages\transformers\modeling_utils.py", line 3779, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory Lightricks/LTX-Video.

Reproduction

Install diffusers from source and run the single-file loading example from the LTX-Video pipeline documentation:
https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video

Logs

C:\ai1\LTX-Video\Lightricks>tree /F
Folder PATH listing for volume Windows-SSD
Volume serial number is CE9F-A6AE
C:.
└───LTX-Video
    │   ltx-video-2b-v0.9.1.safetensors
    │   model_index.json
    │
    ├───text_encoder
    │       config.json
    │       model-00001-of-00004.safetensors
    │       model-00002-of-00004.safetensors
    │       model-00003-of-00004.safetensors
    │       model-00004-of-00004.safetensors
    │
    ├───tokenizer
    │       added_tokens.json
    │       special_tokens_map.json
    │       spiece.model
    │       tokenizer_config.json
    │
    ├───transformer
    │       config.json
    │       diffusion_pytorch_model-00001-of-00002.safetensors
    │       diffusion_pytorch_model-00002-of-00002.safetensors
    │       diffusion_pytorch_model.safetensors.index.json
    │
    └───vae
            config.json
            diffusion_pytorch_model.safetensors
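
One thing worth checking in the listing above: `text_encoder` holds four `model-0000N-of-00004.safetensors` shards but no `model.safetensors.index.json`, while `transformer` does ship its index file. Because a `Lightricks/LTX-Video` folder exists relative to the working directory, `from_pretrained` treats the name as a local path rather than a Hub repo id, which matches the "found in directory" wording of the error. The snippet below is a minimal diagnostic sketch, not part of transformers or diffusers; `diagnose_checkpoint_dir` is a hypothetical helper name, and the file names it looks for are assumed from the error message and the listing.

```python
import re
from pathlib import Path

# Shard naming pattern as it appears in the directory listing (assumption).
SHARD_RE = re.compile(r"^model-\d{5}-of-\d{5}\.safetensors$")


def diagnose_checkpoint_dir(folder):
    """Return a list of findings about a local checkpoint folder (hypothetical helper)."""
    folder = Path(folder)
    if not folder.is_dir():
        # No local folder of that name, so the string would be treated as a Hub repo id.
        return [f"{folder} is not a local directory"]

    names = {p.name for p in folder.iterdir() if p.is_file()}
    shards = sorted(n for n in names if SHARD_RE.match(n))
    has_single_file = "model.safetensors" in names or "pytorch_model.bin" in names
    has_index = "model.safetensors.index.json" in names

    findings = []
    if shards and not has_index and not has_single_file:
        findings.append(
            f"{len(shards)} shard(s) found but model.safetensors.index.json is missing"
        )
    elif not shards and not has_single_file:
        findings.append("no weight files found")
    return findings
```

Running `diagnose_checkpoint_dir("Lightricks/LTX-Video/text_encoder")` from `C:\ai1\LTX-Video` should report the missing index if the folder matches the listing. If so, re-downloading the `text_encoder` folder so that the index file is present, or moving the local `Lightricks` folder out of the working directory so the name resolves against the Hub, are the two things to try.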

System Info

Windows 11/ Python 3.10.11

(venv) C:\ai1\LTX-Video>pip list
Package            Version
------------------ ------------
accelerate         1.2.1
certifi            2024.12.14
charset-normalizer 3.4.0
colorama           0.4.6
diffusers          0.32.0.dev0
einops             0.8.0
filelock           3.16.1
fsspec             2024.12.0
gguf               0.13.0
huggingface-hub    0.25.2
idna               3.10
importlib_metadata 8.5.0
Jinja2             3.1.4
MarkupSafe         3.0.2
mpmath             1.3.0
networkx           3.4.2
numpy              2.2.0
packaging          24.2
pillow             11.0.0
pip                23.0.1
psutil             6.1.1
PyYAML             6.0.2
regex              2024.11.6
requests           2.32.3
safetensors        0.4.5
sentencepiece      0.2.0
setuptools         65.5.0
sympy              1.13.1
tokenizers         0.21.0
torch              2.5.1+cu124
torchvision        0.20.1+cu124
tqdm               4.67.1
transformers       4.47.1
typing_extensions  4.12.2
urllib3            2.2.3
wheel              0.45.1
zipp               3.21.0

Who can help?

import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video  # needed for the final export call
from transformers import T5EncoderModel, T5Tokenizer

single_file_url = "Lightricks/LTX-Video/ltx-video-2b-v0.9.1.safetensors"

# Load the text encoder and tokenizer separately; the single-file checkpoint does not include them.
text_encoder = T5EncoderModel.from_pretrained(
    "Lightricks/LTX-Video", subfolder="text_encoder", torch_dtype=torch.bfloat16
)
tokenizer = T5Tokenizer.from_pretrained(
    "Lightricks/LTX-Video", subfolder="tokenizer", torch_dtype=torch.bfloat16
)
pipe = LTXPipeline.from_single_file(
    single_file_url, text_encoder=text_encoder, tokenizer=tokenizer, torch_dtype=torch.bfloat16
)

pipe.enable_model_cpu_offload()
prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_ltx.mp4", fps=24)
