dependency issues with dreamvideo #143

Open · sburlce opened this issue Aug 5, 2024 · 1 comment

sburlce commented Aug 5, 2024

!pip install --upgrade pip setuptools wheel

# Update the package list

!apt-get update

# Install build tools and other dependencies

!apt-get install -y build-essential cmake pkg-config
!apt-get install -y libjpeg-dev libtiff-dev libpng-dev
!pip install scikit-build
!wget https://files.pythonhosted.org/packages/source/o/opencv-python/opencv-python-4.4.0.46.tar.gz
!tar -xzf opencv-python-4.4.0.46.tar.gz
%cd opencv-python-4.4.0.46
!python setup.py install
%cd ..
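
A quick sanity check that the source build is importable (a minimal sketch; it assumes the setup.py install above completed without errors):

!python -c "import cv2; print(cv2.__version__)"  # should report 4.4.0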

# Clone the specific version with submodules

!git clone --branch v0.0.13 --recurse-submodules https://github.com/facebookresearch/xformers.git

# Navigate into the directory

%cd xformers

# Install the package

!pip install .
%cd ..
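
To confirm the interpreter picks up this from-source build rather than a prebuilt wheel pulled in as a dependency earlier, a minimal check (assuming the build above succeeded):

!python -c "import torch, xformers; print(torch.__version__, xformers.__version__)"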

# packages claimed in requirements.txt

!pip install diffusers
!pip install easydict==1.10
!pip install tokenizers==0.12.1
!pip install "numpy>=1.19.2"  # quoted so the shell does not treat > as a redirect
!pip install ftfy==6.1.1
!pip install transformers==4.18.0
!pip install imageio==2.15.0
!pip install fairscale==0.4.6
!pip install ipdb
!pip install open-clip-torch==2.0.2
#!pip install xformers==0.0.13
!pip install chardet==5.1.0
!pip install torchdiffeq==0.2.3
#!pip install opencv-python==4.4.0.46
!pip install opencv-python-headless==4.7.0.68
!pip install torchsde==0.2.6
!pip install simplejson==3.18.4
!pip install motion-vector-extractor==1.0.6
!pip install scikit-learn
!pip install scikit-image
!pip install rotary-embedding-torch==0.2.1
!pip install pynvml==11.5.0
!pip install triton==2.0.0.dev20221120
!pip install pytorch-lightning==1.4.2
!pip install torchmetrics==0.6.0
!pip install gradio==3.39.0
!pip install imageio-ffmpeg
!pip install piq
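
After this many individual pins it is worth letting pip report any conflicting requirements before running inference; pip ships a built-in checker for exactly this:

!pip check  # prints packages whose declared dependency ranges are unsatisfied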

# I had issues with torchtext needing the legacy API, so I pinned these versions

!pip install torch==1.13.0
!pip install torchvision==0.14.0
!pip install torchaudio==0.13.0
!pip install torchdata==0.5.1
!pip install torchtext==0.14.0
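
A small sanity check after pinning the torch family (a sketch; the exact CUDA build depends on which wheels pip resolved on Colab):

import torch, torchvision, torchtext
print(torch.__version__, torchvision.__version__, torchtext.__version__)
print("CUDA available:", torch.cuda.is_available(), "| built for CUDA:", torch.version.cuda)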

The command that snagged:

!python inference.py --cfg configs/i2vgen_xl_infer.yaml

The resulting error:

usr/local/lib/python3.10/dist-packages/xformers/_C.so: undefined symbol: _ZN3c104impl3cow11cow_deleterEPv
WARNING:root:WARNING: /usr/local/lib/python3.10/dist-packages/xformers/_C.so: undefined symbol: _ZN3c104impl3cow11cow_deleterEPv
Need to compile C++ extensions to get sparse attention suport. Please run python setup.py build develop
[2024-08-05 04:34:39,675] INFO: {'name': 'Config: VideoLDM Decoder', 'mean': [0.5, 0.5, 0.5], 'std': [0.5, 0.5, 0.5], 'max_words': 1000, 'num_workers': 6, 'prefetch_factor': 2, 'resolution': [1280, 704], 'vit_out_dim': 1024, 'vit_resolution': [224, 224], 'depth_clamp': 10.0, 'misc_size': 384, 'depth_std': 20.0, 'frame_lens': [16, 16, 16, 16, 16, 32, 32, 32], 'sample_fps': [8, 8, 16, 16, 16, 8, 16, 16], 'vid_dataset': {'type': 'VideoDataset', 'data_list': ['data/vid_list.txt'], 'max_words': 1000, 'resolution': [1280, 704], 'data_dir_list': ['data/videos/'], 'vit_resolution': [224, 224], 'get_first_frame': True}, 'img_dataset': {'type': 'ImageDataset', 'data_list': ['data/img_list.txt'], 'max_words': 1000, 'resolution': [1280, 704], 'data_dir_list': ['data/images'], 'vit_resolution': [224, 224]}, 'batch_sizes': {'1': 32, '4': 8, '8': 4, '16': 2, '32': 1}, 'Diffusion': {'type': 'DiffusionDDIM', 'schedule': 'cosine', 'schedule_param': {'num_timesteps': 1000, 'cosine_s': 0.008, 'zero_terminal_snr': True}, 'mean_type': 'v', 'loss_type': 'mse', 'var_type': 'fixed_small', 'rescale_timesteps': False, 'noise_strength': 0.1, 'ddim_timesteps': 50}, 'ddim_timesteps': 50, 'use_div_loss': False, 'p_zero': 0.0, 'guide_scale': 9.0, 'vit_mean': [0.48145466, 0.4578275, 0.40821073], 'vit_std': [0.26862954, 0.26130258, 0.27577711], 'sketch_mean': [0.485, 0.456, 0.406], 'sketch_std': [0.229, 0.224, 0.225], 'hist_sigma': 10.0, 'scale_factor': 0.18215, 'use_checkpoint': True, 'use_sharded_ddp': False, 'use_fsdp': False, 'use_fp16': True, 'temporal_attention': True, 'UNet': {'type': 'UNetSD_I2VGen', 'in_dim': 4, 'dim': 320, 'y_dim': 1024, 'context_dim': 1024, 'out_dim': 4, 'dim_mult': [1, 2, 4, 4], 'num_heads': 8, 'head_dim': 64, 'num_res_blocks': 2, 'attn_scales': [1.0, 0.5, 0.25], 'dropout': 0.1, 'temporal_attention': True, 'temporal_attn_times': 1, 'use_checkpoint': True, 'use_fps_condition': False, 'use_sim_mask': False, 'upper_len': 128, 'concat_dim': 4, 'default_fps': 8}, 'guidances': [], 'auto_encoder': {'type': 'AutoencoderKL', 'ddconfig': {'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0, 'video_kernel_size': [3, 1, 1]}, 'embed_dim': 4, 'pretrained': 'models/v2-1_512-ema-pruned.ckpt'}, 'embedder': {'type': 'FrozenOpenCLIPTextVisualEmbedder', 'layer': 'penultimate', 'pretrained': '/content/drive/MyDrive/Colab Notebooks/VGen/models/open_clip_pytorch_model.bin', 'vit_resolution': [224, 224]}, 'ema_decay': 0.9999, 'num_steps': 1000000, 'lr': 3e-05, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'eps': 1e-08, 'chunk_size': 2, 'decoder_bs': 2, 'alpha': 0.7, 'save_ckp_interval': 50, 'warmup_steps': 10, 'decay_mode': 'cosine', 'use_ema': True, 'load_from': None, 'Pretrain': {'type': 'pretrain_specific_strategies', 'fix_weight': False, 'grad_scale': 0.5, 'resume_checkpoint': '/content/drive/MyDrive/Colab Notebooks/VGen/models/i2vgen_xl_00854500.pth', 'sd_keys_path': '/content/drive/MyDrive/Colab Notebooks/VGen/models/stable_diffusion_image_key_temporal_attention_x1.json'}, 'viz_interval': 50, 'visual_train': {'type': 'VisualTrainTextImageToVideo', 'partial_keys': [['y', 'image', 'local_image', 'fps']], 'use_offset_noise': True, 'guide_scale': 9.0}, 'visual_inference': {'type': 'VisualGeneratedVideos'}, 'inference_list_path': '', 'log_interval': 1, 'log_dir': 'workspace/experiments/test_list_for_i2vgen', 'reward_type': 'HPSv2', 'temporal_reward_type': [], 'data_align_method': None, 
'data_align_coef': 10, 'segments': 8, 'selection_method': 'fixed_first', 'exponential_TSN': True, 'lambda_TAR': 1.0, 'reward_normalization': False, 'positive_reward': False, 'partial_timestep': None, 'ddim_steps': [981, 961, 941, 921, 901, 881, 861, 841, 821, 801, 781, 761, 741, 721, 701, 681, 661, 641, 621, 601, 581, 561, 541, 521, 501, 481, 461, 441, 421, 401, 381, 361, 341, 321, 301, 281, 261, 241, 221, 201, 181, 161, 141, 121, 101, 81, 61, 41, 21, 1], 'motion_rep': None, 'low_penal_threshold': 0.05, 'reward_weights': {'reward': 1, 'reg': 1}, 'temp_dir': 'workspace/temp_dir', 'adv_clip_max': 5, 'ST_reward_weights': {'spatial': 1, 'temporal': 1}, 'seed': 8888, 'negative_prompt': 'Distorted, discontinuous, Ugly, blurry, low resolution, motionless, static, disfigured, disconnected limbs, Ugly faces, incomplete arms', 'ENABLE': True, 'DATASET': 'webvid10m', 'TASK_TYPE': 'inference_i2vgen_entrance', 'max_frames': 16, 'target_fps': 16, 'scale': 8, 'round': 4, 'batch_size': 1, 'use_zero_infer': True, 'vldm_cfg': 'configs/i2vgen_xl_train.yaml', 'test_list_path': 'data/test_list_for_i2vgen.txt', 'test_model': '/content/drive/MyDrive/Colab Notebooks/VGen/models/i2vgen_xl_00854500.pth', 'cfg_file': 'configs/i2vgen_xl_infer.yaml', 'init_method': 'tcp://localhost:9999', 'debug': False, 'opts': [], 'pmi_rank': 0, 'pmi_world_size': 1, 'gpus_per_machine': 1, 'world_size': 1, 'noise_strength': 0.1, 'gpu': 0, 'rank': 0, 'log_file': 'workspace/experiments/test_list_for_i2vgen/log_00.txt'}
[2024-08-05 04:34:39,677] INFO: Going into it2v_fullid_img_text inference on 0 gpu
[2024-08-05 04:34:39,691] INFO: Loading ViT-H-14 model config.
[2024-08-05 04:34:50,206] INFO: Loading pretrained ViT-H-14 weights (/content/drive/MyDrive/Colab Notebooks/VGen/models/open_clip_pytorch_model.bin).
Traceback (most recent call last):
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry.py", line 62, in build_from_config
return req_type_entry(**cfg)
File "/content/drive/MyDrive/Colab Notebooks/VGen/tools/modules/autoencoder.py", line 62, in init
self.init_from_ckpt(pretrained, ignore_keys=ignore_keys)
File "/content/drive/MyDrive/Colab Notebooks/VGen/tools/modules/autoencoder.py", line 65, in init_from_ckpt
sd = torch.load(path, map_location="cpu")["state_dict"]
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 789, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1131, in _load
result = unpickler.load()
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1124, in find_class
return super().find_class(mod_name, name)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/init.py", line 20, in
from pytorch_lightning import metrics # noqa: E402
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/metrics/init.py", line 15, in
from pytorch_lightning.metrics.classification import ( # noqa: F401
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/metrics/classification/init.py", line 14, in
from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in
from pytorch_lightning.metrics.utils import deprecated_metrics, void
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/metrics/utils.py", line 29, in
from pytorch_lightning.utilities import rank_zero_deprecation
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/init.py", line 18, in
from pytorch_lightning.utilities.apply_func import move_data_to_device # noqa: F401
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/apply_func.py", line 31, in
from torchtext.legacy.data import Batch
ModuleNotFoundError: No module named 'torchtext.legacy'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry.py", line 67, in build_from_config
return req_type_entry(**cfg)
File "/content/drive/MyDrive/Colab Notebooks/VGen/tools/inferences/inference_i2vgen_entrance.py", line 74, in inference_i2vgen_entrance
worker(0, cfg, cfg_update)
File "/content/drive/MyDrive/Colab Notebooks/VGen/tools/inferences/inference_i2vgen_entrance.py", line 145, in worker
autoencoder = AUTO_ENCODER.build(cfg.auto_encoder)
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry.py", line 107, in build
return self.build_func(*args, **kwargs, registry=self)
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry_class.py", line 7, in build_func
return build_from_config(cfg, registry, **kwargs)
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry.py", line 64, in build_from_config
raise Exception(f"Failed to init class {req_type_entry}, with {e}")
Exception: Failed to init class <class 'tools.modules.autoencoder.AutoencoderKL'>, with No module named 'torchtext.legacy'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/content/drive/MyDrive/Colab Notebooks/VGen/inference.py", line 18, in
INFER_ENGINE.build(dict(type=cfg_update.TASK_TYPE), cfg_update=cfg_update.cfg_dict)
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry.py", line 107, in build
return self.build_func(*args, **kwargs, registry=self)
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry_class.py", line 7, in build_func
return build_from_config(cfg, registry, **kwargs)
File "/content/drive/MyDrive/Colab Notebooks/VGen/utils/registry.py", line 69, in build_from_config
raise Exception(f"Failed to invoke function {req_type_entry}, with {e}")
Exception: Failed to invoke function <function inference_i2vgen_entrance at 0x7c4690bbb6d0>, with Failed to init class <class 'tools.modules.autoencoder.AutoencoderKL'>, with No module named 'torchtext.legacy'

I tried pinning an earlier combination of torch, torchvision, torchaudio, torchdata, and torchtext, but then it tells me the CUDA version is not compatible. Any help getting this running on Colab would be great.
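
For what it is worth, an undefined-symbol error from xformers/_C.so usually means the compiled extension was built against a different torch ABI than the one installed at runtime. Demangling the symbol shows what it expects (a diagnostic sketch; c++filt comes with the build-essential package installed above):

!c++filt _ZN3c104impl3cow11cow_deleterEPv
# prints c10::impl::cow::cow_deleter(void*), a c10 symbol that newer torch
# builds export, suggesting this _C.so was compiled against a newer torch
# than the 1.13.0 pinned here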

sburlce commented Aug 10, 2024

It appears that pytorch-lightning 1.4.2 uses the torchtext.legacy.data module. I am currently trying pytorch-lightning 1.6.5 and will update if that resolves the issue.
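
If upgrading pytorch-lightning drags in other conflicts, another option is to stub out the removed module before pytorch_lightning is imported. A minimal sketch, assuming nothing in the VGen inference path actually constructs torchtext batches (the Batch placeholder below is a hypothetical stand-in, not part of any library):

import sys, types

# Register dummy torchtext.legacy / torchtext.legacy.data modules so that
# pytorch-lightning 1.4.2's "from torchtext.legacy.data import Batch" succeeds
# even though torchtext >= 0.12 removed the legacy API.
legacy = types.ModuleType("torchtext.legacy")
legacy_data = types.ModuleType("torchtext.legacy.data")

class Batch:  # placeholder standing in for the removed class
    pass

legacy_data.Batch = Batch
legacy.data = legacy_data
sys.modules["torchtext.legacy"] = legacy
sys.modules["torchtext.legacy.data"] = legacy_data

import pytorch_lightning  # should now import without ModuleNotFoundError

Note that this has to run in the same process as the failing import, so on Colab it would need to be pasted at the top of inference.py rather than executed in a separate cell, since each ! command spawns a fresh interpreter.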
