
Possible to use LLM online API? #237

Open
SunnyPai0413 opened this issue Dec 29, 2024 · 4 comments

Comments

@SunnyPai0413

Hi HunyuanVideoWrapper Team,
Sorry to open an issue for this, but I couldn't find a Discussions section in this repository.
I saw that a llava-llama-3-8b-text-encoder-tokenizer is loaded as the text encoder. On my low-VRAM GPU (P100, 16 GB), the TextEncode step takes about half an hour, while the video sampler takes only about 2 minutes. Could I use an online API instead to reduce VRAM usage? That would save roughly 90% of the time in a batch generation run.
Thank you again for your contributions and for answering our questions. Have a nice day!
SunnyPai
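Until the encoder can be offloaded or replaced, one workaround for batch runs is to cache the text-encoder output per prompt, so the expensive TextEncode step runs only once per unique prompt. Below is a minimal sketch of the idea; `encode_text` is a cheap stand-in for the real llava-llama-3 encoder, and all names and shapes here are illustrative, not the wrapper's actual API:

```python
import hashlib

# Stand-in for the real llava-llama-3 text encoder (hypothetical):
# in practice this would be the slow TextEncode step.
def encode_text(prompt: str) -> list[float]:
    digest = hashlib.sha256(prompt.encode()).digest()
    return [b / 255.0 for b in digest[:8]]

_cache: dict[str, list[float]] = {}
calls = 0  # counts how many times the "expensive" encoder actually ran

def encode_cached(prompt: str) -> list[float]:
    """Encode a prompt, reusing the cached embedding for repeated prompts."""
    global calls
    if prompt not in _cache:
        calls += 1
        _cache[prompt] = encode_text(prompt)
    return _cache[prompt]

# A batch run with repeated prompts pays the encoder cost once per unique prompt.
prompts = ["a dancer spinning", "a dancer spinning", "waves at sunset"]
embeddings = [encode_cached(p) for p in prompts]
print(calls)  # 2 encoder calls for 3 prompts
```

With the real encoder, the cached embeddings could also be saved to disk between runs, so a repeated batch never touches the text encoder at all.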

@DocShotgun

I'm curious what resolution/frames/steps settings you are using that would only require 2 minutes for the video sampler. For me, the time spent during generation is probably like 95% on video sampling, 4% on VAE decoding, and <1% on text encoding.

@kurttu4

kurttu4 commented Dec 30, 2024

Maybe add support for quantized GGUF versions of the model: https://huggingface.co/IbnAbdeen/llava-llama-3-8b-text-encoder-tokenizer-Q4_K_M-GGUF

@SunnyPai0413
Author

> I'm curious what resolution/frames/steps settings you are using that would only require 2 minutes for the video sampler. For me, the time spent during generation is probably like 95% on video sampling, 4% on VAE decoding, and <1% on text encoding.

Just 96×160 resolution, 30 steps, 21 frames at 9 fps.

@SunnyPai0413
Author

> Maybe add support for quantized GGUF versions of the model: https://huggingface.co/IbnAbdeen/llava-llama-3-8b-text-encoder-tokenizer-Q4_K_M-GGUF

Thank you, I'll give that Q4 quantized LLM a try. I'm working on a model for movement generation and plan to build its dataset using this model.
