Update readme.md
Dango233 authored Dec 15, 2024
1 parent c9bd1c1 commit 9c8598e
Showing 1 changed file with 7 additions and 3 deletions.
# ComfyUI wrapper nodes for [HunyuanVideo](https://github.com/Tencent/HunyuanVideo)

## WORK IN PROGRESS


# Experimental IP2V - Image Prompting to Video via VLM by @Dango233
## WORK IN PROGRESS - But it should work now!

NOTE:
- Minimum 20GB VRAM required (VLM quantization not implemented yet)
- This changes the behavior of @kijai's original nodes quite a bit. So if you want to test this feature, repoint your git remote to this branch and pull the updates, or simply delete the original repo and clone this one, until the PR is merged into Kijai's repo.

Now you can feed an image to the VLM as a condition for generation! This is different from image2video, where the image becomes the first frame of the video. IP2V uses the image as part of the prompt, to extract the concept and style of the image.
So - very much like IPAdapter - but the VLM does the heavy lifting for you!
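
For intuition, here is a minimal, hypothetical sketch (not the actual node code in this wrapper) of how a LLaVA-style VLM can fold an image into the text prompt using Hugging Face `transformers`; the hidden states of the combined image+text sequence are the kind of embeddings that can then act as conditioning:

```python
# Hypothetical sketch only - the wrapper nodes handle this internally.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "xtuner/llava-llama-3-8b-v1_1-transformers"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("reference.png")  # the style/concept reference image
prompt = "<image>\nA cinematic video in the style of this image."

# The image is turned into vision embeddings and spliced into the prompt,
# so it conditions generation as if it were part of the text itself.
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

conditioning = out.hidden_states[-1]  # image+text embeddings for downstream use
```

How these embeddings are selected and wired into the sampler is handled by the nodes; the sketch only illustrates the image-as-prompt idea.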
Now this is a tuning-free approach, but with further task-specific tuning we can
<img src="examples/ip2v/example_output_with_workflow.png" height=256></img>
----



# Guide to Using `xtuner/llava-llama-3-8b-v1_1-transformers` for Image-Text Tasks

## Step 1: Model Selection
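
If you prefer to fetch the model ahead of time, a minimal sketch with `huggingface_hub` is shown below; the target folder is an assumption, so point it to wherever your ComfyUI install keeps LLM/VLM weights:

```python
# Assumed local path - check where your ComfyUI setup expects VLM/LLM weights.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="xtuner/llava-llama-3-8b-v1_1-transformers",
    local_dir="ComfyUI/models/LLM/llava-llama-3-8b-v1_1-transformers",
)
```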