Add Video Llava #29733
Merged
57 commits
dce6678
add model draft
zucchini-nlp 72626df
update docstring
zucchini-nlp 8cca731
add tests
zucchini-nlp 4ea4f70
support image and video as input
zucchini-nlp c36819d
update for better handling of mixed input and clean-up a bit
zucchini-nlp c1a8fd5
bug when mixed inputs & add tests
zucchini-nlp c591c75
Update README.md
zucchini-nlp 5ff8d18
Merge remote-tracking branch 'upstream/main' into video_llava
zucchini-nlp a6bc68d
link to abstract of paper in README
zucchini-nlp eb309ed
fix test
zucchini-nlp 2f46f6c
fix-copies
zucchini-nlp 6b51b7e
Merge branch 'main' into video_llava
zucchini-nlp e112958
make tests happy
zucchini-nlp 5cb6163
skip docstest for now
zucchini-nlp 930147d
do not run doctest for now
zucchini-nlp 24ec2b3
Merge remote-tracking branch 'upstream/main' into video_llava
zucchini-nlp 142bfc0
Update src/transformers/models/video_llava/processing_video_llava.py
zucchini-nlp fdec895
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp e83251c
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 4fcfe72
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 327030d
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 33289a5
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp dfef75a
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp ebf1042
address review comments
zucchini-nlp aa1b278
failing tests
zucchini-nlp 7802922
Fix vocab_size in common tests for VLMs
zucchini-nlp 9fce414
codestyle
zucchini-nlp e8b4569
Merge branch 'huggingface:main' into video_llava
zucchini-nlp bb1cc26
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp e2e92b2
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp 5c77fff
Update src/transformers/models/video_llava/modeling_video_llava.py
zucchini-nlp 99518cb
Update src/transformers/models/video_llava/modeling_video_llava.py
zucchini-nlp 451fd72
Update docs/source/en/model_doc/video_llava.md
zucchini-nlp 95a9a01
Update docs/source/en/model_doc/video_llava.md
zucchini-nlp 347fa8c
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 3e2f1b4
Update docs/source/en/model_doc/video_llava.md
zucchini-nlp 3cd1222
Update src/transformers/models/video_llava/processing_video_llava.py
zucchini-nlp 242703a
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp 9c1a10d
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp b4145e1
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp 5803d5a
PR suggestions
zucchini-nlp 975d959
fix-copies
zucchini-nlp 7f30e3b
Merge branch 'main' into video_llava
zucchini-nlp 6bdad81
Merge branch 'huggingface:main' into video_llava
zucchini-nlp a817f31
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp dba80e2
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp 6b3eafb
Merge remote-tracking branch 'upstream/main' into video_llava
zucchini-nlp ba4e125
add full example in docs
zucchini-nlp 6cc8af1
clean-up with new model-id
zucchini-nlp 885a5ae
[run-slow] video_llava
zucchini-nlp 377aafe
update docstring
zucchini-nlp 637b197
Merge branch 'main' into video_llava
zucchini-nlp a411347
[run-slow] video_llava
zucchini-nlp 0d83eaf
Merge branch 'huggingface:main' into video_llava
zucchini-nlp 8134039
remove all achive maps
zucchini-nlp 8e15514
fix some tests
zucchini-nlp 5d1e976
test was supposed to be skipped for llava :)
zucchini-nlp
fix some tests
commit 8e15514e09bbbf6f076bafad05f98cabfd2db3ce
Will it always be 8?
Yes, VideoLlava was trained with 8 video frames and has to be used with 8. I will add this to the "usage tips" section of the model docs page.
In this case, we should validate this at the start of the validate call and raise an exception if the input isn't the correct shape
Has this been added? Skimming, I didn't spot it, but I might have just missed it.
It was added in `_get_vision_features()`; after that point we can never know how many frames we have.
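For illustration, the kind of early frame-count check discussed in this thread could look roughly like the sketch below. This is not the code from the PR: `validate_num_frames`, `EXPECTED_NUM_FRAMES`, and the `(batch, frames, channels, height, width)` layout are hypothetical names and assumptions made for the example. Only the requirement of 8 frames comes from the discussion above.

```python
from typing import Sequence

# Assumption for this sketch: the model expects exactly 8 frames per video,
# as stated in the review thread.
EXPECTED_NUM_FRAMES = 8


def validate_num_frames(video_shape: Sequence[int], expected: int = EXPECTED_NUM_FRAMES) -> None:
    """Raise ValueError unless the frames axis has the expected length.

    `video_shape` is assumed to be (batch, frames, channels, height, width);
    this layout is a hypothetical choice for the example.
    """
    if len(video_shape) != 5:
        raise ValueError(
            "Expected a 5D video shape (batch, frames, channels, height, width), "
            f"got {tuple(video_shape)}"
        )
    num_frames = video_shape[1]
    if num_frames != expected:
        raise ValueError(f"Expected {expected} frames per video, got {num_frames}")


# A correctly shaped input passes silently.
validate_num_frames((1, 8, 3, 224, 224))
```

Running the check on the raw input shape, before the frames are merged with other dimensions, keeps the error message specific: once features are flattened, the frame count can no longer be recovered.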