Add Video Llava #29733
Merged
57 commits
dce6678
add model draft
zucchini-nlp 72626df
update docstring
zucchini-nlp 8cca731
add tests
zucchini-nlp 4ea4f70
support image and video as input
zucchini-nlp c36819d
update for better handling of mixed input and clean-up a bit
zucchini-nlp c1a8fd5
bug when mixed inputs & add tests
zucchini-nlp c591c75
Update README.md
zucchini-nlp 5ff8d18
Merge remote-tracking branch 'upstream/main' into video_llava
zucchini-nlp a6bc68d
link to abstract of paper in README
zucchini-nlp eb309ed
fix test
zucchini-nlp 2f46f6c
fix-copies
zucchini-nlp 6b51b7e
Merge branch 'main' into video_llava
zucchini-nlp e112958
make tests happy
zucchini-nlp 5cb6163
skip docstest for now
zucchini-nlp 930147d
do not run doctest for now
zucchini-nlp 24ec2b3
Merge remote-tracking branch 'upstream/main' into video_llava
zucchini-nlp 142bfc0
Update src/transformers/models/video_llava/processing_video_llava.py
zucchini-nlp fdec895
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp e83251c
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 4fcfe72
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 327030d
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 33289a5
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp dfef75a
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp ebf1042
address review comments
zucchini-nlp aa1b278
failing tests
zucchini-nlp 7802922
Fix vocab_size in common tests for VLMs
zucchini-nlp 9fce414
codestyle
zucchini-nlp e8b4569
Merge branch 'huggingface:main' into video_llava
zucchini-nlp bb1cc26
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp e2e92b2
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp 5c77fff
Update src/transformers/models/video_llava/modeling_video_llava.py
zucchini-nlp 99518cb
Update src/transformers/models/video_llava/modeling_video_llava.py
zucchini-nlp 451fd72
Update docs/source/en/model_doc/video_llava.md
zucchini-nlp 95a9a01
Update docs/source/en/model_doc/video_llava.md
zucchini-nlp 347fa8c
Update src/transformers/models/video_llava/image_processing_video_lla…
zucchini-nlp 3e2f1b4
Update docs/source/en/model_doc/video_llava.md
zucchini-nlp 3cd1222
Update src/transformers/models/video_llava/processing_video_llava.py
zucchini-nlp 242703a
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp 9c1a10d
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp b4145e1
Update tests/models/video_llava/test_modeling_video_llava.py
zucchini-nlp 5803d5a
PR suggestions
zucchini-nlp 975d959
fix-copies
zucchini-nlp 7f30e3b
Merge branch 'main' into video_llava
zucchini-nlp 6bdad81
Merge branch 'huggingface:main' into video_llava
zucchini-nlp a817f31
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp dba80e2
Update src/transformers/models/video_llava/configuration_video_llava.py
zucchini-nlp 6b3eafb
Merge remote-tracking branch 'upstream/main' into video_llava
zucchini-nlp ba4e125
add full example in docs
zucchini-nlp 6cc8af1
clean-up with new model-id
zucchini-nlp 885a5ae
[run-slow] video_llava
zucchini-nlp 377aafe
update docstring
zucchini-nlp 637b197
Merge branch 'main' into video_llava
zucchini-nlp a411347
[run-slow] video_llava
zucchini-nlp 0d83eaf
Merge branch 'huggingface:main' into video_llava
zucchini-nlp 8134039
remove all achive maps
zucchini-nlp 8e15514
fix some tests
zucchini-nlp 5d1e976
test was supposed to be skipped for llava :)
zucchini-nlp
Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
commit 347fa8c6b146d38094f838c6a88fe35b618cf8e4
We should add tests for the image processor - in particular to test that it correctly handles just images, just videos and image + video inputs
Added tests, but there is one thing to note. If we call the ImageProcessor class directly, it requires the argument `images` to be present. A workaround is to pass `images=None` explicitly for VideoLlavaImageProcessor, which I did for the tests. I can override `__call__` to give the argument a default of `images=None` so that it is optional, but I am not sure how good it is to override `__call__`. Also, I do not think many people call the image processor explicitly.
If the image processor takes both `images` and `videos` as input, and only one of them is required, then setting `images=None` seems reasonable.
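To illustrate the option discussed above, here is a minimal sketch of a processor whose `__call__` defaults both `images` and `videos` to `None` and requires at least one of them. It is not the code from this PR: `ToyImageVideoProcessor`, its `_rescale` helper, and the output key names are hypothetical stand-ins, and plain lists of floats stand in for real image and video arrays.

```python
from typing import Dict, List, Optional

Image = List[float]   # assumption: an image is a flat list of pixel values
Video = List[Image]   # assumption: a video is a list of frames


class ToyImageVideoProcessor:
    """Hypothetical processor that accepts images, videos, or both."""

    def __call__(
        self,
        images: Optional[List[Image]] = None,
        videos: Optional[List[Video]] = None,
    ) -> Dict[str, list]:
        # Both arguments are optional, but at least one must be provided.
        if images is None and videos is None:
            raise ValueError("Pass at least one of `images` or `videos`.")

        output: Dict[str, list] = {}
        if images is not None:
            output["pixel_values_images"] = [self._rescale(img) for img in images]
        if videos is not None:
            output["pixel_values_videos"] = [
                [self._rescale(frame) for frame in clip] for clip in videos
            ]
        return output

    @staticmethod
    def _rescale(pixels: Image) -> Image:
        # Map 0..255 pixel values to the 0..1 range.
        return [p / 255.0 for p in pixels]


if __name__ == "__main__":
    processor = ToyImageVideoProcessor()
    image = [0.0, 255.0]
    clip = [[255.0, 0.0], [0.0, 0.0]]

    # The three cases the review asks tests to cover.
    print(sorted(processor(images=[image])))
    print(sorted(processor(videos=[clip])))
    print(sorted(processor(images=[image], videos=[clip])))
```

With this signature a caller who has only videos can write `processor(videos=...)` without passing `images=None` explicitly, and an empty call fails with a clear error instead of a missing-argument `TypeError`.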