[Fix] ViViT interpolate_pos_encoding #33815
@@ -359,12 +359,12 @@ def test_inference_interpolate_pos_encoding(self):
         # allowing to interpolate the pre-trained position embeddings in order to use
         # the model on higher resolutions. The DINO model by Facebook AI leverages this
         # to visualize self-attention on higher resolution images.
-        model = VivitModel.from_pretrained("google/vivit-b-16x2").to(torch_device)
+        model = VivitModel.from_pretrained("google/vivit-b-16x2-kinetics400").to(torch_device)

-        image_processor = VivitImageProcessor.from_pretrained("google/vivit-b-16x2")
+        image_processor = VivitImageProcessor.from_pretrained("google/vivit-b-16x2-kinetics400")
         video = prepare_video()
         inputs = image_processor(
-            video, size={"shortest_edge": 480}, crop_size={"height": 480, "width": 480}, return_tensors="pt"
+            video, size={"shortest_edge": 480}, crop_size={"height": 232, "width": 232}, return_tensors="pt"
         )
         pixel_values = inputs.pixel_values.to(torch_device)

Review comments on the `crop_size` line:

`crop_size` option should still be included in the test, as this will force the interpolation

sure, let me push the commit in a second

done 👍
Why change the checkpoint and the `crop_size` in the test?

@amyeroberts [...] PR Slow CI that you triggered (the one I mentioned above). Certain `crop_size` values lead to an error when the interpolation method is called. For example, with `crop_size={"height": 480, "width": 480}` the following error occurs: [...] The same happens with some other crop sizes as well, but the error doesn't occur for a `crop_size` like 232 or 228, or even the default `crop_size` of 224.

OK, thanks for explaining. The error shouldn't be triggered for the default `crop_size` value (no interpolation should happen), but if it works for these non-default values then it's all good :)

The default `crop_size` value is 224 in all the image processing files, right? The error doesn't occur for that value, though. The value in the test file (232) is only given so that the error doesn't occur and for the sake of testing a value other than the default one.
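
For reference, here is a rough sketch of how the token count changes with `crop_size`, which is what determines whether the position embeddings need resizing at all. It is plain arithmetic, not the library's code. The helper `vivit_token_count` is hypothetical, and the defaults (32 frames, a 2x16x16 tubelet, non-overlapping patches with floor division, one extra class token) are assumptions about the `vivit-b-16x2` configuration.

```python
from typing import Tuple


def vivit_token_count(
    height: int,
    width: int,
    num_frames: int = 32,                       # assumed default for vivit-b-16x2
    tubelet: Tuple[int, int, int] = (2, 16, 16),  # assumed (time, height, width) tubelet
) -> Tuple[int, int, int, int]:
    """Return (temporal, vertical, horizontal, total) token counts.

    Hypothetical helper: assumes non-overlapping tubelets, so each axis
    contributes floor(size / tubelet_size) patches, plus one class token.
    """
    t, ph, pw = tubelet
    n_t = num_frames // t
    n_h = height // ph
    n_w = width // pw
    total = n_t * n_h * n_w + 1  # +1 for the class token
    return n_t, n_h, n_w, total


if __name__ == "__main__":
    for size in (224, 232, 480):
        print(size, vivit_token_count(size, size))
```

Under these assumptions, 480 gives a 30x30 spatial grid instead of the pre-trained 14x14, so the position embeddings would have to be resized. 232 still gives 14x14 because `232 // 16 == 14`, which may be why that value does not hit the error.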