Add Video Vision Transformer implementation #62

Merged: 17 commits into SforAiDl:main, Feb 1, 2022

abhi-glitchhg (Member) commented Jan 15, 2022

#26

abhi-glitchhg marked this pull request as draft January 15, 2022 11:01

codecov-commenter commented Jan 15, 2022

Codecov Report

Merging #62 (2692f27) into main (e968171) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main       #62   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           59        61    +2     
  Lines         1696      1749   +53     
=========================================
+ Hits          1696      1749   +53     
| Impacted Files | Coverage Δ |
|---|---|
| vformer/models/classification/cross.py | 100.00% <ø> (ø) |
| vformer/models/classification/vanilla.py | 100.00% <ø> (ø) |
| vformer/common/base_model.py | 100.00% <100.00%> (ø) |
| vformer/encoder/embedding/__init__.py | 100.00% <100.00%> (ø) |
| ...former/encoder/embedding/video_patch_embeddings.py | 100.00% <100.00%> (ø) |
| vformer/models/classification/__init__.py | 100.00% <100.00%> (ø) |
| vformer/models/classification/vivit.py | 100.00% <100.00%> (ø) |

NeelayS changed the title from "Add Video Vision Transformer" to "Add Video Vision Transformer implementation" Jan 15, 2022
NeelayS marked this pull request as ready for review February 1, 2022 04:52


def test_TubeletEmbedding():
    from vformer.encoder.embedding.video_patch_embeddings import TubeletEmbedding

Can you please move this import to the top of the file since we've been doing that for the rest of the test functions?

@@ -496,3 +496,26 @@ def test_ConvVT():
     out = model(img2)
     assert out.shape == torch.Size([4, 1000])
     del model


def test_Vivit():

Suggested change
- def test_Vivit():
+ def test_ViViT():

NeelayS (Member) commented Feb 1, 2022

Hi @abhi-glitchhg, I have added a couple of suggestions.
Apart from those, could you please:

  • Delete the empty encoder/vivit.py file?
  • Add doc and README entries for ViViT?

Thanks.

NeelayS linked an issue Feb 1, 2022 that may be closed by this pull request
NeelayS merged commit 36102f3 into SforAiDl:main Feb 1, 2022

Successfully merging this pull request may close these issues.

[Paper] ViViT: A Video Vision Transformer
3 participants