Figure 4 in paper may be wrong according to the code? #13

Open
XuanxuanGao opened this issue Jun 30, 2022 · 7 comments
Comments

@XuanxuanGao

According to the code, shouldn't Stage 1 in Figure 4 (left) of the paper drop the Linear Embedding and take the Patch Embedding inside the stage?

Looking forward to your reply~

@CarpeDiemly

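I noticed this problem too. The code uses an if/else statement, so in the first stage only one of patch embedding and linear embedding can exist, rather than the two running in series as drawn in Fig. 4.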

@mywebinfo65536

Yes, I have the same question.

@drkostas

I have the same question. It's even more confusing because in this code snippet:

if i == 0:
    patch_embed = Head(num_conv)
else:
    patch_embed = OverlapPatchEmbed(
        img_size=img_size if i == 0 else img_size // (2 ** (i + 1)),
        patch_size=7 if i == 0 else 3,
        stride=4 if i == 0 else 2,
        in_chans=in_chans if i == 0 else embed_dims[i - 1],
        embed_dim=embed_dims[i])

the individual "if i == 0 else" conditionals inside the OverlapPatchEmbed call are dead code: the i == 0 alternatives can never be selected, because this branch is only reached when i != 0.
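To make the point concrete, here is a minimal sketch that compares the arguments as written in the snippet with a version where the unreachable i == 0 alternatives are dropped. It builds plain dictionaries instead of calling OverlapPatchEmbed, so it runs without the repository. The helper names original_kwargs and simplified_kwargs and the configuration values are hypothetical, chosen only for illustration.

```python
def original_kwargs(i, img_size, in_chans, embed_dims):
    """Arguments exactly as written in the quoted else-branch."""
    return dict(
        img_size=img_size if i == 0 else img_size // (2 ** (i + 1)),
        patch_size=7 if i == 0 else 3,
        stride=4 if i == 0 else 2,
        in_chans=in_chans if i == 0 else embed_dims[i - 1],
        embed_dim=embed_dims[i],
    )


def simplified_kwargs(i, img_size, embed_dims):
    """Same arguments with the unreachable i == 0 alternatives removed."""
    assert i != 0, "stage 0 uses Head in the quoted snippet"
    return dict(
        img_size=img_size // (2 ** (i + 1)),
        patch_size=3,
        stride=2,
        in_chans=embed_dims[i - 1],
        embed_dim=embed_dims[i],
    )


# Hypothetical configuration, not taken from the repository.
img_size, in_chans, embed_dims = 224, 3, [64, 128, 256, 512]

# For every stage that can reach the else-branch, both versions agree.
for i in range(1, len(embed_dims)):
    assert original_kwargs(i, img_size, in_chans, embed_dims) == \
        simplified_kwargs(i, img_size, embed_dims)
```

Note that in_chans is no longer needed in the simplified version, since it was only ever used in the i == 0 alternative.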

@OliverRensu
Owner

In the first stage, we use a conv stem (Head) in place of the naive non-overlapping patch embedding from ViT. In the following stages, we use OverlapPatchEmbed.

@OliverRensu
Owner

You can regard the Head as the combination of the Patch Embedding and the Linear Embedding.

@drkostas

Thanks for your reply. Since I can regard it as a combination of patch and linear embedding, is there a way to calculate the number of patches like how it's calculated in the OverlapPatchEmbed?

@mywebinfo65536

Yes, I have the same question: how can we get the number of patches from the Head in Stage 1?
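One general way to approach this, not confirmed by the maintainers: for any stack of convolutions, the token count is the spatial size of the final feature map, and each layer follows the standard output-size formula floor((size + 2 * padding - kernel) / stride) + 1. The sketch below applies that formula layer by layer. The layer list used for the stem is a hypothetical example, not the actual configuration of Head in this repository, and the padding of 1 for the 3x3 overlap embedding is also an assumption, since the quoted snippet does not show it.

```python
def conv_out_size(size, kernel, stride, padding):
    """Output length of one convolution along a single spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def num_patches(img_size, layers):
    """Return (height, width, token count) after a stack of conv layers.

    `layers` is a list of (kernel, stride, padding) tuples applied in order
    to a square input of side `img_size`.
    """
    h = w = img_size
    for kernel, stride, padding in layers:
        h = conv_out_size(h, kernel, stride, padding)
        w = conv_out_size(w, kernel, stride, padding)
    return h, w, h * w


# Hypothetical conv stem with an overall stride of 4 (NOT the repo's Head):
# 7x7 stride 2, then 3x3 stride 1, then 3x3 stride 2.
stem = [(7, 2, 3), (3, 1, 1), (3, 2, 1)]
print(num_patches(224, stem))       # (56, 56, 3136)

# A later stage: 3x3 conv with stride 2, padding 1 assumed.
print(num_patches(56, [(3, 2, 1)])) # (28, 28, 784)
```

To apply this to the real Head, replace the stem list with the kernel, stride, and padding of each convolution in its definition.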
