VQGAN model version #390

Closed
PhoebusSi opened this issue Apr 27, 2023 · 3 comments

Comments

@PhoebusSi

Which version of the VQGAN model does OFA use? Could you upload the checkpoint and config.yaml files, or provide a link to them?

@PhoebusSi
Author

PhoebusSi commented Apr 27, 2023

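I used vqgan/last.ckpt and vqgan/model.yaml from the checkpoint zip file you provided, image_gen_large_best.zip, but encoding a 256x256 image this way gives a token sequence of length 32x32=1024, not the 16x16=256 stated in the paper. What is causing the difference?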

@PhoebusSi
Author

Alternatively, what is the resolution of the image that the code sequence here (length 1024) corresponds to? Is it 256?
image

@logicwong
Member

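@PhoebusSi Encoding a 256x256 image directly does indeed give a sequence of length 1024. Pretraining uses image infilling, i.e. recovering the codes for the central part of the image; only the central region of the image (128x128 resolution) encodes to a length of 256.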
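
The arithmetic behind the two lengths can be sketched as follows. This is only an illustration: the downsampling factor of 8 is inferred from the numbers in this thread (256 pixels per side giving 32 codes per side), not read from vqgan/model.yaml, and the helper names `code_grid` and `center_crop_box` are hypothetical, not part of OFA.

```python
from typing import Tuple


def code_grid(resolution: int, downsample: int = 8) -> Tuple[int, int]:
    """Return (codes per side, total sequence length) for a square image.

    `downsample` is an assumption inferred from the thread: 256 / 32 = 8.
    """
    if resolution % downsample != 0:
        raise ValueError("resolution must be a multiple of the downsampling factor")
    side = resolution // downsample
    return side, side * side


def center_crop_box(resolution: int, crop: int) -> Tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) pixel box of the central crop."""
    offset = (resolution - crop) // 2
    return offset, offset, offset + crop, offset + crop


# Full 256x256 image: 32 codes per side, 1024 codes in total.
full_side, full_len = code_grid(256)

# Central 128x128 region: 16 codes per side, 256 codes in total.
mid_side, mid_len = code_grid(128)

# Pixel coordinates of the central 128x128 region inside a 256x256 image.
box = center_crop_box(256, 128)

print(full_side, full_len, mid_side, mid_len, box)
```

Under that assumed factor, the 16x16=256 figure in the paper matches the central 128x128 region, while the full 256x256 image gives 32x32=1024.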
