VQGAN model version #390

PhoebusSi · 2023-04-27T08:02:25Z

Which version of the VQGAN model does OFA use? Could you upload the checkpoint and config.yaml files, or provide a link to them?

PhoebusSi · 2023-04-27T12:40:21Z

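I used vqgan/last.ckpt and vqgan/model.yaml from the checkpoint zip file you provided, image_gen_large_best.zip. With these, encoding a 256x256 image produces a token sequence of length 32x32=1024, not the 16x16=256 stated in the paper. Where is the problem?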

PhoebusSi · 2023-04-27T13:10:36Z

Alternatively, what image resolution does this code sequence (length 1024) correspond to? Is it 256?

logicwong · 2023-05-19T09:39:18Z

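@PhoebusSi Encoding a 256x256 image directly does indeed give a sequence of length 1024. Pretraining uses image infilling, i.e. the model restores the codes of the central part of the image. Only that central region (128x128 resolution) is encoded, which is what yields a length of 256.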
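The arithmetic behind the two sequence lengths can be checked with a short sketch. The downsampling factor of 8 is inferred from the numbers in this thread (256x256 input giving a 32x32 code grid) and is not read from the model's config. The helper names `code_grid` and `center_box`, and the assumption that the 128x128 region is exactly centered, are hypothetical and for illustration only.

```python
def code_grid(resolution: int, downsample: int = 8) -> tuple[int, int]:
    """Return (grid side, sequence length) for a square image.

    downsample=8 is an assumption inferred from 256 -> 32 in the thread.
    """
    if resolution % downsample != 0:
        raise ValueError("resolution must be divisible by the downsample factor")
    side = resolution // downsample
    return side, side * side


def center_box(image_size: int, crop_size: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered square crop.

    Assumes the infilled region is exactly centered (hypothetical).
    """
    offset = (image_size - crop_size) // 2
    return offset, offset, offset + crop_size, offset + crop_size


print(code_grid(256))        # full 256x256 image: (32, 1024)
print(code_grid(128))        # central 128x128 region: (16, 256)
print(center_box(256, 128))  # (64, 64, 192, 192)
```

Under these assumptions, the 256 tokens mentioned in the paper correspond to the 16x16 code grid of the central region, while the full image always encodes to 1024 tokens.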

logicwong closed this as completed Sep 1, 2023