
A question about image crispness with VQ-VAE2 #43


@francoisruty

Hello, first of all, thanks for this interesting implementation of the VQ-VAE-2 paper.

I can train this network on a dataset of mine; however, the reconstructed images are a little blurry. Overall quality is good, but the crispness is nowhere near what is published in the paper.

My understanding of image blurriness with VAEs in general is that it is caused by information loss (due to the information bottleneck) and by the use of an MSE loss (which averages the error over the image).
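
For concreteness, here is a minimal sketch (in PyTorch, not this repo's exact code) of the VQ-VAE training objective as I understand it from the original paper; the function name and tensor shapes are mine. The point is that the reconstruction term is still a plain pixel-wise MSE, so the usual MSE-blurriness argument seems to apply unchanged. (I believe some implementations, possibly including this one, update the codebook with an EMA instead of the explicit codebook term, but that doesn't change the reconstruction term.)

```python
# Hypothetical sketch of the VQ-VAE objective, not this repository's code.
import torch
import torch.nn.functional as F

def vqvae_loss(x, x_recon, z_e, z_q, beta=0.25):
    # x:       original images, shape (B, C, H, W)
    # x_recon: decoder output, same shape as x
    # z_e:     continuous encoder outputs before quantization
    # z_q:     quantized latents (nearest codebook vectors)
    recon = F.mse_loss(x_recon, x)              # pixel-wise MSE, averaged over the image
    codebook = F.mse_loss(z_q, z_e.detach())    # pulls codebook entries toward encoder outputs
    commit = F.mse_loss(z_e, z_q.detach())      # keeps encoder outputs close to the codebook
    return recon + codebook + beta * commit
```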

To me, the VQ-VAE-2 paper, compared to a classic VAE, introduces two new concepts: hierarchical latent maps and quantization.
However, I don't see why those two innovations would solve the classic VAE reconstruction-blurriness problem.
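
To make the quantization point concrete, here is a rough sketch of the nearest-neighbour lookup with a straight-through gradient, with shapes and names assumed by me. Quantization discretizes the latent, but the decoder is still trained against the same MSE target, so I don't see where the extra sharpness would come from.

```python
# Hypothetical sketch of the quantization step, not this repository's code.
import torch

def quantize(z_e, codebook):
    # z_e:      (B, H, W, D) continuous encoder outputs
    # codebook: (K, D) embedding vectors
    flat = z_e.reshape(-1, z_e.shape[-1])          # (B*H*W, D)
    dist = torch.cdist(flat, codebook)             # distances to all K codebook entries
    idx = dist.argmin(dim=1)                       # nearest-neighbour indices
    z_q = codebook[idx].reshape(z_e.shape)         # snapped (quantized) latents
    z_q = z_e + (z_q - z_e).detach()               # straight-through estimator for gradients
    return z_q, idx.reshape(z_e.shape[:-1])
```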

This implementation performs just as I expected, which means it's good, but the reconstructed images are not as sharp as the ones in the paper.
I don't think this is a problem with the implementation: the original VQ-VAE-2 paper does not explain why its architecture would yield sharp reconstructions. I've read the paper, and it's just not there.

Or are the sharp images only possible when sampling from the trained PixelSNAIL prior after stage 2?
Even then, it seems to me that the increased sharpness cannot come from the VAE alone.

What do you guys think? Am I missing something?
