Background: Training and sampling in Flows
Training and computing p(x): f: x -> z. Feed x into f to get z, then maximize log_prob(x) = log_prob(z) + log|det(df(x)/dx)|, where log_prob(z) is usually the log-likelihood under a standard Normal distribution N(z; mean=0, std=1).
Sampling: f^-1: z -> x. Sample z from the prior distribution (i.e., the Normal distribution), then feed it into the inverse function to get x.
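The two directions above can be sketched with a toy flow. This is only an illustration under assumptions: `ElementwiseAffineFlow`, `shift`, and `log_scale` are hypothetical names, and a per-dimension affine map stands in for the deep coupling-layer networks used in practice. It shows the change-of-variables log-likelihood and sampling through the inverse.

```python
import numpy as np

class ElementwiseAffineFlow:
    """Hypothetical toy flow: z = f(x) = (x - shift) * exp(-log_scale), per dimension."""

    def __init__(self, shift, log_scale):
        self.shift = np.asarray(shift, dtype=float)
        self.log_scale = np.asarray(log_scale, dtype=float)

    def forward(self, x):
        # f: x -> z, plus log|det(df/dx)| for each row of x.
        z = (x - self.shift) * np.exp(-self.log_scale)
        # The Jacobian is diagonal with entries exp(-log_scale).
        log_det = -np.sum(self.log_scale) * np.ones(x.shape[0])
        return z, log_det

    def inverse(self, z):
        # f^-1: z -> x
        return z * np.exp(self.log_scale) + self.shift

    def log_prob(self, x):
        # log p(x) = log N(z; 0, 1) + log|det(df/dx)|
        z, log_det = self.forward(x)
        log_pz = -0.5 * np.sum(z ** 2 + np.log(2.0 * np.pi), axis=1)
        return log_pz + log_det

    def sample(self, n, rng):
        # Draw z from the standard Normal prior, then invert the flow.
        z = rng.standard_normal((n, self.shift.shape[0]))
        return self.inverse(z)

flow = ElementwiseAffineFlow(shift=[1.0, -2.0], log_scale=[0.5, -0.3])
rng = np.random.default_rng(0)
x = flow.sample(4, rng)     # sampling direction
log_px = flow.log_prob(x)   # density direction; MLE would maximize its mean
```

In MLE training, `shift` and `log_scale` (or the network weights in a real flow) would be updated to increase the mean of `log_prob` over the training data.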
TL;DR
Normalizing flows can compute exact p(x) and we can train them by MLE (i.e., assign high density to training data).
However, most of the time we fail to use p(x) to distinguish out-of-distribution (OOD) data.
MLE training has limited influence on OOD detection: models are only trained to assign high density to training data, not to assign low density to OOD data.
I.e., flows are trained to generate data, and this objective does not necessarily require learning semantics. Learning pixel correlations (i.e., nearby pixels have similar colors) is already enough to generate high-quality images.
Whether data is in- or out-of-distribution is mainly determined by its semantics (i.e., label y), not by its pixel correlations.
The inductive bias of normalizing flows (the paper mainly studies coupling-layer-based networks): they learn pixel correlations instead of semantics, which is why flows fail to detect OOD data.
If given image embeddings from a network pretrained on images and labels, flows can successfully detect OOD data from those embeddings.
They study the intermediate outputs of the flows' affine coupling layers under different masks (e.g., checkerboard mask, horizontal mask, and their proposed cycle mask), and find that with the first two masks, flows can still learn to predict pixels from their neighbors, even in intermediate layers. With the proposed cycle mask, flows cannot easily predict pixels from their neighbors, and thus achieve successful OOD detection. However, since neighboring pixels are not easily available for prediction, the generation quality is poor. (A tradeoff between OOD detection and high-quality image generation?)
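To make the masking idea concrete, here is a minimal sketch of a masked affine coupling layer with the checkerboard and horizontal masks. Everything in it is an assumption for illustration: `toy_conditioner` is a hypothetical stand-in for the learned scale/shift networks (it just averages the visible 4-neighbors, with wrap-around at the borders), and the proposed cycle mask is not implemented because its details are not given in these notes.

```python
import numpy as np

def checkerboard_mask(h, w):
    # 1 where (row + col) is even, 0 elsewhere.
    ii, jj = np.indices((h, w))
    return ((ii + jj) % 2 == 0).astype(float)

def horizontal_mask(h, w):
    # 1 on the top half of the image, 0 on the bottom half.
    mask = np.zeros((h, w))
    mask[: h // 2, :] = 1.0
    return mask

def toy_conditioner(x_visible, mask):
    # Hypothetical conditioner: average of the visible 4-neighbors (wrap-around).
    shifts = [(1, 0), (-1, 0), (1, 1), (-1, 1)]
    num = sum(np.roll(x_visible, s, axis=a) for s, a in shifts)
    den = sum(np.roll(mask, s, axis=a) for s, a in shifts)
    avg = num / np.maximum(den, 1.0)
    log_scale = 0.1 * np.tanh(avg)
    shift = -avg  # subtract the neighbor-based prediction
    return log_scale, shift

def coupling_forward(x, mask):
    # Visible pixels pass through; hidden pixels get an affine transform
    # whose parameters depend only on the visible pixels.
    x_visible = mask * x
    log_scale, shift = toy_conditioner(x_visible, mask)
    y = x_visible + (1.0 - mask) * (x * np.exp(log_scale) + shift)
    log_det = np.sum((1.0 - mask) * log_scale)
    return y, log_det

def coupling_inverse(y, mask):
    # The visible part of y equals the visible part of x, so the same
    # parameters can be recomputed and the transform undone.
    y_visible = mask * y
    log_scale, shift = toy_conditioner(y_visible, mask)
    return y_visible + (1.0 - mask) * ((y - shift) * np.exp(-log_scale))
```

With the checkerboard mask, every hidden pixel has all four neighbors visible, so a conditioner like this can predict it almost directly from local context. That is the kind of pixel-correlation shortcut the notes describe.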