Improper normalization of the scores? #24
Comments
@marco-rudolph hmm, I think you are right that the max is improper for the multi-scale case if we cannot use any statistics. In practice, we probably know past statistics and can assume a max.
Hello guys. I am currently working on an open-source implementation of FastFlow, which does what @marco-rudolph says. FastFlow uses a ResNet as the encoder, and its maps come from layers [1, 2, 3]. Regarding the paper, it says
You can check my implementation; I'm making a big effort to provide an open-source solution, as I am new to normalizing flows. I am failing when I need to
I would really appreciate feedback.
@mjack3 they just mean that they sum log-likelihoods over each element in a latent vector, which is a simplification everyone makes, i.e. we assume that each dimension is independent of the others. Not sure why you square the likelihood.
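To make the independence assumption concrete: under a standard-normal base distribution, the per-pixel log-likelihood factorizes into a sum over channel dimensions of `-0.5 * z^2` terms, plus the log-determinant of the flow's Jacobian. A minimal numpy sketch (function and variable names are illustrative, not taken from either repo), assuming the flow outputs a latent tensor of shape `(B, C, H, W)`:

```python
import math
import numpy as np

def gaussian_log_likelihood(z, log_det_j):
    """Per-pixel log-likelihood under a standard-normal base distribution.

    z:         latent tensor of shape (B, C, H, W) produced by the flow.
    log_det_j: log |det Jacobian| contribution (scalar or broadcastable).

    Because the dimensions are assumed independent, log p(z) is simply a
    sum over the channel axis -- no squaring of the likelihood is needed.
    """
    C = z.shape[1]
    log_pz = -0.5 * np.sum(z ** 2, axis=1) - 0.5 * C * math.log(2 * math.pi)
    return log_pz + log_det_j
```

Note that `z ** 2` appears inside the Gaussian density itself; that is the "sum of squares" the likelihood is based on, which is different from squaring the resulting log-likelihood afterwards.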
The distribution variable holds the output of the 3 normalizing flows for layers 1, 2, and 3 of the ResNet. These are 3 tensors of shape (256, 64, 64), (512, 32, 32), and (1024, 16, 16). I square the log-likelihood similarly to what you or @marco-rudolph do. Am I wrong? If so, what could I do, or what reference could you suggest? I am still learning about normalizing flows. Thanks so much.
I think you are mixing up likelihoods, which are based on the sum of squares of the output, with the output itself.
@marco-rudolph I didn't experiment with it myself, but I'd check 3 cases: 1) removing the max, 2) taking the max from the train data, and 3) the current max from the test data. Maybe 1) << 3), but I'm not sure about 2) vs. 3), because we just want to normalize the scores to align them between scales.
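The three cases above can be sketched side by side. This is a hedged illustration (the function name and the `mode` strings are hypothetical, not the repo's API), assuming the per-layer score maps have already been upsampled to a common resolution:

```python
import numpy as np

def aggregate_scores(score_maps, mode="test_max", train_maxima=None):
    """Combine multi-scale score maps into one map.

    mode = "none":      sum the raw maps (case 1, no max normalization)
    mode = "train_max": divide each map by a max recorded on train data (case 2)
    mode = "test_max":  divide each map by its own max over the test set (case 3)
    """
    total = np.zeros_like(score_maps[0])
    for k, s in enumerate(score_maps):
        if mode == "none":
            total += s
        elif mode == "train_max":
            total += s / train_maxima[k]  # fixed constant, test-set independent
        else:  # "test_max"
            total += s / s.max()  # weighting changes with the test set
    return total
```

Only case 2 keeps the relative weighting of the layers fixed across different test sets, which is why it is the natural candidate if normalization is kept at all.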
@gudovskiy any news on that?
@alevangel well, you can replace the test-set max with the train-set max.
In train.py, you normalize the scores according to:
This normalization is fine as long as it is applied to only one map, since the normalization function is monotonically increasing. But when the maps from the different layers are added up, it makes no sense to me, since the relative weighting of the score maps in the aggregation (last line) depends on the test set, or more precisely on the maxima of the individual maps over the test set. Am I missing something here, or is this normalization improper?
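The concern can be made concrete with a toy example (values are made up for illustration): per-map normalization preserves the ranking within each map, but the sum of the normalized maps implicitly weights each layer by the reciprocal of its test-set maximum, a quantity that changes with the test set.

```python
import numpy as np

# Two per-layer score maps over the same (tiny) test set.
map_a = np.array([0.2, 0.4, 1.0])   # test-set max = 1.0
map_b = np.array([0.1, 0.3, 5.0])   # test-set max = 5.0

# Per-map normalization is monotonic, so rankings *within* one map survive:
norm_a = map_a / map_a.max()        # [0.20, 0.40, 1.0]
norm_b = map_b / map_b.max()        # [0.02, 0.06, 1.0]

# But in the sum, layer b is implicitly down-weighted by 5x, and that
# weighting would change if a different test set produced a different max.
combined = norm_a + norm_b
```

This is exactly why the discussion above suggests either dropping the max or recording it on the train data, where it becomes a fixed constant.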