Hi,
I think there is an issue in the SSF model implementation that prevents the model from reaching the expected R-D results.
In the code,
CompressAI/compressai/models/video/google.py
Lines 354 to 371 in 743680b
the estimated flow is passed directly to the F.grid_sample function.
This would be fine if the grid were in absolute (pixel) coordinates, but the grid is actually in relative (normalized) coordinates, where the top-left corner is [-1, -1] and the bottom-right corner is [1, 1].
This causes problematic behavior: during training the flow is estimated on (256, 256) patches, but at test time it is estimated on (1152, 1920) frames (including padding), so the same flow value corresponds to a much larger displacement than expected.
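To make the size dependence concrete, here is a minimal sketch of the arithmetic. It assumes the convention of F.grid_sample with align_corners=False, where the interval [-1, 1] spans the full image extent; the helper name and the flow value 0.05 are hypothetical and chosen only for illustration.

```python
def normalized_to_pixels(d_norm, size, align_corners=False):
    """Convert a displacement in normalized grid units to pixels (hypothetical helper)."""
    # With align_corners=False the interval [-1, 1] spans `size` pixels;
    # with align_corners=True it spans `size - 1` pixels.
    span = (size - 1) if align_corners else size
    return d_norm * span / 2.0


flow_value = 0.05  # hypothetical network output, identical at train and test time

train_px = normalized_to_pixels(flow_value, 256)    # 256x256 training patch
test_px_w = normalized_to_pixels(flow_value, 1920)  # test frame width
test_px_h = normalized_to_pixels(flow_value, 1152)  # padded test frame height

print(f"train: {train_px:.1f} px")
print(f"test (width): {test_px_w:.1f} px, test (height): {test_px_h:.1f} px")
```

Under these assumptions the same output value moves 7.5 times more pixels horizontally and 4.5 times more vertically at test time than it did during training.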
A quick workaround is to scale the flow according to the change between training and test sizes, like:
sybahk@df138a9
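The idea behind such a weighting can be sketched as follows. This is not the content of the linked commit, only an illustration under the assumption that the normalized flow should be multiplied by the ratio of training size to test size on each axis; the function name is hypothetical.

```python
def flow_scale_factors(train_hw, test_hw):
    """Per-axis factors (y, x) that keep the pixel displacement unchanged (hypothetical helper)."""
    train_h, train_w = train_hw
    test_h, test_w = test_hw
    return train_h / test_h, train_w / test_w


scale_y, scale_x = flow_scale_factors((256, 256), (1152, 1920))

# Sanity check: a normalized flow of 0.05 covers the same number of pixels
# on a 256-wide training patch and, after scaling, on a 1920-wide test frame.
d = 0.05
px_train_x = d * 256 / 2
px_test_x = d * scale_x * 1920 / 2

print(f"scale_y={scale_y:.4f}, scale_x={scale_x:.4f}")
print(f"train: {px_train_x:.2f} px, test after scaling: {px_test_x:.2f} px")
```

In a tensor implementation these two factors would multiply the vertical and horizontal flow channels before they are added to the sampling grid.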
I then ran the evaluation with this command:
python3 -m compressai.utils.video.eval_model pretrained $UVG_PATH outputs -a ssf2020 -q 1,2,3,4 -o ssf2020-mse-ans-vimeo-modified.json
sybahk@b9f5610
With this workaround, the model's R-D curve is much higher than before and shows results similar to the authors'.
python3 -m compressai.utils.video.plot -f results/video/UVG-1080p/ssf* -o outputs/fig.png
(ssf2020-mse is the one that used the workaround.)
With a pretrained model, applying the workaround is fine, but when training a new model, I think we should take the training-time input size into account, as DCVC does:
https://github.com/microsoft/DCVC/blob/4df94295c8dbe0a26456582d1a0eddb3465f1597/DCVC-TCM/src/models/video_net.py#L83-L94