@datumbox datumbox commented Oct 14, 2022

Our RandomShortestSize implementation in the references was developed with Object Detection in mind. Videos require a slight variation (see the reference implementations for detection vs. video).

This PR extends the transform in a backward-compatible (BC) way so that it supports both. In particular, making max_size optional lets us reparameterize the transform for videos as follows:

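import math

import torch
from torchvision.prototype.transforms import RandomShortestSize

# Video input; the trailing dims are (T, C, H, W) = (11, 3, 450, 800)
x = torch.randn(7, 11, 3, 450, 800)
# Only min_size is given, so max_size keeps its default of None
transform = RandomShortestSize(list(range(256, 320 + 1)))
z = transform(x)
print(z.shape)

# Recover the short-side size that the transform sampled
size = min(z.shape[-2:])

# Recompute the expected output size the way the video reference does
h, w = x.shape[-2:]
if w < h:
    new_h = int(math.floor((float(h) / w) * size))
    new_w = size
else:
    new_h = size
    new_w = int(math.floor((float(w) / h) * size))

print(new_h, new_w)

assert (new_h, new_w) == tuple(z.shape[-2:])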

Though the names min_size and max_size are confusing (better names would have been shortside_min_size_range and longside_max_size), their semantics align with the size and max_size arguments of F.resize(). There, the default value of max_size (which again applies to the longest edge) is None, so this transform uses the same semantics and default values as other places in the API.
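To make the two flavours concrete, here is a minimal pure-Python sketch of the size computation described above. It is not the torchvision implementation: the helper name `sketch_output_size` and its parameters are hypothetical, and it uses integer floor division, so rounding may differ slightly from what the library does. With `max_size=None` it behaves like the video flavour (only the short side is constrained); with a `max_size` it also caps the long side, as in the detection flavour.

```python
import random
from typing import Optional, Tuple


def sketch_output_size(
    height: int, width: int, short_side: int, max_size: Optional[int] = None
) -> Tuple[int, int]:
    """Hypothetical helper: returns (new_height, new_width).

    The short edge is resized to `short_side`, keeping the aspect ratio.
    If `max_size` is given and the long edge would exceed it, the long
    edge is set to `max_size` and the short edge is shrunk to match.
    """
    short, long = min(height, width), max(height, width)
    new_short = short_side
    new_long = long * short_side // short  # floor, integer arithmetic
    if max_size is not None and new_long > max_size:
        new_short = short * max_size // long
        new_long = max_size
    # Map (short, long) back to (height, width)
    if height <= width:
        return new_short, new_long
    return new_long, new_short


# Video flavour: sample the short side from a range, no cap on the long side
sampled = random.choice(range(256, 320 + 1))
print(sketch_output_size(450, 800, sampled))

# Detection flavour: same short side, but the long side is capped
print(sketch_output_size(450, 800, 256, max_size=400))
```

Keeping `max_size` optional with a default of `None` means existing callers that pass both arguments see no change, which is what makes the extension backward compatible.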

@vfdev-5 vfdev-5 left a comment

OK to me, thanks @datumbox !

@datumbox datumbox merged commit 88b6b93 into pytorch:main Oct 14, 2022
@datumbox datumbox deleted the prototype/extend_random_shortest branch October 14, 2022 11:55
facebook-github-bot pushed a commit that referenced this pull request Oct 17, 2022
…r of the augmentation (#6770)

Summary:
* Extend RandomShortestSize to support Video specific flavour of the augmentation

* Adding a test.

* Apply changes from code review

Reviewed By: NicolasHug

Differential Revision: D40427454

fbshipit-source-id: ecd2ec17b047449c043b4c2f45b762c722cc5e04