🚀 The feature

Specify the channel dimension for transforms.Normalize, transforms.functional.normalize, and transforms.functional_tensor.normalize, so that transforms.Normalize can normalize with the given mean and std along a user-specified channel dimension.

A possible solution is to add a new argument dim_channel to the classes and functions above:
# in transforms.functional_tensor.normalize
# build a view shape that is 1 everywhere except -1 at the channel dim,
# so mean/std broadcast along the requested axis
broadcast_ch_shape = [1 for _ in range(tensor.ndim)]
broadcast_ch_shape[dim_channel] = -1
if mean.ndim == 1:
    mean = mean.view(*broadcast_ch_shape)
if std.ndim == 1:
    std = std.view(*broadcast_ch_shape)
return tensor.sub_(mean).div_(std)
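For illustration, here is a self-contained sketch of how the proposed argument could behave; normalize_with_channel_dim and dim_channel are hypothetical names used for this sketch, not existing torchvision API:

import torch

def normalize_with_channel_dim(tensor, mean, std, dim_channel=-3, inplace=False):
    # Hypothetical helper mirroring the proposed dim_channel argument.
    if not inplace:
        tensor = tensor.clone()
    mean = torch.as_tensor(mean, dtype=tensor.dtype, device=tensor.device)
    std = torch.as_tensor(std, dtype=tensor.dtype, device=tensor.device)
    # Reshape mean/std so they broadcast along the chosen channel axis only.
    broadcast_ch_shape = [1] * tensor.ndim
    broadcast_ch_shape[dim_channel] = -1
    if mean.ndim == 1:
        mean = mean.view(*broadcast_ch_shape)
    if std.ndim == 1:
        std = std.view(*broadcast_ch_shape)
    return tensor.sub_(mean).div_(std)

# A [C, T, H, W] clip can then be normalized directly by pointing at dim 0.
clip = torch.rand(3, 16, 112, 112)
out = normalize_with_channel_dim(clip, [0.45, 0.45, 0.45], [0.225, 0.225, 0.225], dim_channel=0)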
Motivation, pitch
Recent torchvision releases deprecated transforms._transforms_video and extended many transforms to handle [..., H, W] shaped tensors. For video transformation this is a great improvement; however, transforms.Normalize is not among these transforms. Users therefore either resort to other transforms such as pytorchvideo.transforms.Normalize or normalize each frame separately (a sketch of this workaround is shown below). The requested feature would relieve this pain and make video transform pipelines cleaner.
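For concreteness, a minimal sketch of the current workarounds, assuming the clip is stored channels-first as [C, T, H, W] (the layout used by the deprecated video transforms); the shapes and statistics are illustrative only:

import torch
from torchvision.transforms import functional as F

clip = torch.rand(3, 16, 112, 112)  # [C, T, H, W]
mean, std = [0.45, 0.45, 0.45], [0.225, 0.225, 0.225]

# Workaround 1: normalize each frame separately, then restack along time.
per_frame = torch.stack(
    [F.normalize(clip[:, t], mean, std) for t in range(clip.shape[1])], dim=1
)

# Workaround 2: move channels to dim -3, reuse the image kernel, move them back.
permuted = F.normalize(clip.permute(1, 0, 2, 3), mean, std).permute(1, 0, 2, 3)

assert torch.allclose(per_frame, permuted)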
We need a bit more time to decide how we want to handle this. Right now we are in the middle of revamping the Transforms API to offer native support not only for Images but also Videos, Bounding Boxes, Masks, Labels etc. We plan to post soon a blogpost with the announcement but you can see some examples at #6753.
To make a long story short, the new Transforms API "stores" videos in a [..., T, C, H, W] format. This allows us to transform video frames very efficiently by reusing the existing image kernels. We also offer transforms to permute/transpose the dimensions. The new API uses Tensor Subclassing to store meta-data alongside the standard tensor (things like colour space, for example).
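As a small illustration of why that layout composes with the existing image kernels (assuming a torchvision version whose transforms.functional.normalize accepts [..., C, H, W] inputs):

import torch
from torchvision.transforms import functional as F

# With channels kept at dim -3, the image kernel broadcasts over the leading
# batch/time dims, so a whole batch of clips is normalized in one call.
batch = torch.rand(2, 16, 3, 112, 112)  # [B, T, C, H, W]
out = F.normalize(batch, [0.45, 0.45, 0.45], [0.225, 0.225, 0.225])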
Offering an extra parameter on the normalize kernel is possible but conflicts with the existing design. Having said that, in a few limited cases we have offered such a parameter to assist user migration. For example:
Given the above, shall we wait for the blogpost to be published (happy to give you a ping) and give you some time to review the design? After that, it would be great to get your input on whether the new API covers your needs or if you think we need enhancements. Let me know what you think. Thanks!
cc @vfdev-5 @datumbox