RandomResizedCrop API Change #676

Closed · LukeWood opened this issue Aug 8, 2022 · 16 comments · Fixed by #738
Comments

@LukeWood (Contributor) commented Aug 8, 2022:

Currently, RandomResizedCrop is configured with a crop_area_factor and an aspect_ratio_factor.

The API should be updated to take a target_size, a zoom_factor, and an aspect_ratio_factor. At augmentation time, a value is drawn randomly from each of the zoom_factor and aspect_ratio_factor distributions. A crop size is computed by multiplying each dimension of target_size by the value drawn from zoom_factor. Next, the width and height of the crop size are distorted by the value drawn from aspect_ratio_factor. A crop of this size is taken from the image and finally resized to target_size (a hypothetical usage sketch follows the edge cases below).

Some edge cases:

  • when the crop size is larger than the image, we want to still respect aspect ratio (random crop uses smart resize)
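
For illustration, a hypothetical usage of the proposed API. The argument names are the ones proposed in this issue; the idea that each factor is given as a (min, max) range, and the specific ranges shown, are assumptions for the sketch, not a confirmed signature.

import keras_cv

# Hypothetical usage of the proposed API (names from this issue, ranges illustrative).
# images: a batch of images, e.g. a tf.Tensor of shape (batch, H, W, 3).
augmenter = keras_cv.layers.RandomResizedCrop(
    target_size=(224, 224),
    zoom_factor=(0.8, 1.2),              # 1.0 would mean no zoom
    aspect_ratio_factor=(3 / 4, 4 / 3),  # 1.0 would mean no distortion
)
augmented_images = augmenter(images)
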
@AdityaKane2001 (Contributor) commented Aug 10, 2022:

@LukeWood

My understanding is as follows:

  • Assume an input image of 300x400 (height x width).
  • First we sample a zoom factor (would be <1 in a normal case?) and take a crop of that size. Assuming zoom factor is 0.5, we will take a crop of 150x200.
  • Then a width and height are calculated and the image is resized to this new distorted size. Assuming the aspect ratio to be 0.5 (height : width), the new image will be 150x300 (or 100x200?).
  • Finally resize to a target size, say 224x224.

Is this procedure correct? If this is incorrect, could you please jot down the procedure in the same way as above to avoid any confusion?

/cc @sayakpaul

@LukeWood (Contributor, Author):

> First we sample a zoom factor (would be <1 in a normal case?) and take a crop of that size. Assuming zoom factor is 0.5, we will take a crop of 150x200.

It can be either less than or greater than one. A 150x200 crop zoomed by 2 should crop a 75x100 region, so the result is zoomed to double the size.

> Then a width and height are calculated and the image is resized to this new distorted size. Assuming the aspect ratio to be 0.5 (height : width), the new image will be 150x300 (or 100x200?).

I guess to get the desired effect, zoom_factor should be inverted before multiplying, so a zoom factor of 2 becomes 0.5 and vice versa. This means a zoom factor of 0.5 zooms out and 2.0 zooms in.

> Finally resize to a target size, say 224x224.

Yep! zoom_factor 1.0 and aspect_ratio 1.0 should basically be the same as a plain random crop.

Sound good?

@AdityaKane2001 (Contributor) commented Aug 11, 2022:

Okay. Zoom factor is almost clear to me now.

Assuming we use a zoom factor of 0.5 (which will zoom out the image), we get a 600x800 image, right? What would the image look like? Will the image repeat itself, or is it just a normal resize?

@LukeWood (Contributor, Author):

You will get a 600x800 crop, but then it will be resized BACK to 224x224. So zoom is computed based on crop size.

Imagine a 1000x1000 image with a crop size of 200x200. If you sample a zoom factor of 0.5, your crop size will be 400x400, and the crop will then be resized back to 200x200.

@LukeWood (Contributor, Author) commented Aug 12, 2022:

An exact process:

The layer takes target_size, zoom_factor, and aspect_ratio_factor.

The process starts by sampling a value from zoom_factor and aspect_ratio_factor:

target_size = (50, 50)
zoom = zoom_factor()  # let's use 0.5
aspect_ratio = aspect_ratio_factor()  # let's use 9/10

Next, the crop size is calculated using target_size and zoom:

crop_size = target_size / zoom # (100, 100)

Next, aspect_ratio is applied:

crop_size = distort_aspect_ratio(crop_size, aspect_ratio)
#  (100 / sqrt(9/10), 100 * sqrt(9/10))
#  (105, 95)  # rounded

Finally, take a crop of crop_size and resize it back to target_size:

# we compute the crop location (y, x) however needed
crop = tf.image.crop_to_bounding_box(image, y, x, crop_size[0], crop_size[1])
result = tf.image.resize(crop, target_size)
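
Putting the steps above together, a minimal runnable sketch using standard TensorFlow ops. The helper name and the random-offset logic are assumptions for illustration, not the layer's actual internals, and it assumes the crop fits inside the image (the larger-than-image edge case is discussed further down this thread).

import math
import tensorflow as tf

def random_resized_crop(image, target_size=(50, 50), zoom=0.5, aspect_ratio=9 / 10):
    # image: a single HWC tensor. zoom < 1 zooms out (larger crop), zoom > 1 zooms in.
    target_h, target_w = target_size
    crop_h = int(round(target_h / zoom / math.sqrt(aspect_ratio)))
    crop_w = int(round(target_w / zoom * math.sqrt(aspect_ratio)))
    img_h = tf.shape(image)[0]
    img_w = tf.shape(image)[1]
    # Random crop location; assumes crop_h <= img_h and crop_w <= img_w.
    offset_h = tf.random.uniform((), 0, img_h - crop_h + 1, dtype=tf.int32)
    offset_w = tf.random.uniform((), 0, img_w - crop_w + 1, dtype=tf.int32)
    crop = tf.image.crop_to_bounding_box(image, offset_h, offset_w, crop_h, crop_w)
    return tf.image.resize(crop, target_size)
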

@AdityaKane2001 (Contributor):

@LukeWood

Thanks for this, it really helps!

I have two concerns regarding this approach:

  1. I understand that this is quite close to the current implementation in terms of the end result. The gist is the same: we crop a random area and resize it to the given target_size. Given this, I am not sure why we need to include target_size in the calculation of crop_size. target_size should ideally have no influence over crop_size.
  2. The zoom_factor makes it a bit unintuitive IMO. zoom_factor being a positive float does not signify something interpretable. In the current API, crop_area_factor signifies the part of the total area that is going to be cropped.

@LukeWood (Contributor, Author):

> Thanks for this, it really helps!
>
> I have two concerns regarding this approach:
>
> I understand that this is quite close to the current implementation in terms of the end result. The gist is the same: we crop a random area and resize it to the given target_size. Given this, I am not sure why we need to include target_size in the calculation of crop_size. target_size should ideally have no influence over crop_size.

Any reason why not? If the goal of the layer is to take zoomed crops with a level of distortion, it should be easy to tune the distortions relative to the result.

> The zoom_factor makes it a bit unintuitive IMO. zoom_factor being a positive float does not signify something interpretable. In the current API, crop_area_factor signifies the part of the total area that is going to be cropped.

The nice thing about zoom_factor is that you can pass something like 1.0 and reasonably conclude that it applies NO zoom, and you can tune it up or down in incremental amounts. With crop_area_factor there is NO way to ensure that you will have no zoom, which makes it incredibly hard to reason about the effect the layer will have on your preprocessing pipeline.

@LukeWood (Contributor, Author):

@martin-gorner has more thoughts on this too.

@martin-gorner:

Thoughts:

  • Please make sure the min and max zoom values have the same meaning as in Model Garden's implementation, where they are called aug_scale_min and aug_scale_max.
  • An edge case not yet considered is what happens when the computed crop_size ends up being larger, in at least one dimension, than the image. For example, what if the image is 1024x512 pixels and the computed crop_size is 700x700? One possibility is to add black borders; another is to use the layers.Resizing(crop_to_aspect_ratio=True) algorithm, i.e. cut the biggest part of the image that has the same aspect ratio as crop_size (see the sketch at the end of this comment). This would not quite respect the zoom factor, in exchange for not having black bars. (Care must be taken to retain location randomness though: even if there is only one way to squeeze a maximally fitting 700x700 square along the 512px dimension, there are multiple possible locations along the 1024px dimension of the original image!)

I think I prefer the solution without black bars for two reasons:

  • black bars can introduce unwanted training effects: if a dataset has one class represented more often by elongated images, which are more likely to produce black bars when RandomResizeCropped, the model can start detecting black pixels as a proxy for this class.
  • RandomResizedCrop should become RandomCrop with no zooming and no distortions, and RandomCrop never produces black bars (but please check whether RandomCrop respects location randomness in this edge case?)
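
A minimal sketch of the no-black-bars fallback described above, assuming the crop is shrunk to the largest size with the same aspect ratio that still fits inside the image (the helper name is illustrative, not the actual layer code):

def fit_crop_to_image(crop_h, crop_w, img_h, img_w):
    # Largest scale at which the crop still fits; never grow the crop.
    scale = min(img_h / crop_h, img_w / crop_w, 1.0)
    return int(crop_h * scale), int(crop_w * scale)

# Example from above: fit_crop_to_image(700, 700, 512, 1024) -> (512, 512).
# The 512x512 crop can then still be placed at a random offset along the 1024px axis.
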

@sayakpaul (Contributor):

> black bars can introduce unwanted training effects: if a dataset has one class represented more often by elongated images, which are more likely to produce black bars when RandomResizeCropped, the model can start detecting black pixels as a proxy for this class.

I like this thought experiment. Tremendous one.

@LukeWood (Contributor, Author):

@AdityaKane2001 is this clear now?

@AdityaKane2001 (Contributor):

@LukeWood Yup, got it. I'll create a PR for this over the weekend.

@AdityaKane2001 (Contributor):

@martin-gorner @LukeWood @sayakpaul

> An edge case not yet considered is what happens when the computed crop_size ends up being larger, in at least one dimension, than the image.

To avoid this case, I'll just clip the crop_size values to the image dimensions. It seems to have the same effect as layers.Resizing(crop_to_aspect_ratio=True). Moreover, it will intuitively handle the case where both dimensions of crop_size are larger than the image dimensions. WDYT?
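
A minimal sketch of the clipping described above (illustrative only; the helper name is an assumption):

import tensorflow as tf

def clip_crop_size(crop_h, crop_w, img_h, img_w):
    # Cap each crop dimension at the corresponding image dimension.
    return tf.minimum(crop_h, img_h), tf.minimum(crop_w, img_w)

Note that, unlike the aspect-ratio-preserving fit sketched earlier, clipping each dimension independently changes the crop's aspect ratio when only one dimension overflows.
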

@martin-gorner commented Aug 23, 2022 via email

@AdityaKane2001 (Contributor):

@martin-gorner

Understood.

@LukeWood @sayakpaul

I guess I will rewrite the implementation and drop tf.image.crop_and_resize, as it does not have this functionality.

@martin-gorner:

I think tf.image.crop_and_resize should work fine. You just have to pass it the correct crop boxes.
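
For illustration, a hedged sketch of how the crop boxes could be computed for tf.image.crop_and_resize; the helper name and the per-image tensor layout are assumptions, not the final implementation:

import tensorflow as tf

def crop_and_resize(images, offset_h, offset_w, crop_h, crop_w, target_size):
    # images: [batch, H, W, C]; offsets and crop sizes: float tensors of shape
    # [batch], in pixels. crop_and_resize expects normalized [y1, x1, y2, x2] boxes.
    img_h = tf.cast(tf.shape(images)[1], tf.float32)
    img_w = tf.cast(tf.shape(images)[2], tf.float32)
    boxes = tf.stack(
        [
            offset_h / img_h,
            offset_w / img_w,
            (offset_h + crop_h) / img_h,
            (offset_w + crop_w) / img_w,
        ],
        axis=-1,
    )
    box_indices = tf.range(tf.shape(images)[0])
    return tf.image.crop_and_resize(images, boxes, box_indices, target_size)

Note that boxes extending past the image edges are filled with extrapolation_value (0.0 by default), which would produce exactly the black borders discussed earlier in this thread.
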
