
Implement vectorized base image augmentation layer w/ Grayscale() example #1331

Closed
LukeWood wants to merge 19 commits from the kplperf branch

Conversation

@LukeWood (Contributor) commented Jan 30, 2023

We have a ton of performance issues with KPLs because auto-vectorization does not work with tf.image ops.

Basically my proposed approach is as follows:

  • replace KPLs that can be fully vectorized with VectorizedBaseImageAugmentation subclasses (a rough sketch of what such a subclass could look like follows the caveats below)

Fixes: #581


Caveats of this approach:

  • a bit less flexibility (e.g. cases where the image augmentation can be vectorized but the bounding box augmentation cannot)
  • loss of ragged support
  • having two base layers is perhaps a bit convoluted implementation-wise
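
As a rough sketch of what one of these subclasses could look like (the import path and the augment_images/augment_labels hook names are assumptions for illustration, not necessarily the final API):

```python
import tensorflow as tf
from keras_cv.layers import VectorizedBaseImageAugmentationLayer  # assumed import path

class Grayscale(VectorizedBaseImageAugmentationLayer):
    """Converts a whole batch of RGB images to grayscale in one batched op."""

    def __init__(self, output_channels=3, **kwargs):
        super().__init__(**kwargs)
        self.output_channels = output_channels

    def augment_images(self, images, transformations=None, **kwargs):
        # tf.image.rgb_to_grayscale already accepts a batch dimension, so the
        # whole batch is processed without a per-image loop or pfor fallback.
        grayscale = tf.image.rgb_to_grayscale(images)
        if self.output_channels == 3:
            grayscale = tf.repeat(grayscale, repeats=3, axis=-1)
        return grayscale

    def augment_labels(self, labels, transformations=None, **kwargs):
        # Labels are untouched by a color-space change.
        return labels
```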

@bhack (Contributor) commented Jan 30, 2023

What about the product decision at tensorflow/tensorflow#55639 (comment)?

@LukeWood (Contributor Author)

What about the product decision at tensorflow/tensorflow#55639 (comment)?

Curious about @fchollet's thoughts on this. It's impossible to vectorize this one in TensorFlow while keeping transformations independent from one another. Do you think a slowdown is acceptable in that case?

@bhack (Contributor) commented Jan 30, 2023

The bottleneck is with any API that doesn't have a batch dimension on the parameter you want to randomize or change per element.

So it is as if you need to loop over the batch, calling the same op with a different parameter each time.

I don't know whether this could change in the near future with the new XLA/MLIR bridge design and its generated kernels, but right now it does not work with vectorized_map, and I think not with the current bridge either.

Edit (our old thread about tf.image.* lowerings):
https://discuss.tensorflow.org/t/lowering-for-tf-raw-ops-imageprojectivetransform/8462
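
As an illustration of that bottleneck (tf.image.adjust_contrast is only a stand-in example here; any op whose randomized parameter is a per-call scalar behaves the same way):

```python
import tensorflow as tf

images = tf.random.uniform((8, 64, 64, 3))

# Fine: one contrast factor for the whole batch, the op is already batched.
same_factor = tf.image.adjust_contrast(images, contrast_factor=1.2)

# The bottleneck: adjust_contrast has no batch dimension on contrast_factor,
# so a different factor per image forces a per-element loop over the batch.
factors = tf.random.uniform((8,), 0.5, 1.5)
per_image = tf.map_fn(
    lambda args: tf.image.adjust_contrast(args[0], args[1]),
    (images, factors),
    fn_output_signature=tf.float32,
)
```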

@bhack (Contributor) commented Jan 31, 2023

tensorflow/tensorflow#55335 (comment)

@jpienaar Can ops like this one be fused in a loop without breaking the raw ops API?

@LukeWood changed the title from "(Draft) Implement vectorized base image augmentation layer" to "Implement vectorized base image augmentation layer w/ Grayscale() example" on Feb 7, 2023
@LukeWood (Contributor Author) commented Feb 7, 2023

@jbischof @ianstenbit this is ready for a real review now. We can incrementally migrate layers to vectorized layers using this approach.

@bhack (Contributor) commented Feb 7, 2023

Before we merge, I hope we can clarify the point about the product decision at:

#1256 (comment)

I have been talking about the performance overhead of this design with the tf.image API since the first keras-cv commits.

@LukeWood (Contributor Author) commented Feb 7, 2023

Before we merge, I hope we can clarify the point about the product decision at:

#1256 (comment)

I have been talking about the performance overhead of this design with the tf.image API since the first keras-cv commits.

What are we hoping to have clarified? This change solves the Grayscale performance issue with no regression in functionality. Item-wise augmentation is still deemed important.

@LukeWood (Contributor Author) commented Feb 7, 2023

/gcbrun

@LukeWood (Contributor Author) commented Feb 7, 2023

/gcbrun

@bhack (Contributor) commented Feb 7, 2023

What are we hoping to have clarified? This change solves the Grayscale performance issue with no regression in functionality.

It is not only about Grayscale, as this is related to many ops. A list (not updated since August) is at #291 (comment)

@LukeWood (Contributor Author) commented Feb 8, 2023

What are we hoping to have clarified? This change solves the Grayscale performance issue with no regression in functionality.

It is not only about Grayscale, as this is related to many ops. A list (not updated since August) is at #291 (comment)

My goal in this PR is to provide an abstraction that lets us optimize the performance of the layers that are not being converted by pfor: #291 (comment)

Does that answer the question?
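
For context, the pattern that fails is roughly the following (a minimal sketch; adjust_contrast is only a stand-in for whichever op lacks a pfor converter):

```python
import tensorflow as tf

images = tf.random.uniform((8, 64, 64, 3))

def augment_one(image):
    # Independent random parameter per image, as the current KPLs do.
    factor = tf.random.uniform((), 0.5, 1.5)
    return tf.image.adjust_contrast(image, factor)

# The existing base layer relies on auto-vectorization (pfor) of a per-image
# function. When an op inside has no registered pfor converter, TensorFlow
# logs "Using a while_loop for converting ..." and silently falls back to a
# per-image loop, which is the slowdown benchmarked in #291.
augmented = tf.vectorized_map(augment_one, images)
```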

@LukeWood (Contributor Author) commented Feb 8, 2023

/gcbrun

@bhack (Contributor) commented Feb 8, 2023

Does that answer the question?

This was the same topic one year ago, but we didn't address it even though we were still in the design phase:
#455 (comment)

So what has changed in one year, given that we chose not to address this during the design phase despite all the explicit feedback and benchmarks I sent?

not being converted by pfor

As you can read in many places, and also in that comment, that is not the only cause of pfor conversion failures/fallbacks; we also have missing converters.

As an extra note:
after all these months it seems we don't have tf.image API owners, and probably no pfor API owner either, as all the related tickets have had no further comments or actions.

@LukeWood (Contributor Author) commented Feb 8, 2023

Does that answer the question?

This was the same topic one year ago, but we didn't address it even though we were still in the design phase: #455 (comment)

So what has changed in one year, given that we chose not to address this during the design phase despite all the explicit feedback and benchmarks I sent?

not being converted by pfor

As you can read in many places, and also in that comment, that is not the only cause of pfor conversion failures/fallbacks; we also have missing converters.

As an extra note: after all these months it seems we don't have tf.image API owners, and probably no pfor API owner either, as all the related tickets have had no further comments or actions.

We do care about performance, and as you say there is no tf.image API owner or pfor API owner, so instead of waiting for those issues to be fixed we'd like to offer vectorized implementations of these layers in the meantime. Does that answer your question?

And to be clear, it was never that we didn't care about performance; it was simply that getting RandAugment working, getting OD augmentations working, and other product goals took priority. Now that those are working, I'm attempting to revisit these layers and optimize them without major regressions in functionality.

@bhack (Contributor) commented Feb 8, 2023

it was never that we didn't care about performance

Honestly, for a year the impression was that we did not care about performance. I never collected any comment acknowledging the problem, even though it was reported and benchmarked in multiple tickets, or saying that we just needed to postpone the design work due to resource constraints.

In any case, better late than never; we will see the residual fallbacks after this migration.

About the PR:
Do we need to subclass tf.keras.__internal__.layers.BaseRandomLayer for every new policy?

@ianstenbit (Contributor) left a comment

Thanks Luke!

Lots of comments because this is a foundational piece and I want to make sure we get it really right.

inputs = {IMAGES: inputs}

metadata[BATCHED] = inputs["images"].shape.rank == 4
if inputs["images"].shape.rank == 3:
@LukeWood (Contributor Author) replied:

@ianstenbit see here: we expand dims if it's operating in unbatched mode.
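
A minimal sketch of that batched/unbatched handling (illustrative only; the helper name ensure_batched is hypothetical, not the PR's code):

```python
import tensorflow as tf

def ensure_batched(images):
    # Remember whether the caller passed a batch; a rank-3 input (a single
    # image) gets a leading batch dimension so the layer only ever sees
    # rank-4 tensors internally.
    batched = images.shape.rank == 4
    if images.shape.rank == 3:
        images = tf.expand_dims(images, axis=0)
    return images, batched

images = tf.random.uniform((32, 32, 3))          # unbatched input
images, was_batched = ensure_batched(images)     # -> shape (1, 32, 32, 3)
# ... run the vectorized augmentation on `images` ...
if not was_batched:
    images = tf.squeeze(images, axis=0)          # restore the original rank
```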

@LukeWood (Contributor Author) commented Feb 8, 2023

@ianstenbit all comments addressed.

Closed in favor of #1373

@LukeWood closed this Feb 8, 2023
@LukeWood (Contributor Author) commented Feb 8, 2023

it was never that we didn't care about performance

Honestly, for a year the impression was that we did not care about performance. I never collected any comment acknowledging the problem, even though it was reported and benchmarked in multiple tickets, or saying that we just needed to postpone the design work due to resource constraints.

In any case, better late than never; we will see the residual fallbacks after this migration.

About the PR: Do we need to subclass tf.keras.__internal__.layers.BaseRandomLayer for every new policy?

Honestly, for a year the impression was that we did not care about performance. I never collected any comment acknowledging the problem, even though it was reported and benchmarked in multiple tickets, or saying that we just needed to postpone the design work due to resource constraints.

Resource constraints are real, and we were hoping pfor would either be:

  • fixed
  • sufficiently fast

Neither turned out to be the case, so we are now prioritizing this. Please keep in mind that resource constraints are a real issue; they don't mean other issues aren't important, they simply mean we have other things that are more pressing at the time.

@bhack (Contributor) commented Feb 8, 2023

Neither turned out to be the case, so we are now prioritizing this. Please keep in mind that resource constraints are a real issue; they don't mean other issues aren't important, they simply mean we have other things that are more pressing at the time.

Generally I agree, but it is a little different when we were still in the design phase and there was real-time feedback about how this was going to fail.
Otherwise it will be hard to interact clearly and stay on the same page (mainly on PRs) if priority/attention is given to face-to-face meetings.

pfor may never be fixable if the underlying API design of the ops we want to use in this library is not batch-ready.

I think our approach was really outside pfor's intended design from the early days.

The missing converters, on the other hand, could still be fixed.

We cared about something (the within-batch randomness policy) with some theoretical validity, but without empirical verification of the faster convergence versus the per-epoch computational overhead.

Also, as we have seen in another recent PR, torchvision does not randomize within the batch, so many reference models that we also train/replicate, and the related reference papers, were not trained with the within-batch policy at all.

@LukeWood (Contributor Author) commented Feb 9, 2023

We cared about something (the within-batch randomness policy) with some theoretical validity, but without empirical verification of the faster convergence versus the per-epoch computational overhead.

This already matters in contrastive learning and in any case where samples are compared within a batch. This is a case where a silent error could really waste a user's time someday, and we should avoid this if possible.

In the current structure we will likely be able to have both performance and numerical correctness. I think this is the best of all worlds.
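
A toy illustration of why per-sample randomness can still be vectorized (brightness is used as an assumed example; the actual vectorized layers may differ):

```python
import tensorflow as tf

images = tf.random.uniform((4, 32, 32, 3))

# One delta shared by the whole batch: cheap, but every sample receives the
# same transform, so samples compared within a contrastive batch are
# augmented identically.
shared = tf.image.adjust_brightness(images, delta=0.2)

# Independent delta per sample, still a single batched expression:
# brightness is additive, so broadcasting a (batch, 1, 1, 1) delta keeps
# per-sample randomness without any per-image loop.
deltas = tf.random.uniform((4, 1, 1, 1), minval=-0.3, maxval=0.3)
independent = tf.clip_by_value(images + deltas, 0.0, 1.0)
```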

@LukeWood deleted the kplperf branch February 9, 2023
@bhack (Contributor) commented Feb 9, 2023

This already matters in contrastive learning and in any case where samples are compared within a batch. This is a case where a silent error could really waste a user's time someday, and we should avoid this if possible.

In the current structure we will likely be able to have both performance and numerical correctness. I think this is the best of all worlds.

Yes but:

Linked issue: BUG: RandomContrast slows down training 6-fold