
Update the API design on vectorization #1385

Open
bhack opened this issue Feb 9, 2023 · 7 comments
Labels
type:docs Improvements or additions to documentation

Comments


bhack commented Feb 9, 2023

https://github.com/keras-team/keras-cv/blob/master/.github/API_DESIGN.md#vectorization

We need to update that section of the documentation (#1331 (comment)).

/cc @LukeWood

@bhack bhack changed the title Update the API design on vectorizzation Update the API design on vectorization Feb 9, 2023
@LukeWood LukeWood self-assigned this Feb 9, 2023

LukeWood commented Feb 9, 2023

Thanks @bhack ! good catch!


bhack commented Feb 9, 2023

Also, it seems we were not really using the claimed within-batch augmentation, or were we? (#1382 (comment))


bhack commented Feb 9, 2023

Yes, I think it was basically wrong:

import tensorflow as tf

def test_map(input):
  print("Call to test")
  return tf.constant([1])

def test_vectorized_map(input):
  print("Call to test vectorized")
  return tf.constant([1])

input_shape = (5, 2, 2, 1)
batch = tf.random.uniform(input_shape)
input = {"input": batch}
c = tf.map_fn(test_map, input, fn_output_signature=tf.int32)
print(c)
print("-----------------")
c = tf.vectorized_map(test_vectorized_map, input)
print(c)
Call to test
Call to test
Call to test
Call to test
Call to test
tf.Tensor(
[[1]
 [1]
 [1]
 [1]
 [1]], shape=(5, 1), dtype=int32)
-----------------
Call to test vectorized
tf.Tensor(
[[1]
 [1]
 [1]
 [1]
 [1]], shape=(5, 1), dtype=int32)

sebastian-sz commented

@bhack I'm not sure. Isn't tf.vectorized_map running some tf.function internally, which makes the Python side effect (print) execute only once?

There were tf.function retracing issues, as I remember: #241
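The tracing hypothesis is easy to check in isolation: under tf.function, a Python side effect such as print runs only when the function is traced, not on every call. A minimal sketch, independent of vectorized_map:

```python
import tensorflow as tf

trace_count = []

@tf.function
def traced(x):
    # Python side effect: executes only at trace time,
    # not on every call of the compiled function
    trace_count.append(1)
    return x + 1

a = traced(tf.constant(1))
b = traced(tf.constant(2))
# Same input signature -> traced once, even though called twice
```

After both calls, `trace_count` holds a single entry, which matches the single print seen with vectorized_map above.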

Let's consider a different test: two identical images, wrapped in a batch and passed to the current implementation of RandomContrast.

If the factor() is sampled per batch: the entire batch is processed the same way, so both output images should be equal.
If the factor is sampled per image: each input will be distorted differently.

import tensorflow as tf
from keras_cv.layers import RandomContrast

image = tf.random.uniform((224, 224, 3))
batch = tf.stack([image, image])  # (2, 224, 224, 3)
tf.debugging.assert_near(batch[0], batch[1])  # OK

l = RandomContrast(0.75)
outputs = l(batch)
tf.debugging.assert_near(outputs[0], outputs[1]) # InvalidArgumentError

The second assert raises an exception, indicating that identical images in a single batch received different augmentations.


bhack commented Feb 9, 2023

OK, so probably it is internally wrapped in graph mode by tf.vectorized_map. We are never fully in eager mode with vectorized_map.

import tensorflow as tf

def test_map(input):
  return input["input"]+tf.random.uniform([])

def test_vectorized_map(input):
  return input["input"]+tf.random.uniform([])


input_shape = (5, 1,)
batch = tf.zeros(input_shape)
input = {"input": batch}
c = tf.map_fn(test_map, input, fn_output_signature=tf.float32)
print(c)
print("-----------------")
c = tf.vectorized_map(test_vectorized_map, input)
print(c)
tf.Tensor(
[[0.17567658]
 [0.90278184]
 [0.5280751 ]
 [0.8226601 ]
 [0.7671194 ]], shape=(5, 1), dtype=float32)
-----------------
tf.Tensor(
[[0.9258783 ]
 [0.39560175]
 [0.8852272 ]
 [0.09358847]
 [0.03888452]], shape=(5, 1), dtype=float32)
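The graph-mode wrapping can also be observed directly: tf.executing_eagerly() returns False inside the function that tf.vectorized_map traces. A minimal check, assuming a simple body that pfor can vectorize without falling back to a loop:

```python
import tensorflow as tf

flags = []

def fn(x):
    # Records whether the mapped function body runs eagerly.
    # vectorized_map traces fn into a graph, so this records False,
    # and only once (one trace, regardless of batch size).
    flags.append(tf.executing_eagerly())
    return x + 1.0

out = tf.vectorized_map(fn, tf.zeros([5, 1]))
# flags == [False]: the body was traced once, in graph mode
```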

LukeWood commented

The images are augmented sample wise - replace print with a tf.print() and you’ll get your expected result.


bhack commented Feb 13, 2023

replace print with a tf.print() and you’ll get your expected result.

No, you will not get the expected result with a tf.print; that was the point of the thread.

But the random op call inside the graph node is valid.

Check it yourself with this gist:

import tensorflow as tf

def test_map(input):
  print("Standard function")
  return input["input"]+tf.random.uniform([])

def test_vectorized_map(input):  
  tf.print("Vectorized function")
  return input["input"]+tf.random.uniform([])

input_shape = (5, 1,)
batch = tf.zeros(input_shape)
input = {"input": batch}
c = tf.map_fn(test_map, input, fn_output_signature=tf.float32)
print(c)
print("-----------------")
c = tf.vectorized_map(test_vectorized_map, input)
print(c)
Standard function
Standard function
Standard function
Standard function
Standard function
tf.Tensor(
[[0.6856414 ]
 [0.3985964 ]
 [0.12619197]
 [0.5390271 ]
 [0.23261154]], shape=(5, 1), dtype=float32)
-----------------
Vectorized function
tf.Tensor(
[[0.5201206 ]
 [0.7133293 ]
 [0.45491397]
 [0.55868816]
 [0.9193896 ]], shape=(5, 1), dtype=float32)

@sachinprasadhs sachinprasadhs added the type:docs Improvements or additions to documentation label Apr 23, 2024