
Update the API design on vectorization #1385

Open
bhack opened this issue Feb 9, 2023 · 7 comments
Labels
type:docs Improvements or additions to documentation

Comments


bhack commented Feb 9, 2023

https://github.com/keras-team/keras-cv/blob/master/.github/API_DESIGN.md#vectorization

We need to update that section of the documentation (#1331 (comment)).

/cc @LukeWood

@bhack bhack changed the title Update the API design on vectorizzation Update the API design on vectorization Feb 9, 2023
@LukeWood LukeWood self-assigned this Feb 9, 2023

LukeWood commented Feb 9, 2023

Thanks @bhack ! good catch!


bhack commented Feb 9, 2023

Also, it seems we were not really using the claimed within-batch augmentation, or were we? (#1382 (comment))


bhack commented Feb 9, 2023

Yes, I think it was basically wrong:

import tensorflow as tf

def test_map(input):
  print("Call to test")
  return tf.constant([1])

def test_vectorized_map(input):
  print("Call to test vectorized")
  return tf.constant([1])

input_shape = (5, 2, 2, 1)
batch = tf.random.uniform(input_shape)
input = {"input": batch}
c = tf.map_fn(test_map, input, fn_output_signature=tf.int32)
print(c)
print("-----------------")
c = tf.vectorized_map(test_vectorized_map, input)
print(c)
Call to test
Call to test
Call to test
Call to test
Call to test
tf.Tensor(
[[1]
 [1]
 [1]
 [1]
 [1]], shape=(5, 1), dtype=int32)
-----------------
Call to test vectorized
tf.Tensor(
[[1]
 [1]
 [1]
 [1]
 [1]], shape=(5, 1), dtype=int32)

sebastian-sz commented

@bhack I'm not sure. Isn't tf.vectorized_map running some tf.function internally, which makes the Python side effect (print) execute only once?

There were tf.function retracing issues, as I remember: #241
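The tracing hypothesis is easy to check in isolation: under tf.function, a Python side effect such as print runs only when the function is traced, not on every call. A minimal sketch, independent of vectorized_map:

```python
import tensorflow as tf

trace_count = []

@tf.function
def traced(x):
    # Python side effect: executes only at trace time,
    # not on every call of the compiled function
    trace_count.append(1)
    return x + 1

a = traced(tf.constant(1))
b = traced(tf.constant(2))
# Same input signature -> traced once, even though called twice
```

After both calls, `trace_count` holds a single entry, which matches the single print seen with vectorized_map above.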

Let's consider a different test: two identical images, wrapped in a batch and passed to the current implementation of RandomContrast.

If the factor() is sampled per batch: the entire batch is processed the same way, so both output images should be equal.
If the factor is sampled per image: each input will be distorted differently.

import tensorflow as tf
from keras_cv.layers import RandomContrast

image = tf.random.uniform((224, 224, 3))
batch = tf.stack([image, image])  # (2, 224, 224, 3)
tf.debugging.assert_near(batch[0], batch[1])  # OK

l = RandomContrast(0.75)
outputs = l(batch)
tf.debugging.assert_near(outputs[0], outputs[1]) # InvalidArgumentError

The second assert raises an exception, indicating that identical images in a single batch received different augmentations.


bhack commented Feb 9, 2023

OK, so probably it is internally wrapped in graph mode by tf.vectorized_map. We are never fully in eager mode with vectorized_map.

import tensorflow as tf

def test_map(input):
  return input["input"]+tf.random.uniform([])

def test_vectorized_map(input):
  return input["input"]+tf.random.uniform([])


input_shape = (5, 1,)
batch = tf.zeros(input_shape)
input = {"input": batch}
c = tf.map_fn(test_map, input, fn_output_signature=tf.float32)
print(c)
print("-----------------")
c = tf.vectorized_map(test_vectorized_map, input)
print(c)
tf.Tensor(
[[0.17567658]
 [0.90278184]
 [0.5280751 ]
 [0.8226601 ]
 [0.7671194 ]], shape=(5, 1), dtype=float32)
-----------------
tf.Tensor(
[[0.9258783 ]
 [0.39560175]
 [0.8852272 ]
 [0.09358847]
 [0.03888452]], shape=(5, 1), dtype=float32)
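The graph-mode wrapping can also be observed directly: tf.executing_eagerly() returns False inside the function that tf.vectorized_map traces. A minimal check, assuming a simple body that pfor can vectorize without falling back to a loop:

```python
import tensorflow as tf

flags = []

def fn(x):
    # Records whether the mapped function body runs eagerly.
    # vectorized_map traces fn into a graph, so this records False,
    # and only once (one trace, regardless of batch size).
    flags.append(tf.executing_eagerly())
    return x + 1.0

out = tf.vectorized_map(fn, tf.zeros([5, 1]))
# flags == [False]: the body was traced once, in graph mode
```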

LukeWood commented

The images are augmented sample wise - replace print with a tf.print() and you’ll get your expected result.


bhack commented Feb 13, 2023

replace print with a tf.print() and you’ll get your expected result.

No, you will not get the expected result with a tf.print; that was the point of the thread.

But the random op call inside the graph node is valid.

Check it yourself with this gist:

import tensorflow as tf

def test_map(input):
  print("Standard function")
  return input["input"]+tf.random.uniform([])

def test_vectorized_map(input):  
  tf.print("Vectorized function")
  return input["input"]+tf.random.uniform([])

input_shape = (5, 1,)
batch = tf.zeros(input_shape)
input = {"input": batch}
c = tf.map_fn(test_map, input, fn_output_signature=tf.float32)
print(c)
print("-----------------")
c = tf.vectorized_map(test_vectorized_map, input)
print(c)
Standard function
Standard function
Standard function
Standard function
Standard function
tf.Tensor(
[[0.6856414 ]
 [0.3985964 ]
 [0.12619197]
 [0.5390271 ]
 [0.23261154]], shape=(5, 1), dtype=float32)
-----------------
Vectorized function
tf.Tensor(
[[0.5201206 ]
 [0.7133293 ]
 [0.45491397]
 [0.55868816]
 [0.9193896 ]], shape=(5, 1), dtype=float32)

@sachinprasadhs sachinprasadhs added the type:docs Improvements or additions to documentation label Apr 23, 2024