
Vectorized random saturation #1392

Merged · 4 commits · Feb 16, 2023
Conversation

@james77777778 (Contributor) commented Feb 13, 2023

What does this PR do?

Fixes #1386

The vectorized version passes all tests in the old keras_cv/layers/preprocessing/random_saturation_test.py.

Then, as discussed in #1386, I copied the old random saturation layer into keras_cv/layers/preprocessing/random_saturation_test.py and wrote two tests to check consistency between the old and the vectorized implementations (sketched below).

This is the result:

  • Input value range [0, 1]: the vectorized version almost always passes the test with atol=1e-5, rtol=1e-5 (tested locally with batch size 1024).
  • Input value range [0, 255]: the vectorized version only passes the test with a looser tolerance of atol=1e-3, rtol=1e-5.
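
For illustration only, a minimal sketch of this kind of a/b consistency check (RandomSaturation and OldRandomSaturation follow the names in this PR and are assumed to be in scope; the fixed factor and input shape are assumptions chosen to make the comparison deterministic):

```python
import numpy as np
import tensorflow as tf

# With lower == upper, the sampled saturation factor is constant, so both
# layers apply exactly the same adjustment and the outputs are comparable.
images = tf.random.uniform((1024, 32, 32, 3), minval=0.0, maxval=1.0)
old_output = OldRandomSaturation(factor=(0.5, 0.5))(images)
new_output = RandomSaturation(factor=(0.5, 0.5))(images)

np.testing.assert_allclose(
    old_output.numpy(), new_output.numpy(), atol=1e-5, rtol=1e-5
)
```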

I think the conversion between RGB and HSV might introduce the error, because the documentation for tf.image.rgb_to_hsv (https://www.tensorflow.org/api_docs/python/tf/image/rgb_to_hsv) says:

The output is only well defined if the value in images are in [0,1].
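
If that is the cause, one possible workaround (a sketch only, not part of this PR) is to keep the HSV round-trip inside the well-defined [0, 1] range and rescale afterwards:

```python
import tensorflow as tf

# Normalize [0, 255] inputs before the HSV round-trip, then rescale.
images_255 = tf.random.uniform((4, 8, 8, 3), minval=0.0, maxval=255.0)
hsv = tf.image.rgb_to_hsv(images_255 / 255.0)
rgb_255 = tf.image.hsv_to_rgb(hsv) * 255.0
```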

I added a benchmark script, benchmarks/vectorized_random_saturation.py, to show the improvement.

[benchmark charts: comparison, comparison_no_old_eager]
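
For reference, the timing loop in a benchmark script like this typically looks something like the following sketch (illustrative only; the helper name and run count are assumptions):

```python
import time
import tensorflow as tf

def time_layer(layer, images, runs=10):
    layer(images)  # warm-up call so tf.function tracing is excluded
    start = time.time()
    for _ in range(runs):
        layer(images)
    return (time.time() - start) / runs
```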

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue? Please add a link
    to it if that's the case.
  • Did you write any new necessary tests?
  • If this adds a new model, can you run a few training steps on TPU in Colab to ensure that no XLA-incompatible ops are used?

Who can review?

@LukeWood

@LukeWood self-requested a review on February 13, 2023, 19:34
@LukeWood (Contributor) left a comment:

LGTM with a few small changes.

Great PR. Thanks a lot for your hard work.

@LukeWood (Contributor):
@james77777778 thanks for the awesome contribution!

- use an ellipsis to prevent a dimension error in adjust_factors
- rename s_channel_batch to s_channel
- fix NotImplementedError for augment_bounding_boxes and augment_labels
- remove serialization registration in OldRandomSaturation
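
(For context, a sketch of what the ellipsis fix might look like; the shapes and variable names are assumptions based on the commit message, not the exact PR code:)

```python
import tensorflow as tf

# adjust_factors has shape (batch,); s_channel has shape (batch, height, width).
# Indexing with an ellipsis broadcasts the per-image factor across the
# spatial dimensions without hard-coding the rank of the input.
adjust_factors = tf.random.uniform((8,), minval=0.3, maxval=0.8)
s_channel = tf.random.uniform((8, 32, 32))
s_channel = tf.multiply(s_channel, adjust_factors[..., tf.newaxis, tf.newaxis])
s_channel = tf.clip_by_value(s_channel, 0.0, 1.0)
```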
@james77777778 (Contributor, Author):

Hi @LukeWood
Thanks for the review

I have updated the code according to your comments, and it now passes all relevant tests:

pytest keras_cv/layers/preprocessing/ragged_image_test.py
pytest keras_cv/layers/preprocessing/random_saturation_test.py
pytest keras_cv/layers/preprocessing/with_labels_test.py
pytest keras_cv/layers/preprocessing/with_segmentation_masks_test.py
pytest keras_cv/layers/preprocessing/with_mixed_precision_test.py

@james77777778 (Contributor, Author):

Once this PR is merged, I think I can take a look at RandomBrightness, RandomContrast, and RandomHue to complete RandomColorJitter with a similar approach?

@LukeWood (Contributor):

/gcbrun

@@ -93,3 +172,29 @@ def test_config(self):
self.assertTrue(isinstance(config["factor"], core.UniformFactorSampler))
self.assertEqual(config["factor"].get_config()["lower"], 0.3)
self.assertEqual(config["factor"].get_config()["upper"], 0.8)

def test_correctness_with_tf_adjust_saturation_normalized_range(self):
@ianstenbit (Contributor):
@LukeWood I really like the pattern of A/B testing this before merging these changes, but I'm not sure it belongs in unit tests permanently. wdyt?

@james77777778 (Contributor, Author):
Hi @ianstenbit
I'm also wondering whether there is a better solution. I think maintaining the old layers will become a problem:

  1. they break if BaseImageAugmentationLayer is modified or deprecated
  2. keeping a full class definition inside unit tests is unintuitive

If output consistency is the major concern, maybe we can extract just the computation (tf.multiply, tf.image, etc.) for the unit tests instead of the whole class definition (for example, something like the sketch below)?
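
A sketch of what that extraction might look like (illustrative only; the function name and the exact factor math are assumptions):

```python
import tensorflow as tf

def adjust_saturation(images, factor):
    # The core computation, extracted so tests can compare implementations
    # without carrying a copy of the whole old layer class.
    images = tf.image.rgb_to_hsv(images)
    s_channel = tf.clip_by_value(images[..., 1] * factor, 0.0, 1.0)
    images = tf.stack([images[..., 0], s_channel, images[..., 2]], axis=-1)
    return tf.image.hsv_to_rgb(images)
```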

@LukeWood (Contributor):
@LukeWood I really like the pattern of A/B testing this before merging these changes, but I'm not sure it belongs in unit tests permanently. wdyt?

So, I see why it might not belong in the unit tests permanently, but we do need a way to monitor the deltas during code review. We can't really just ask contributors "did you test this?"; we could ask for a Colab perhaps, but I think it fits a bit more cleanly in the unit tests.

Personally, I'd opt to include them in the unit tests for now; then we can remove them all at the end once we feel we have achieved stability.

@ianstenbit (Contributor):
Okay that sgtm. Thank you Luke and thank you @james77777778 for your work on this!

@ianstenbit (Contributor):
(One other option we could consider is doing the a/b testing in the benchmark script.)

Personally I find that a little nicer, as then we don't have to keep two copies of OldRandomSaturation. wdyt?

@LukeWood (Contributor):
ok yeah I agree with @ianstenbit - let's put this in the benchmarks.

@james77777778 (Contributor, Author) commented Feb 16, 2023:
@LukeWood @ianstenbit
Got it. I'm going to move the a/b testing into benchmarks/vectorized_random_saturation.py.

Just one question: as we vectorize more preprocessing layers, the number of benchmark scripts might grow rapidly. Should I make a new folder for them?

@bhack (Contributor) commented Feb 15, 2023:

@LukeWood As we are accumulating benchmarks, could these PRs also add a jit_compile version to the benchmark?

If compilation fails, we can wrap it with try/except (in the benchmark); or, in a test, we could mark it directly with @unittest.expectedFailure, as in:
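
(The linked example is not included above. Purely as an illustration, such an expectedFailure-style test might look like this sketch; the layer choice and shapes are assumptions:)

```python
import unittest
import tensorflow as tf
from keras_cv import layers

class XLACompileTest(tf.test.TestCase):
    @unittest.expectedFailure
    def test_random_saturation_compiles_with_xla(self):
        layer = layers.RandomSaturation(factor=(0.3, 0.8))
        images = tf.random.uniform((2, 8, 8, 3))
        # jit_compile=True forces XLA compilation; the marker documents
        # that the layer is not yet expected to be XLA-compatible.
        tf.function(layer, jit_compile=True)(images)
```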

@LukeWood (Contributor) commented Feb 15, 2023:

@LukeWood As we are accumulating benchmarks, could these PRs also add a jit_compile version to the benchmark?

If compilation fails, we can wrap it with try/except (in the benchmark); or, in a test, we could mark it directly with @unittest.expectedFailure, as in:

Adding an XLA version in a try/except to the benchmarks sounds good to me! Feel free to raise a PR and add it if you'd like this contribution credited.
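
(A sketch of that idea, with the helper name assumed rather than taken from the PR:)

```python
import tensorflow as tf

def run_xla_variant(layer, images):
    # Compile the layer call with XLA, falling back gracefully if the
    # layer uses any XLA-incompatible ops.
    fn = tf.function(layer, jit_compile=True)
    try:
        return fn(images)
    except Exception as e:  # compilation or execution failure under XLA
        print(f"Skipping XLA variant: {e}")
        return None
```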

@bhack (Contributor) commented Feb 15, 2023:

Adding an XLA version in a try/except to the benchmarks sounds good to me! Feel free to raise a PR and add it if you'd like this contribution credited.

It is probably better to request this as part of the PR's test coverage, since the tests at least also run in CI, so we could continuously monitor compilability over time.

Benchmarks are not run by CI.

@ianstenbit (Contributor) left a comment:

LGTM, but one question about where we should do numerical a/b testing (in comments)


@LukeWood merged commit b4513f3 into keras-team:master on Feb 16, 2023
@LukeWood (Contributor):

Thank you @james77777778 for the great PR!

@james77777778 deleted the preprocess branch on February 16, 2023, 05:14
ghost pushed a commit to y-vectorfield/keras-cv that referenced this pull request Nov 16, 2023
* Vectorized random saturation

* Fix tests

- use an ellipsis to prevent a dimension error in adjust_factors
- rename s_channel_batch to s_channel
- fix NotImplementedError for augment_bounding_boxes and augment_labels
- remove serialization registration in OldRandomSaturation

* Fix with_mixed_precision_test

* Remove serialization registration in benchmark
Linked issue: Vectorize RandomSaturation (#1386)
4 participants