New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

make ImageDataGenerator behaviour fully seedable/repeatable #3751

Merged

fchollet merged 2 commits into keras-team:master from wassname:master

Sep 22, 2016

Contributor

wassname commented Sep 12, 2016 •

edited

Loading

This makes ImageDataGenerator fully seedable with minimal changes.

the seed argument in fit is now used
the seed argument in flow and flow_from_directory now affects
transforms
added example to docs of transforming images and masks together
added test of using two seeded streams to get the same result

Related issue: 3059

Use cases:

Needing repeatable behaviour from ImageDataGenerator
Wanting to transform images and masks at the same time, e.g. kaggle ultrasound nerve segmentation challenge
Tests pass in python2.7 with tensorflow and theano and python3.4 with theano
pep8 passes
docs updated


          make ImageDataGenerator behaviour fully seedable/repeatable

d37583e

This makes ImageDataGenerator fully seedable.
- the seed argument in fit is now used
- the seed argument in flow and flow_from_directory now effects
transforms
- added example to docs of transforming images and masks together
- added test of using two seeded streams at once

fchollet requested changes

View reviewed changes

docs/templates/preprocessing/image.md

+                  height_shift_range=0.1,
+                  zoom_range=0.2
+              )
+              generator_i = ImageDataGenerator(**data_gen_args)

Collaborator

fchollet Sep 20, 2016

Use explicit variable names in examples (e.g. image_generator)

docs/templates/preprocessing/image.md

+              # We need to provide the same seed to the fit and flow methods
+              seed = 1
+              fake_y = np.zeroes(images.shape)

Collaborator

fchollet Sep 20, 2016

If you don't need labels, you should configure the generator not to use labels (class_mode=None).

docs/templates/preprocessing/image.md

+              # combine the two generators into one which gives the first from each
+              def dual_gen(flow_images,flow_masks):
+                  for [X,_],[y,_] in zip(flow_images,flow_masks):

Collaborator

fchollet Sep 20, 2016

Same here, would be much more elegant without the fake labels.

docs/templates/preprocessing/image.md

+              # combine the two generators into one which gives the first from each
+              def dual_gen(flow_images,flow_masks):
+                  for [X,_],[y,_] in zip(flow_images,flow_masks):
+                      yield X,y

Collaborator

fchollet Sep 20, 2016

PEP8

docs/templates/preprocessing/image.md

+                      yield X,y
+              model.fit_generator(
+                      dual_gen(flow_images,flow_masks),

Collaborator

fchollet Sep 20, 2016

PEP8

tests/keras/preprocessing/test_image.py

@@ @@ -90,6 +90,55 @@ def test_img_flip(self): @@
                               assert ((potentially_flipped_x == x).all() or
                                       (potentially_flipped_x == flip_axis(x, col_index)).all())
+                  def test_dual_generators(self):

Collaborator

fchollet Sep 20, 2016

I don't think this test is useful; please remove.

Contributor Author

wassname Sep 21, 2016

My logic is that this test will break if any future PR break's repeatability, if you have time could you expand on why you don't think that's worth a test? In the mean time I'll remove it.

Contributor Author

wassname commented Sep 21, 2016

Thanks for the feedback, I'll make those changes shortly


          implemented requested changes

80f7268

- PEP8
- explicit names
- classes=None
- remove test

Collaborator

fchollet commented Sep 22, 2016

LGTM, thanks

fchollet merged commit 414d5f0 into keras-team:master

wassname mentioned this pull request

List files in alphabetical order #3871

Merged

yesufeng commented Mar 26, 2017

does this requires the transformations passed to ImageDataGenerator to be exactly the same for image and mask?

Contributor Author

wassname commented Mar 26, 2017 •

edited

Loading

If our goal is to transform the image and mask together, then yes, and that's usually what we want. Usually we want the mask to correspond to the same pixels in the image even after transformation. Unless I'm missing something.

See the last example in the docs where they are transformed together.

yesufeng commented Mar 26, 2017

yes, I saw that example. Let me give a concrete example, if the images are rescaled but not the mask, e.g., imageGen = ImageDataGenerator(rescale = 1/1000., cval = -1000., fill_mode = 'constant', rotation_range=180.)
maskGen = ImageDataGenerator(cval=0., fill_mode='constant', rotation_range=180.),
these two generators, when we call flow with the same seed and other keyword arguments, then zip them to a final generator, is it guaranteed that these two are synchronized?

Contributor Author

wassname commented Mar 27, 2017 •

edited

Loading

In your example they wont be.

But instead of taking my word, you can easily test this, and other combinations yourself. Put the same image through both generators and compare them visually or with assert (a==b).all(). Example

Let's say I channel shift the image and not the mask. This change will propagate to other transformations, in this case the vertical flip, and the image and mask will not be synchronized. To me that was unexpected, but it's shown in the example. I expected it to be mostly synchronized but it wasn't

So to be clear, I don't expect them to ever be synchronized, unless the seed and transformation keywords are exactly the same. And only then.

yesufeng commented Mar 27, 2017

thank you for the explanation! Yeah, feed in the same images will nail it down. It might be desirable to allow things such as rescale to be different for mask vs image without affecting the synchronized generator, for obvious reason.

Contributor Author

wassname commented Mar 27, 2017

No worries. Yeah agreed.

wassname mentioned this pull request

ImageDataGenerator wassname/pipe-segmentation#1

Closed

wassname added a commit to wassname/keras that referenced this pull request


          make imagedata generator repeatable

54cf36b

Following keras-team#3751. I ran more tests to see if ImageDataGenerator generator can transform images and masks in the same way. It can for NumpyArrayIterator but not for DirectoryIterator. 

So I added seeds to fix this and now it can transform images and masks together. I have unit tests to prove this, and I can add them to the PR if desired?

wassname added a commit to wassname/keras that referenced this pull request


          Update image.py

07198f4

Following keras-team#3751. I ran more tests to see if ImageDataGenerator generator can transform images and masks in the same way. It can for NumpyArrayIterator but not for DirectoryIterator. 

So I added seeds to fix this and now it can transform images and masks together. I have unit tests to prove this, and I can add them to the PR if desired?

wassname added a commit to wassname/keras that referenced this pull request


          make ImageDataGenerator generator repeatable

d746b29

Following keras-team#3751. I ran more tests to see if ImageDataGenerator generator can transform images and masks in the same way. It can for NumpyArrayIterator but not for DirectoryIterator. 

So I added seeds to fix this and now it can transform images and masks together. I have unit tests to prove this, and I can add them to the PR if desired?

wassname mentioned this pull request

make ImageDataGenerator generator repeatable #7613

Closed

mositemp commented Apr 28, 2018 •

edited

Loading

@yesufeng , @wassname , Did you made any progress in making image and mask generator synchronized when two different pre-processing pipelines are selected for each of them?
I running some experiments and it seems (in my case) that "channel_shift_range" option is the culprit here! Every time I enable it (even for both mask and image generators), generator outputs would not be synchronous.
I think this option is need to be fully synchronized and separable. There are cases that more than two generators are needed (e.g. two or more output mask or using pixel-wise weighting) where equal pre-processings are not suitable for all of theme.

Contributor Author

wassname commented Apr 29, 2018

@mositemp I made a quick fix here but we decided not to merge it.

Good info on "channel_shift_range" thanks for sharing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet