make ImageDataGenerator behaviour fully seedable/repeatable #3751
Conversation
This makes ImageDataGenerator fully seedable.
- the seed argument in fit is now used
- the seed argument in flow and flow_from_directory now affects transforms
- added an example to the docs of transforming images and masks together
- added a test of using two seeded streams at once
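For orientation, a rough sketch of the image/mask usage this PR targets (hedged reconstruction of the docs example it adds; `images`, `masks`, and the variable names are placeholders):

```python
from keras.preprocessing.image import ImageDataGenerator

# Identical augmentation arguments for both streams.
data_gen_args = dict(rotation_range=90.,
                     width_shift_range=0.1,
                     height_shift_range=0.1,
                     zoom_range=0.2)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

# Provide the same seed to fit and flow so both streams draw the same
# random transforms (fit only matters if featurewise statistics are enabled).
seed = 1
image_datagen.fit(images, augment=True, seed=seed)
mask_datagen.fit(masks, augment=True, seed=seed)

image_flow = image_datagen.flow(images, seed=seed)
mask_flow = mask_datagen.flow(masks, seed=seed)

# Each batch pairs an augmented image batch with the identically
# transformed mask batch (zip is lazy in Python 3).
train_generator = zip(image_flow, mask_flow)
```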
```python
    height_shift_range=0.1,
    zoom_range=0.2
)
generator_i = ImageDataGenerator(**data_gen_args)
```
Use explicit variable names in examples (e.g. `image_generator`).
```python
# We need to provide the same seed to the fit and flow methods
seed = 1
fake_y = np.zeros(images.shape)
```
If you don't need labels, you should configure the generator not to use labels (`class_mode=None`).
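A sketch of what the reviewer suggests, assuming the standard Keras generator signatures (`images` and the directory path are placeholders): without labels there is no need for a fake label array.

```python
from keras.preprocessing.image import ImageDataGenerator

seed = 1
datagen = ImageDataGenerator(rotation_range=90., zoom_range=0.2)

# In-memory arrays: simply omit y and the iterator yields only image batches.
image_flow = datagen.flow(images, seed=seed)

# Directory data: class_mode=None makes the iterator yield only images,
# with no label array attached.
image_flow_dir = datagen.flow_from_directory('data/images',  # placeholder path
                                             class_mode=None,
                                             seed=seed)
```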
```python
# combine the two generators into one which gives the first from each
def dual_gen(flow_images,flow_masks):
    for [X,_],[y,_] in zip(flow_images,flow_masks):
```
Same here, would be much more elegant without the fake labels.
```python
# combine the two generators into one which gives the first from each
def dual_gen(flow_images,flow_masks):
    for [X,_],[y,_] in zip(flow_images,flow_masks):
        yield X,y
```
PEP8
```python
        yield X,y

model.fit_generator(
    dual_gen(flow_images,flow_masks),
```
PEP8
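For reference, a possible PEP8-clean version of the snippet above. Once the fake labels are dropped, the two streams can be zipped directly; the `fit_generator` arguments shown are assumed values, not part of the original snippet.

```python
def dual_gen(flow_images, flow_masks):
    # Combine the two streams into one yielding (image_batch, mask_batch).
    for x, y in zip(flow_images, flow_masks):
        yield x, y


model.fit_generator(
    dual_gen(flow_images, flow_masks),
    samples_per_epoch=2000,  # assumed value, not from the original
    nb_epoch=50)             # assumed value, not from the original
```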
```
@@ -90,6 +90,55 @@ def test_img_flip(self):
        assert ((potentially_flipped_x == x).all() or
                (potentially_flipped_x == flip_axis(x, col_index)).all())

    def test_dual_generators(self):
```
I don't think this test is useful; please remove.
My logic is that this test will break if any future PR breaks repeatability. If you have time, could you expand on why you don't think that's worth a test? In the meantime I'll remove it.
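For context, a minimal sketch of the kind of repeatability test being discussed (hypothetical; not the PR's actual test, and the toy data shape assumes channels-last ordering):

```python
import numpy as np
from keras.preprocessing.image import ImageDataGenerator


def test_seeded_streams_match():
    # Two generators with identical arguments and the same seed should
    # produce identical augmented batches.
    x = np.random.random((8, 32, 32, 3))  # assumed channels-last toy data
    args = dict(rotation_range=90., width_shift_range=0.1, zoom_range=0.2)
    gen_a = ImageDataGenerator(**args).flow(x, batch_size=4, seed=1)
    gen_b = ImageDataGenerator(**args).flow(x, batch_size=4, seed=1)
    for _ in range(4):
        assert np.allclose(next(gen_a), next(gen_b))
```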
Thanks for the feedback, I'll make those changes shortly
- PEP8
- explicit names
- classes=None
- remove test
LGTM, thanks
Does this require the transformations passed to ImageDataGenerator to be exactly the same for image and mask?
If our goal is to transform the image and mask together, then yes, and that's usually what we want: the mask should correspond to the same pixels in the image even after transformation. Unless I'm missing something. See the last example in the docs, where they are transformed together.
Yes, I saw that example. Let me give a concrete example: if the images are rescaled but not the mask, e.g. `imageGen = ImageDataGenerator(rescale=1/1000., cval=-1000., fill_mode='constant', rotation_range=180.)`, will the image and mask still be transformed together?
In your example they won't be. But instead of taking my word for it, you can easily test this and other combinations yourself: put the same image through both generators and compare them visually or numerically. Let's say I channel shift the image and not the mask. This change will propagate to other transformations, in this case the vertical flip, and the image and mask will not be synchronized. To me that was unexpected, but it's shown in the example; I expected them to be mostly synchronized, and they weren't. So to be clear, I don't expect them to ever be synchronized unless the seed and transformation keywords are exactly the same, and only then.
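A sketch of the kind of check described above (placeholder data, assumed channels-last shape): identical seed and identical transform arguments give identical output, while adding a channel shift to only one stream perturbs the shared random draws and desynchronizes the later transforms.

```python
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

x = np.random.random((4, 64, 64, 3))  # placeholder image batch
base = dict(rotation_range=180., vertical_flip=True)

# Same seed, same arguments: the streams stay synchronized.
a = ImageDataGenerator(**base).flow(x, batch_size=4, shuffle=False, seed=1)
b = ImageDataGenerator(**base).flow(x, batch_size=4, shuffle=False, seed=1)
assert np.allclose(next(a), next(b))

# Same seed, but one stream adds a channel shift: the extra random draw
# shifts everything after it (e.g. the flip), so the outputs diverge.
c = ImageDataGenerator(channel_shift_range=10., **base).flow(
    x, batch_size=4, shuffle=False, seed=1)
d = ImageDataGenerator(**base).flow(x, batch_size=4, shuffle=False, seed=1)
# next(c) and next(d) are generally no longer equal.
```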
Thank you for the explanation! Yeah, feeding in the same images will nail it down. It might be desirable to allow things such as rescale to be different for the mask vs the image without affecting the synchronized generator, for obvious reasons.
No worries. Yeah, agreed.
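One hedged workaround under the current behaviour: keep the augmentation arguments identical for both generators so the random transforms stay synchronized, and apply the differing rescale afterwards, in the combining generator. The function name and `image_scale` value below are purely illustrative.

```python
def dual_gen_with_rescale(flow_images, flow_masks, image_scale=1 / 1000.):
    # Both ImageDataGenerators must be configured identically; only the
    # image batch is rescaled here, leaving the mask values untouched.
    for image_batch, mask_batch in zip(flow_images, flow_masks):
        yield image_batch * image_scale, mask_batch
```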
Following keras-team#3751. I ran more tests to see whether ImageDataGenerator can transform images and masks in the same way. It can for NumpyArrayIterator but not for DirectoryIterator, so I added seeds to fix this, and now it can transform images and masks together. I have unit tests to prove this; I can add them to the PR if desired.
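A sketch of the directory-based image/mask setup this follow-up aims to support, using the same seed for both flow_from_directory calls (directory paths are placeholders):

```python
from keras.preprocessing.image import ImageDataGenerator

seed = 1
data_gen_args = dict(rotation_range=90., width_shift_range=0.1, zoom_range=0.2)

image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

# class_mode=None yields only the images/masks with no labels; the shared
# seed keeps both DirectoryIterators using the same shuffling order and
# the same random transforms.
image_generator = image_datagen.flow_from_directory('data/images',  # placeholder path
                                                    class_mode=None,
                                                    seed=seed)
mask_generator = mask_datagen.flow_from_directory('data/masks',     # placeholder path
                                                  class_mode=None,
                                                  seed=seed)

train_generator = zip(image_generator, mask_generator)
```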
@yesufeng, @wassname, did you make any progress in keeping the image and mask generators synchronized when two different pre-processing pipelines are selected for each of them?
This makes ImageDataGenerator fully seedable with minimal changes.
- the seed argument in flow and flow_from_directory now affects transforms

Related issue: #3059
Use cases: