WIP Keras 3 captcha_ocr #1609

Merged
merged 2 commits into from
Nov 23, 2023

Conversation

mattdangerw
Member

Not ready to land, but opening for some questions...

@mattdangerw
Member Author

mattdangerw commented Nov 11, 2023

@fchollet I could use your thoughts on two things here.

  1. The guide uses keras.backend.ctc_batch_cost. I have just copied the functions into the example for now (it is a little hefty, but oh well). There is also keras._legacy.backend.ctc_batch_cost, but that does not seem to export. Should it? Would using it be better?

  2. The second error seems like potentially a functional model bug! We have a model made like this

prediction_model = keras.models.Model(
    model.get_layer(name="image").input, model.get_layer(name="dense2").output
)

Which fails on predict like this...

File ~/miniconda3/envs/keras-nlp-tensorflow/lib/python3.10/site-packages/keras/src/ops/operation.py:47, in Operation.__call__(self, *args, **kwargs)
     45 if any_symbolic_tensors(args, kwargs):
     46     return self.symbolic_call(*args, **kwargs)
---> 47 return self.call(*args, **kwargs)

File ~/miniconda3/envs/keras-nlp-tensorflow/lib/python3.10/site-packages/keras/src/models/functional.py:188, in Functional.call(self, inputs, training, mask)
    186         if mask is not None:
    187             x._keras_mask = mask
--> 188 outputs = self._run_through_graph(
    189     inputs, operation_fn=lambda op: operation_fn(op, training=training)
    190 )
    191 return unpack_singleton(outputs)

File ~/miniconda3/envs/keras-nlp-tensorflow/lib/python3.10/site-packages/keras/src/ops/function.py:148, in Function._run_through_graph(self, inputs, operation_fn)
    146 output_tensors = []
    147 for x in self.outputs:
--> 148     output_tensors.append(tensor_dict[id(x)])
    150 return pack_sequence_as(self._outputs_struct, output_tensors)

KeyError: 140610153812160

Is this a valid use case? And valid bug?

Thanks!

@mattdangerw mattdangerw changed the title WIP for captcha ocr WIP Keras 3 captcha_ocr Nov 11, 2023
@fchollet
Contributor

> The second error seems like potentially a functional model bug! We have a model made like this

Yes, looks like a bug indeed. The use case looks valid as far as I can tell.

> The guide uses keras.backend.ctc_batch_cost. I have just copied the functions into the example for now (it is a little hefty, but oh well). There is also keras._legacy.backend.ctc_batch_cost, but that does not seem to export. Should it? Would using it be better?

It's no longer meant to be public. The version copied from the old Keras backend also isn't too great since it relies heavily on tf.compat.v1 which we should definitely stay away from.

I think the long-term fix would be to introduce a cross-backend CTC op, maybe written from scratch. Right now none of the solutions are satisfying. Seems like a big gap tbh, CTC is still the reference for OCR problems today.

CTC was already completely left behind in TF 2, for what it's worth. Not sure why. Only TF 1 had some support (and even then it was barely usable, which is why we had these hefty backend functions to work around TF 1 APIs).
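
For readers unfamiliar with what a from-scratch CTC op would have to compute, here is a minimal NumPy sketch of the CTC forward recursion for a single sequence. It is an illustration only, not the code from this PR and not a Keras API: the function name `ctc_neg_log_likelihood` and its arguments are hypothetical, it works in plain probability space (so it would underflow on long sequences), and it handles neither batching nor gradients.

```python
import numpy as np


def ctc_neg_log_likelihood(probs, labels, blank=0):
    """Negative log-likelihood of `labels` under CTC (hypothetical helper).

    probs:  array of shape (T, C) with per-timestep class probabilities.
    labels: list of class indices, without blanks.
    """
    T = probs.shape[0]

    # Extended sequence: a blank before, between, and after every label.
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S = len(ext)

    # alpha[t, s] = total probability of all alignments of the first t+1
    # frames that end in position s of the extended sequence.
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]

    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1, s]  # stay on the same symbol
            if s >= 1:
                total += alpha[t - 1, s - 1]  # advance by one position
            # Skipping the blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1, s - 2]
            alpha[t, s] = total * probs[t, ext[s]]

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[T - 1, S - 1]
    if S > 1:
        likelihood += alpha[T - 1, S - 2]
    return -np.log(likelihood)


# Two frames, two classes (blank=0, "a"=1), uniform probabilities.
# The alignments "aa", "a-", and "-a" all decode to "a": 3 * 0.25 = 0.75.
uniform = np.full((2, 2), 0.5)
loss = ctc_neg_log_likelihood(uniform, [1])
print(np.exp(-loss))
```

A usable cross-backend op would additionally need log-space arithmetic, batching with per-example lengths, and a gradient, which is a large part of why the old backend functions are so hefty.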

@fchollet
Contributor

I debugged the "functional model issue" and, as it turns out, the framework is fine. However, there was a semantic change in Keras 3 (which is more consistent now).

The code tried to query .input on an Input layer. In Keras 2 this returns the same Input. In Keras 3 this is empty (which makes more sense to me: the entry node doesn't have itself as its own entry node...). To get the model's input, better get the model's .input (instead of the model's input's .input).

So you need to modify the prediction model creation as such:

prediction_model = keras.models.Model(
    model.input[0], model.get_layer(name="dense2").output
)

The [0] is needed because model.input is a list of 2 input tensors (image and label).
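
As a self-contained illustration of that pattern, here is a minimal sketch with a toy two-input functional model. Only the layer names "image", "label", and "dense2" come from the thread; the shapes, the "dense1" layer, and the Concatenate layer (standing in for the guide's CTC layer, which also consumes both the predictions and the labels) are assumptions made for this example.

```python
import numpy as np
import keras
from keras import layers

# Toy training model with two inputs, mirroring the guide's structure.
image = layers.Input(shape=(8,), name="image")
label = layers.Input(shape=(1,), name="label")
x = layers.Dense(4, activation="relu", name="dense1")(image)
probs = layers.Dense(3, activation="softmax", name="dense2")(x)
# Stand-in for the CTC layer: any layer that uses both tensors.
combined = layers.Concatenate(name="combine")([probs, label])
model = keras.Model(inputs=[image, label], outputs=combined)

# Build the prediction model from the model's own input list rather than
# from the Input layer's `.input`.
prediction_model = keras.Model(
    model.input[0], model.get_layer(name="dense2").output
)

preds = prediction_model.predict(np.zeros((2, 8), dtype="float32"), verbose=0)
print(preds.shape)  # (2, 3)
```

The prediction model only needs the image tensor, so the label input is dropped from the sub-graph.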

@mattdangerw
Member Author

Thanks! I will re-render this guide.

@mattdangerw mattdangerw marked this pull request as ready for review November 23, 2023 02:58
@mattdangerw
Member Author

Done! All rendered.

Contributor

@fchollet fchollet left a comment

LGTM

Copying the old code (compat.v1 and all) is still the least bad option.
