-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Adding an example on handwriting recognition #594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution!
[IAM Dataset](https://fki.tic.heia-fr.ch/databases/iam-handwriting-database) | ||
that has variable length ground-truths. IAM Dataset is widely used across many OCR | ||
benchmarks so we hope this example serves as a good starting point. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a little more description of the dataset, like each sample in the dataset is an image of hand-written sentences, the prediction target of a sample is a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
""" | ||
## Introduction | ||
|
||
This example shows how the [Captcha OCR](https://keras.io/examples/vision/captcha_ocr/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May add one sentence introduction to what is OCR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is required since it's a sequel example.
""" | ||
|
||
|
||
class CTCLayer(keras.layers.Layer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to make the loss a Layer
instead of a loss function or Loss
subclass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our idea with this example is to keep it as close to the Captcha OCR example as possible while showing the bits that need to be changed. IMO, that helps to ensure a good reading experience. Ccing @AakashKumarNain if he has other points of view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When defining a normal loss function, we generally assume that the shape of y_true
and y_pred
are same which isn't the case here. Defining it as an endpoint layer makes it easy to calculate the length of inputs and the labels on the fly during training, and then pass the same to the ctc_batch_cost(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the updates! LGTM.
@haifeng-jin, |
@sayakpaul Sure, just ping me when it's ready. Thank you! |
@haifeng-jin now, it's good to go for another round of review. We incorporated Edit Distance as an evaluation metric. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thanks for the update!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome example! Very strong follow-up to the original OCR example.
I added a few minor copyedits. Please add the generated files. Thanks @haifeng-jin for the review.
@fchollet added the generated files. The only major change is the reduced number of epochs because my system was getting stalled after 10 epochs. Tried on a commodity GCP Notebook instance too but didn't help much. Cc: @AakashKumarNain |
Thanks for the review @haifeng-jin @fchollet . @sayakpaul should we run this on a bigger VM (only if it is required)? |
I don't think that's required given that we have explicitly noted the number of epochs should be at least 50. However, if you want to do it go right ahead. |
Yes, I think the restriction should be fine. Thank you! |
@sayakpaul @AakashKumarNain there were only 2 image files included in the PR, but the example is intended to have 3 figures (it's missing the last one). Please add the missing figure (in a new PR) |
A Colab Notebook is available here.
Cc: @AakashKumarNain