Add functionality to merge cells in Google OCR prediction #103
base: add-new-ocr-metrics
@samiuc since all of this is only required for Google OCR, please ensure that the code is kept in the google_prediction_provider.py module. We don't want this to spill into the general docling-eval utils.
Also, one comment below.
docling_eval/prediction_providers/google_prediction_provider.py
see my comments
    return y_overlap / y_union if y_union > 0 else 0


def text_cell_to_word_dict(cell: TextCell):
Please do not do this; keep using the TextCell!
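As an illustration of this review point, here is a minimal sketch of the vertical-overlap computation from the diff, written against a typed object instead of a word dict. `Box` and `vertical_overlap_ratio` are hypothetical names chosen for this example; `Box` is only a stand-in and does not reproduce the real `TextCell` fields, and a top-left coordinate origin is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a cell's vertical extent (top-left origin assumed)."""

    top: float
    bottom: float


def vertical_overlap_ratio(a: Box, b: Box) -> float:
    """Return the shared vertical extent divided by the total vertical span."""
    # Height of the region covered by both boxes (0 if they do not overlap).
    y_overlap = max(0.0, min(a.bottom, b.bottom) - max(a.top, b.top))
    # Height of the span from the higher top edge to the lower bottom edge.
    y_union = max(a.bottom, b.bottom) - min(a.top, b.top)
    # Same guard as in the diff: avoid dividing by zero for degenerate boxes.
    return y_overlap / y_union if y_union > 0 else 0.0
```

Keeping the arguments typed means callers can pass the cell's geometry through unchanged, without an intermediate dict whose keys have to be kept in sync by hand.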
@@ -42,10 +41,364 @@

_log = logging.getLogger(__name__)

SPECIAL_CHARS = list("*:;,.?()!@#$%^&[]{}/\\\"'~+-_<>=")
Don't use top-level constants; please make them part of a class!
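One way to follow this suggestion is sketched below. Only the `SPECIAL_CHARS` string comes from the diff. The class name `GoogleOcrCellMerger` and the `is_special` helper are hypothetical, and the sketch swaps the original `list` for a `frozenset` so that membership checks do not scan the whole list.

```python
class GoogleOcrCellMerger:
    """Hypothetical class; it only shows where the constant could live."""

    # Class-level constant instead of a module-level one.
    # A frozenset is an assumption made here; the diff uses list(...).
    SPECIAL_CHARS = frozenset("*:;,.?()!@#$%^&[]{}/\\\"'~+-_<>=")

    @classmethod
    def is_special(cls, text: str) -> bool:
        """Return True if text is non-empty and made only of special characters."""
        return len(text) > 0 and all(ch in cls.SPECIAL_CHARS for ch in text)
```

With this layout the constant is reached as `GoogleOcrCellMerger.SPECIAL_CHARS`, so it stays scoped to the Google OCR code rather than the module namespace.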
The F1 scores for Google OCR regressed compared to the previous evaluations. Upon investigating, we found that we had functionality to merge cells before running the evaluations (only for Google OCR). Here's an overview of what the new code does: