Compared original XML "ONB_newseye" to current line texts "AustrianNewspapers".
compare_xml.pl Version 0.01
Compare XML text output against ground truth (GRT):
XML: ONB_newseye
GRT: AustrianNewspapers

Summary:

                   lines    words     chars
    items ocr:     57541   326524   2198240   matches + inserts + substitutions
    items grt:     57541   326394   2198051   matches + deletions + substitutions
    matches:       23961   265356   2125325   matches
    edits:         33580    61346     73806   inserts + deletions + substitutions
    subss:         33580    60860     71835   substitutions
    inserts:           0      308      1080   inserts
    deletions:         0      178       891   deletions
    precision:    0.4164   0.8127    0.9668   matches / (matches + substitutions + inserts)
    recall:       0.4164   0.8130    0.9669   matches / (matches + substitutions + deletions)
    accuracy:     0.4164   0.8122    0.9664   matches / (matches + substitutions + inserts + deletions)
    f-score:      0.4164   0.8128    0.9669   ( 2 * recall * precision ) / (recall + precision )
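The four scores in the summary follow directly from the match and edit counts. As a cross-check, here is a minimal Python sketch that recomputes them for the chars column using the formulas printed in the report. The function name `scores` and its parameter names are illustrative assumptions and are not taken from compare_xml.pl.

```python
def scores(matches, substitutions, inserts, deletions):
    """Recompute the summary scores from raw counts (formulas as printed in the report)."""
    precision = matches / (matches + substitutions + inserts)
    recall = matches / (matches + substitutions + deletions)
    accuracy = matches / (matches + substitutions + inserts + deletions)
    f_score = (2 * recall * precision) / (recall + precision)
    return precision, recall, accuracy, f_score


# Character-level counts copied from the summary above.
char_scores = scores(matches=2125325, substitutions=71835, inserts=1080, deletions=891)
print(" ".join(f"{value:.4f}" for value in char_scores))
```

Passing the words column instead (265356, 60860, 308, 178) gives the word-level scores in the same way.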
Shortened list of the edits/mismatches:
Character match (confusion) table:

    GRT => OCR    ratio   errors    count
    ---    ---   ------  -------  -------
    'ſ' => 's'   0.9985    56885    56971
    '⸗' => '-'   0.0052       61    11639
    '⸗' => '='   0.3232     3762    11639
    '⸗' => '¬'   0.6691     7788    11639
                           -----
    SUM                    68496
    + transcription         1000   estimated transcription level 1 -> 2
                           -----
    TOTAL transcription    69496

    edits                  73806
    - transcription       -69496
                           -----
    corrections             4310   (0,20% of all characters)
Rough guess of errors still in the GRT: 1000 - 2000.
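The bookkeeping under the confusion table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, the 1000 is the estimate stated in the table rather than a measured value, and the use of the GRT character count as the denominator for the percentage is an assumption.

```python
# Systematic transcription differences (GRT char, OCR char) -> error count, from the table.
confusions = {
    ("ſ", "s"): 56885,
    ("⸗", "-"): 61,
    ("⸗", "="): 3762,
    ("⸗", "¬"): 7788,
}

systematic = sum(confusions.values())
estimated_level_changes = 1000  # estimate from the table (transcription level 1 -> 2)
total_transcription = systematic + estimated_level_changes

char_edits = 73806   # "edits" row, chars column of the summary
grt_chars = 2198051  # "items grt" row, chars column (assumed denominator)

# Edits that are not explained by transcription conventions count as real corrections.
corrections = char_edits - total_transcription
share = corrections / grt_chars

print(systematic, total_transcription, corrections, f"{share:.2%}")
```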