BA-thesis in history.
A more complete example of programming with PDFMiner, which continues where the default documentation stops.
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned in CSV format.
OCR-D wrapper for page-xml-draw
A powerful CLI tool for visualization and encoding of PAGE-XML files
OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
Official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
A Python package to structure files using visual and style information
PdfDet aims to simplify PDF layout detection tasks for users.
A Unified Toolkit for Deep Learning Based Document Image Analysis
Automated Election Vote Counting
YOLO models trained on DocLayNet - power your document intelligence with layout analysis
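Layout analysis for Chinese and English documents
📝 Extracts content from document images and reproduces it one-to-one in Word or Txt for further use or processing. Planned: support for PDF/image input with output in JSON, Txt, Word, and Markdown formats.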
OCR engine for all the languages
A toolbox of OCR models and algorithms based on MindSpore