Read and extract text and other content from PDFs in C# (port of PDFBox)
Updated Nov 13, 2024 - C#
Document Layout Analysis resources for development with PdfPig.
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block on a PDF page as one of title, text, list, table, or image.