Issues: Unstructured-IO/unstructured

Issues list

Disable telemetry/tracking (labels: bug, documentation)
#3459 opened Aug 1, 2024 by TaylorN15
Support to parse confluence wiki content (labels: enhancement)
#3457 opened Aug 1, 2024 by zjffdu
pptx: shapes "off-slide" to the right and bottom are not excluded (labels: enhancement, ppt, pptx)
#1473 opened Sep 20, 2023 by scanny
feat/group elements by parent_id (labels: enhancement, good first issue)
#1489 opened Sep 21, 2023 by ron-unstructured
feat: ability to skip non-plain-text element types in chunk_by_title() (labels: chunking, enhancement)
#1695 opened Oct 10, 2023 by cragwolfe
docx: partitioner finds text nested in revision-marks (labels: docx, enhancement)
#1821 opened Oct 20, 2023 by scanny
feat/retain md image links (labels: enhancement, html)
#2225 opened Dec 6, 2023 by shreyanid
feat/parse_html_embed_objects (labels: enhancement, html)
#2233 opened Dec 7, 2023 by My3VM
Adding a progress bar when partitioning pdfs (labels: enhancement)
#2351 opened Jan 4, 2024 by TheoLvs
feat/Option to flatten metadata extraction (labels: enhancement)
#2432 opened Jan 19, 2024 by ron-unstructured
Add possibility to deactivate OCR (labels: enhancement)
#2467 opened Jan 29, 2024 by thomascerbelaud
feat/clean_newline (labels: enhancement, good first issue)
#2513 opened Feb 6, 2024 by manuelrech
ocr metadata (labels: enhancement, ocr)
#2568 opened Feb 21, 2024 by hakankaraoguz
feat/Use local model for hi_res partition (labels: enhancement, models)
#2631 opened Mar 11, 2024 by AntoninLeroy
Chunk overlap prefix is on even word boundary >= overlap character count. (labels: chunking, enhancement)
#2886 opened Apr 12, 2024 by scanny
Enhancement: better element ID's (labels: enhancement)
#2461 opened Jan 26, 2024 by cragwolfe