feat: Add Image element and find_embedded_image function #130

Merged: 25 commits, Jan 10, 2023

@mallorih mallorih commented Jan 5, 2023

This PR adds a new element type, Image, and a new function that finds images embedded in text. The function can move to text.py once partition_text is merged.
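
As a rough illustration of what "finding images embedded in text" can look like, here is a minimal sketch. It is not the code from this PR: the `[image: name]` placeholder format, the `ImageRef` class, and the `split_embedded_images` helper are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: embedded images show up in the plain-text body as
# "[image: <name>]" placeholders. The real pattern used by the PR may differ.
IMAGE_PLACEHOLDER = re.compile(r"\[image:\s*(?P<name>[^\]]+?)\s*\]")


@dataclass
class ImageRef:
    """Hypothetical stand-in for the PR's Image element."""

    text: str


def split_embedded_images(text: str) -> Tuple[str, List[ImageRef]]:
    """Return the text with image placeholders removed, plus the images found."""
    images = [ImageRef(m.group("name")) for m in IMAGE_PLACEHOLDER.finditer(text)]
    remaining = IMAGE_PLACEHOLDER.sub("", text)
    # Collapse the extra whitespace left behind by the removed placeholders.
    remaining = " ".join(remaining.split())
    return remaining, images
```

Calling `split_embedded_images("Hello [image: logo.png] world")` separates the surrounding text from the placeholder, so the image can be emitted as its own element.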

Testing

from unstructured.partition.email import partition_email

elements = partition_email("example-docs/email-image-embedded.eml")
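
To check the result of the call above, one option is to tally the returned elements by class name. The helper below is a hypothetical convenience written for this example, not part of the library; it only assumes that `partition_email` returns an iterable of element objects.

```python
from collections import Counter
from typing import Iterable


def count_element_types(elements: Iterable[object]) -> Counter:
    """Count elements by their class name (hypothetical helper)."""
    return Counter(type(el).__name__ for el in elements)


# Usage, assuming the `elements` list from the testing snippet above:
#   print(count_element_types(elements))
# If the PR behaves as described, an "Image" entry should be among the keys.
```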

@mallorih mallorih requested a review from MthwRobinson January 6, 2023 16:29
@MthwRobinson MthwRobinson left a comment

LGTM once the version bumps are in! Don't forget to also remove DRAFT and add the feat: prefix. The second line in the testing instructions should also be the following (just a slight filename typo):

elements = partition_email("example-docs/email-image-embedded.eml")

@@ -5,6 +5,7 @@
* Test for `clean_ordered_bullets`.
* Test for `extract_ordered_bullets`.
* Added `partition_docx` for pre-processing Word Documents.
* Add new `Image` element and function to find embedded images `find_embedded_images`

Reminder to update the version to 0.3.6-devx and also bump the version in unstructured/__version__.py

@mallorih mallorih changed the title from "DRAFT: Add Image element and find_embedded_image function" to "feat: Add Image element and find_embedded_image function" Jan 9, 2023
@mallorih mallorih merged commit e0feba8 into main Jan 10, 2023
@mallorih mallorih deleted the image-element branch January 10, 2023 01:49