
fix: preserve underlined docx text #1907

Open
he-yufeng wants to merge 1 commit into microsoft:main from he-yufeng:fix/preserve-docx-underline

@he-yufeng

Fixes #35

Summary

  • map DOCX underline runs to HTML <u> tags by default
  • keep HTML <u> tags in Markdown output because Markdown has no native underline syntax
  • add DOCX and HTML underline regression coverage
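
As a rough illustration of the behavior described in the summary, here is a minimal, standard-library-only sketch. It is not the code in this PR: the helper names (`run_is_underlined`, `paragraph_to_html`, `html_to_markdown`) and the sample paragraph are hypothetical, and the real change lives in `_docx_converter.py` and `_markdownify.py`. The sketch only shows the idea: a run with an active `w:u` property becomes a `<u>` element, and the HTML-to-Markdown step passes `<u>` through unchanged because Markdown has no underline syntax.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML namespace used by DOCX document parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def run_is_underlined(run: ET.Element) -> bool:
    """Return True if a w:r element carries an active w:u property."""
    u = run.find(f"{W}rPr/{W}u")
    if u is None:
        return False
    # w:val="none" explicitly switches underline off for the run.
    return u.get(f"{W}val", "single") != "none"


def paragraph_to_html(paragraph_xml: str) -> str:
    """Render one w:p fragment as HTML, wrapping underlined runs in <u>."""
    paragraph = ET.fromstring(paragraph_xml)
    parts = []
    for run in paragraph.iter(f"{W}r"):
        text = escape("".join(t.text or "" for t in run.iter(f"{W}t")))
        if not text:
            continue
        parts.append(f"<u>{text}</u>" if run_is_underlined(run) else text)
    return "<p>" + "".join(parts) + "</p>"


def html_to_markdown(html: str) -> str:
    """Hypothetical stand-in for the Markdown step: drop <p>, keep <u>."""
    return re.sub(r"</?p>", "", html).strip()


# Hypothetical sample paragraph: one plain run, one underlined run.
SAMPLE = (
    f'<w:p xmlns:w="{W_NS}">'
    '<w:r><w:t xml:space="preserve">plain </w:t></w:r>'
    '<w:r><w:rPr><w:u w:val="single"/></w:rPr><w:t>underlined</w:t></w:r>'
    "</w:p>"
)

if __name__ == "__main__":
    print(html_to_markdown(paragraph_to_html(SAMPLE)))
```

Treating `w:val="none"` as "not underlined" is an assumption of this sketch; whether the PR handles that case the same way is not stated in the description.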

To verify

  • python -m pytest packages/markitdown/tests/test_module_misc.py -q -k "underline or docx_comments or docx_equations or input_as_strings"
  • python -m py_compile packages/markitdown/src/markitdown/converters/_markdownify.py packages/markitdown/src/markitdown/converters/_docx_converter.py packages/markitdown/tests/test_module_misc.py
  • git diff --check


Development

Successfully merging this pull request may close these issues.

Underline not preserved
