fix(msword): use outlineLvl for heading levels and clamp to minimum 1 #2916
+149
−8
Summary
Documents with custom styles like "Heading 0" cause a validation error because
`SectionHeaderItem` requires `level >= 1`. This PR fixes heading level extraction to use the authoritative OOXML `outlineLvl` property and adds defense-in-depth clamping.

Changes
- `_get_outline_level_from_style()`: extracts `outlineLvl` from the style definition. OOXML `outlineLvl` is 0-indexed (0-8 for levels 1-9), so it is converted to a 1-indexed heading level (`outlineLvl + 1`).
- `_get_label_and_level`: first tries to get the level from `outlineLvl` (the authoritative source), then falls back to parsing it from the style name.
- `_get_heading_and_level`: clamps the extracted level to a minimum of 1 for defense in depth (handles custom styles like "Heading 0").
- `_add_heading`: applies the same clamp once more before using the level.

Root Cause
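To make the mismatch concrete, here is a minimal standalone sketch of the lookup order described in the changes. It is not docling's actual code: the helper names `level_from_style_name`, `level_from_outline_lvl`, and `resolve_level` are hypothetical, the style is reduced to a name plus an optional `outlineLvl` integer, and the default of 1 for styles with no usable information is an assumption.

```python
import re
from typing import Optional


def level_from_style_name(style_name: str) -> Optional[int]:
    """Parse a trailing number from a name such as "Heading 2" (hypothetical helper)."""
    match = re.search(r"(\d+)\s*$", style_name)
    return int(match.group(1)) if match else None


def level_from_outline_lvl(outline_lvl: Optional[int]) -> Optional[int]:
    """Convert a 0-indexed outlineLvl (0-8) to a 1-indexed heading level (1-9)."""
    if outline_lvl is None or not 0 <= outline_lvl <= 8:
        return None
    return outline_lvl + 1


def resolve_level(style_name: str, outline_lvl: Optional[int]) -> int:
    """Prefer outlineLvl, fall back to the style name, then clamp to at least 1."""
    level = level_from_outline_lvl(outline_lvl)
    if level is None:
        level = level_from_style_name(style_name)
    if level is None:
        level = 1  # assumption: default when neither source yields a level
    return max(1, level)


# Name parsing alone yields the invalid level 0 for "Heading 0".
print(level_from_style_name("Heading 0"))
# With outlineLvl = 0 available, the same style resolves to a valid level.
print(resolve_level("Heading 0", 0))
```

The final `max(1, level)` mirrors the clamping idea: even if a style carries no `outlineLvl` and its name parses to 0, the result stays within what `SectionHeaderItem` accepts.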
The error occurred when processing a Word document with a custom style named "Heading 0".
The style had `outlineLvl w:val="0"`, which in OOXML correctly indicates a top-level heading (equivalent to Heading 1). However, docling was parsing the level from the style name ("Heading 0" → level 0) rather than using the `outlineLvl` property.

Test plan
- `_get_heading_and_level` edge cases (Heading 0, Heading 1, etc.)
- `_get_outline_level_from_style`, to verify correct extraction and conversion

🤖 Generated with Claude Code
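For readers unfamiliar with the underlying markup, the extraction and conversion that the test plan covers can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the PR's implementation: `outline_level_from_style_xml` is a hypothetical function, and `STYLE_XML` is a hand-written fragment modelled on a "Heading 0" style rather than taken from the failing document.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# WordprocessingML main namespace used by styles.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written example style (an assumption, not the original document's XML).
STYLE_XML = f"""
<w:style xmlns:w="{W_NS}" w:type="paragraph" w:styleId="Heading0">
  <w:name w:val="Heading 0"/>
  <w:pPr>
    <w:outlineLvl w:val="0"/>
  </w:pPr>
</w:style>
"""


def outline_level_from_style_xml(style_xml: str) -> Optional[int]:
    """Return a 1-indexed heading level from a style's outlineLvl, or None."""
    root = ET.fromstring(style_xml.strip())
    node = root.find(f"{{{W_NS}}}pPr/{{{W_NS}}}outlineLvl")
    if node is None:
        return None
    raw = node.get(f"{{{W_NS}}}val")
    if raw is None or not raw.isdigit():
        return None
    value = int(raw)
    # outlineLvl 0-8 maps to heading levels 1-9; anything else is ignored.
    return value + 1 if 0 <= value <= 8 else None


print(outline_level_from_style_xml(STYLE_XML))
```

A style with no `outlineLvl` element returns `None` here, which is the point at which a caller would fall back to the style name.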