DRAFT fix: render multiple paragraphs in list items (issue #145) #148
base: develop
Fixes issue where only the first paragraph in a list item was rendered to DOCX when multiple <p> tags were present inside an <li> element. According to HTML specification, list items can contain any Flow Content, including multiple paragraphs.

This fix properly handles such cases by:
1. Adding extractParagraphNodes() helper function to recursively extract all paragraph-like elements from list items, including those nested in div containers
2. Modifying buildList() to detect when a list item contains multiple paragraph nodes and process each as a separate paragraph in the output
3. Preserving property inheritance from parent list item to child paragraphs

Changes:
- src/helpers/render-document-file.js: Added paragraph extraction logic
- tests/list-multiple-paragraphs.test.js: Comprehensive test suite with 13 passing tests covering basic cases, styling, regression scenarios

Test results:
- All 342 existing tests pass (no regressions)
- Main issue #145 case verified: both paragraphs now render correctly
- 3 edge case tests skipped for future work (see test comments)

Closes #145

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
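The recursive extraction described in step 1 of this commit can be sketched roughly as follows. This is only an illustration: the node shape `{ tagName, children }` is an assumption made for the example, and the PR's actual helper operates on the library's virtual DOM nodes, whose structure is not shown on this page.

```javascript
// Hypothetical node shape: { tagName: string, children: Node[] }.
// Text nodes are modelled as objects without a tagName.
const PARAGRAPH_TAGS = new Set(['p']);
const CONTAINER_TAGS = new Set(['div']);

function extractParagraphNodes(node) {
  const found = [];
  const children = Array.isArray(node && node.children) ? node.children : [];
  children.forEach((child) => {
    if (!child || typeof child.tagName !== 'string') return; // skip text nodes
    const tag = child.tagName.toLowerCase();
    if (PARAGRAPH_TAGS.has(tag)) {
      found.push(child);
    } else if (CONTAINER_TAGS.has(tag)) {
      // Recurse into wrappers so <li><div><p>..</p></div></li> is covered too.
      found.push(...extractParagraphNodes(child));
    }
  });
  return found;
}

// <li><p>Paragraph 1</p><div><p>Paragraph 2</p></div></li>
const li = {
  tagName: 'li',
  children: [
    { tagName: 'p', children: [{ text: 'Paragraph 1' }] },
    { tagName: 'div', children: [{ tagName: 'p', children: [{ text: 'Paragraph 2' }] }] },
  ],
};
const paragraphs = extractParagraphNodes(li);
console.log(paragraphs.length); // 2
```

Document order is preserved because children are visited left to right and nested results are spliced in at the position of their container.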
Extends the initial fix for issue #145 to handle all edge cases:

1. **Multiple sequential <li> elements with multiple <p> each**
   - Fixed merge logic that was incorrectly combining list items
   - Each <li> now processes independently
2. **Nested lists mixed with multiple paragraphs**
   - New separateListItemContent() helper properly extracts:
     * Paragraph nodes
     * Nested lists (ul/ol)
     * Other inline content
   - Nested lists are added back to processing queue at correct level
3. **Continuation paragraphs (OOXML-compliant)**
   - First paragraph in list item gets bullet/number
   - Subsequent paragraphs get indentation WITHOUT numbering
   - Matches Microsoft Word's native behavior
   - Implemented via buildListContinuationIndent() helper

Changes:
- src/helpers/render-document-file.js:
  * Replaced extractParagraphNodes() with separateListItemContent()
  * Added continuation paragraph support with isContinuation flag
  * Fixed merge logic to not combine list items (line 318)
  * Pass continuation flags to paragraph builder
- src/helpers/xml-builder.js:
  * Added buildListContinuationIndent() helper
  * Modified numbering case to handle continuation paragraphs
  * Continuation paragraphs get <w:ind> instead of <w:numPr>
- tests/list-multiple-paragraphs.test.js:
  * Un-skipped all edge case tests
  * Added 2 new tests for continuation paragraph behavior
  * All 18 tests now passing

Test Results:
- 347/347 tests pass (no regressions)
- Complex scenarios verified:
  * Multiple <li> with multiple <p> each
  * Nested lists with multiple paragraphs
  * Mixed content (p + nested list + p)
  * Proper numbering/indentation throughout

OOXML Compliance:
- Follows Microsoft Word's standard for multi-paragraph list items
- First paragraph: <w:numPr> with bullet/number
- Continuation paragraphs: <w:ind> for indentation only
- Proper level tracking for nested lists

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
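One way to picture the queueing behaviour this commit describes (first paragraph numbered, later ones flagged as continuations, nested lists re-queued one level deeper) is the sketch below. The function name `queueListItem`, the node shape, and the entry shape `{ node, level, isContinuation }` are assumptions for illustration and are not taken from the PR's code.

```javascript
// Hypothetical shapes: nodes are { tagName, children }, queue entries are
// { node, level, isContinuation }.
const LIST_TAGS = new Set(['ul', 'ol']);

function queueListItem(li, level) {
  const entries = [];
  let numbered = false; // true once one block in this <li> has taken the bullet/number
  (li.children || []).forEach((child) => {
    if (LIST_TAGS.has(child.tagName)) {
      // Nested list: each of its items is queued one level deeper,
      // and each nested <li> tracks its own first paragraph.
      (child.children || []).forEach((nestedLi) => {
        entries.push(...queueListItem(nestedLi, level + 1));
      });
    } else {
      entries.push({ node: child, level, isContinuation: numbered });
      numbered = true;
    }
  });
  return entries;
}

// <li><p>A</p><ul><li><p>nested</p></li></ul><p>B</p></li>
const item = {
  tagName: 'li',
  children: [
    { tagName: 'p', children: [] },
    { tagName: 'ul', children: [{ tagName: 'li', children: [{ tagName: 'p', children: [] }] }] },
    { tagName: 'p', children: [] },
  ],
};
const summary = queueListItem(item, 0).map((e) => `${e.level}:${e.isContinuation}`);
console.log(summary); // [ '0:false', '1:false', '0:true' ]
```

Keeping the `numbered` state local to each call means sibling list items never share it, which is the property the "each <li> now processes independently" fix is after.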
Added comprehensive examples to demonstrate multiple paragraphs in list items:
- example/example-node.js: Added complex nested list example
- example/example.js: Added complex nested list example
- example/react-example/src/App.js: Added complex nested list example

Example shows:
- Multiple paragraphs within single list item
- Continuation paragraphs (indented without bullets)
- Nested lists combined with multiple paragraphs
- Proper OOXML formatting throughout

When opened in Word, demonstrates:
✓ First paragraphs have bullets
✓ Continuation paragraphs indented without bullets
✓ Nested lists at correct levels
✓ All text preserved

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Extended `separateListItemContent()` to recognize and handle all block-level HTML elements within list items, not just paragraphs. This allows headings, blockquotes, pre/code blocks, and other block elements to be properly rendered.

Changes:
- Renamed `paragraphs` to `blockElements` for semantic accuracy
- Added comprehensive block-level tag list (h1-h6, blockquote, pre, code, etc.)
- Updated caller code to process block elements with continuation support
- Added 9 new tests covering headings, blockquotes, pre/code, and mixed content
- All tests pass (356/356) with no regressions
- Fixed ESLint violations (no-restricted-syntax, no-lonely-if)

Block elements in list items now support:
- Headings (h1-h6)
- Blockquotes (single and multi-paragraph)
- Pre/code blocks
- Tables, horizontal rules, definition lists
- Mixed sequences with proper continuation indenting

Note: Multi-line content in pre/code blocks may not preserve newlines due to existing html-to-docx rendering limitations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
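The categorization this commit describes could look something like the sketch below. Only the names `separateListItemContent` and `blockElements` appear in the commit message. The other bucket names (`nestedLists`, `otherContent`), the exact tag list, and the node shape are assumptions made for this example.

```javascript
// Illustrative tag list based on the elements named in the commit message;
// the PR's real list may differ.
const BLOCK_TAGS = new Set([
  'p', 'div', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
  'blockquote', 'pre', 'table', 'hr', 'dl',
]);
const LIST_TAGS = new Set(['ul', 'ol']);

// Splits the direct children of a list item into three buckets.
function separateListItemContent(li) {
  const result = { blockElements: [], nestedLists: [], otherContent: [] };
  const children = Array.isArray(li && li.children) ? li.children : [];
  children.forEach((child) => {
    const tag = child && typeof child.tagName === 'string' ? child.tagName.toLowerCase() : '';
    if (LIST_TAGS.has(tag)) {
      result.nestedLists.push(child);
    } else if (BLOCK_TAGS.has(tag)) {
      result.blockElements.push(child);
    } else {
      result.otherContent.push(child); // text nodes, <span>, <strong>, ...
    }
  });
  return result;
}

// <li><h2>..</h2><blockquote>..</blockquote><ul>..</ul>text</li>
const parts = separateListItemContent({
  tagName: 'li',
  children: [
    { tagName: 'h2', children: [] },
    { tagName: 'blockquote', children: [] },
    { tagName: 'ul', children: [] },
    { text: 'trailing text' },
  ],
});
console.log(parts.blockElements.length, parts.nestedLists.length, parts.otherContent.length); // 2 1 1
```

Using a `Set` for the tag lists keeps the lookup constant-time and makes it easy to extend the list of supported block elements in one place.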
// Properties object contains CSS-style properties that should be inherited (e.g., alignment, fonts)
// This enables proper formatting when content is injected into existing document structure
for (const child of vTree) {
vTree.forEach((child) => {
need to check to make sure this is iterable -- defensive programming
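A guard of the kind requested here might look like the following. It is a sketch only: the helper name `toNodeArray` is hypothetical, and treating a single non-iterable node as a one-element array is a design assumption rather than something the PR specifies.

```javascript
// Normalizes whatever the caller passes as vTree into a plain array so that
// .forEach() is always safe to call.
function toNodeArray(vTree) {
  if (vTree == null) return []; // null or undefined: nothing to render
  if (Array.isArray(vTree)) return vTree;
  const iterable = typeof vTree !== 'string' && typeof vTree[Symbol.iterator] === 'function';
  // Other iterables (Set, NodeList-like objects) are copied; a lone node is wrapped.
  return iterable ? Array.from(vTree) : [vTree];
}

const visited = [];
toNodeArray(null).forEach((child) => visited.push(child));
toNodeArray({ tagName: 'p' }).forEach((child) => visited.push(child.tagName));
toNodeArray(new Set(['a', 'b'])).forEach((child) => visited.push(child));
console.log(visited); // [ 'p', 'a', 'b' ]
```

Strings are excluded from the iterable branch on purpose, since iterating a string would yield individual characters rather than nodes.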
tempVNodeObject.node,
{
numbering: { levelId: tempVNodeObject.level, numberingId: tempVNodeObject.numberingId },
isContinuation: tempVNodeObject.isContinuation || false,
add comments explaining what this means for the next person
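As a rough illustration of what the flag means downstream, based on the commit messages in this PR: a paragraph with `isContinuation` unset carries `<w:numPr>` and so receives the bullet or number, while a continuation paragraph carries only `<w:ind>` and is indented without a marker. The function name `paragraphPropertiesXml`, the 720-twip indent step, and the string-based XML building are all assumptions for this sketch. The PR's own helper, `buildListContinuationIndent()`, is not shown on this page.

```javascript
// Assumed indent step: 720 twips (half an inch) per list level.
const INDENT_STEP_TWIPS = 720;

function paragraphPropertiesXml({ levelId, numberingId, isContinuation = false }) {
  if (isContinuation) {
    // Continuation paragraph: align with the list item's text, but emit no
    // numbering reference, so Word draws no bullet or number.
    const left = INDENT_STEP_TWIPS * (levelId + 1);
    return `<w:pPr><w:ind w:left="${left}"/></w:pPr>`;
  }
  // First paragraph of the list item: reference the numbering definition
  // (w:numId) at the given nesting level (w:ilvl).
  return `<w:pPr><w:numPr><w:ilvl w:val="${levelId}"/><w:numId w:val="${numberingId}"/></w:numPr></w:pPr>`;
}

const first = paragraphPropertiesXml({ levelId: 0, numberingId: 3 });
const continuation = paragraphPropertiesXml({ levelId: 0, numberingId: 3, isContinuation: true });
console.log(first);
console.log(continuation);
```

Defaulting `isContinuation` to `false` mirrors the `|| false` in the diff above: any caller that does not set the flag keeps the existing numbered behaviour.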
numberingId: tempVNodeObject.numberingId,
});
// FIX for Issue #145: Handle multiple block elements in list items
// Separate content into block elements, nested lists, and other content
explain the method here more
Summary
Fixes #145 - Renders multiple paragraphs within list items correctly, matching Microsoft Word's behavior.
Problem
When a list item (<li>) contained multiple paragraph (<p>) elements, only the first paragraph was rendered in the DOCX output. According to the HTML specification, list items can contain any Flow Content, including multiple paragraphs.
Before
Result: Only "Paragraph 1" appeared in DOCX
After
Result: Both paragraphs appear correctly, with proper OOXML formatting
Solution
Implemented comprehensive support for multiple paragraphs in list items with full OOXML compliance:
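1. Basic Multiple Paragraph Support
- Renders all <p> tags within a list item
2. Continuation Paragraph Formatting (OOXML-Compliant)
Following Microsoft Word's standard behavior:
- First paragraph: gets the bullet/number (<w:numPr>)
- Continuation paragraphs: indentation without numbering (<w:ind> only)
3. Edge Cases Handled
- Multiple sequential <li> elements, each with multiple <p> tags
- Mixed content: <li><p>...</p><ul>...</ul><p>...</p></li>
- Paragraphs nested in <div> elements within list items
Changes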
Core Implementation
src/helpers/render-document-file.js
- New separateListItemContent() helper to categorize list item children:
  - Paragraph nodes (<p>)
  - Nested lists (<ul>, <ol>)
- Continuation paragraph support with the isContinuation flag
src/helpers/xml-builder.js
- New buildListContinuationIndent() helper for proper OOXML indentation
- Continuation paragraphs get <w:ind> instead of <w:numPr>
Tests
tests/list-multiple-paragraphs.test.js - Comprehensive test suite with 18 tests:
Test Results
Example Output
Complex HTML:
Renders as 7 paragraphs:
Matches Microsoft Word's native behavior perfectly!
OOXML Compliance
This implementation follows the Office Open XML WordprocessingML specification:
- First paragraph: <w:numPr> with <w:ilvl> and <w:numId>
- Continuation paragraphs: <w:ind> for indentation without numbering
Breaking Changes
None - all existing functionality preserved with 100% backward compatibility.
Checklist
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com