A Node.js CLI tool for applying tracked changes and comments to DOCX files using SuperDoc in headless mode.
Designed for use by AI agents (Claude, GPT, etc.) in IDE environments like Cursor, VS Code, or Claude Code.
Sped-up 30s demo of multi-agentic contract review of a 100+ page contract using Claude Code, this library and the CONTRACT-REVIEW-AGENTIC-SKILL.md workflow — click to see the full-length video.
For AI Agents: See SKILL.md for the concise task-oriented guide with decision flows, constraints, and expected outputs.
- Structured IR - Extract stable block IDs from any DOCX document
- ID-Based Edits - Deterministic edits that don't depend on fragile text matching
- Auto-Chunking - Handles documents of any size with token-aware chunking
- Multi-Agent Support - Merge edits from parallel sub-agents with conflict resolution
- Track Changes - Word-level diff produces minimal, reviewable changes
- Comments - Attach comments to any block for review
- Text-Span Annotations - Anchor comments and highlights to specific text within blocks (sub-block granularity)
git clone https://github.com/yuch85/superdoc-redlines
cd superdoc-redlines
npm install
# 1. Extract document structure (get block IDs)
node superdoc-redline.mjs extract --input contract.docx
# 2. Read document (for LLM analysis)
node superdoc-redline.mjs read --input contract.docx
# 3. Create edits.json referencing block IDs
# 4. Validate edits (optional but recommended)
node superdoc-redline.mjs validate --input contract.docx --edits edits.json
# 5. Apply edits with track changes
node superdoc-redline.mjs apply --input contract.docx --output redlined.docx --edits edits.json
Extract structured intermediate representation (IR) from a DOCX file.
node superdoc-redline.mjs extract --input doc.docx --output ir.json
node superdoc-redline.mjs extract -i doc.docx -o ir.json --max-text 100
node superdoc-redline.mjs extract -i doc.docx --no-defined-terms
Options:
| Option | Description |
|---|---|
| `-i, --input <path>` | Input DOCX file (required) |
| `-o, --output <path>` | Output JSON file (default: `<input>-ir.json`) |
| `-f, --format <type>` | Output format: `full\|outline\|blocks` (default: `full`) |
| `--no-defined-terms` | Exclude defined terms extraction |
| `--max-text <length>` | Truncate block text to length |
Read document for LLM consumption with automatic chunking.
node superdoc-redline.mjs read --input doc.docx
node superdoc-redline.mjs read -i doc.docx --chunk 1 --max-tokens 50000
node superdoc-redline.mjs read -i doc.docx --stats-only
node superdoc-redline.mjs read -i doc.docx -f outline
Options:
| Option | Description |
|---|---|
| `-i, --input <path>` | Input DOCX file (required) |
| `-c, --chunk <index>` | Specific chunk index (0-indexed) |
| `--max-tokens <count>` | Max tokens per chunk (default: 100000) |
| `-f, --format <type>` | Output format: `full\|outline\|summary` (default: `full`) |
| `--stats-only` | Only show document statistics |
| `--no-metadata` | Exclude block IDs and positions |
Output (JSON to stdout):
{
"success": true,
"totalChunks": 1,
"currentChunk": 0,
"hasMore": false,
"nextChunkCommand": null,
"document": {
"metadata": { "filename": "doc.docx", "chunkIndex": 0, "totalChunks": 1 },
"outline": [...],
"blocks": [...],
"idMapping": { "uuid-here": "b001" }
}
}
Validate edit instructions against a document.
node superdoc-redline.mjs validate --input doc.docx --edits edits.json
Options:
| Option | Description |
|---|---|
| `-i, --input <path>` | Input DOCX file (required) |
| `-e, --edits <path>` | Edits JSON file (required) |
Exit code: 0 if valid, 1 if issues found.
Apply ID-based edits to a document with track changes.
node superdoc-redline.mjs apply --input doc.docx --output redlined.docx --edits edits.json
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --author-name "Reviewer"
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --no-track-changes
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --verbose # Debug position mapping
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --strict # Fail on truncation warnings
Options:
| Option | Description |
|---|---|
| `-i, --input <path>` | Input DOCX file (required) |
| `-o, --output <path>` | Output DOCX file (required) |
| `-e, --edits <path>` | Edits JSON file (required) |
| `--author-name <name>` | Author name for track changes (default: "AI Assistant") |
| `--author-email <email>` | Author email (default: "ai@example.com") |
| `--no-track-changes` | Disable track changes mode |
| `--no-validate` | Skip validation before applying |
| `--no-sort` | Skip automatic edit sorting |
| `-v, --verbose` | Enable verbose logging for debugging position mapping |
| `--strict` | Treat truncation/corruption warnings as errors |
| `--skip-invalid` | Skip invalid edits instead of failing |
| `-q, --quiet-warnings` | Suppress content reduction warnings |
| `--allow-reduction` | Allow intentional content reduction without warnings (for jurisdiction conversions) |
Content reduction: A replace edit whose newText is significantly shorter than the original block. This can be intentional (simplification) or a sign of truncation or corruption. Do not pass --allow-reduction for normal proofreading or minor edits, where text length should stay similar.
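As a rough illustration of the idea, a content-reduction check can be thought of as a length ratio between the new and original text. The helper name `isContentReduction` and the 0.5 threshold are assumptions made for this sketch; the CLI's actual heuristic is not documented here and may differ.

```javascript
// Hypothetical helper (not part of the CLI): flags a replace edit whose
// newText is much shorter than the original block text.
// The default threshold of 0.5 is an assumption chosen for illustration.
function isContentReduction(originalText, newText, threshold = 0.5) {
  if (originalText.length === 0) return false; // nothing to reduce
  return newText.length / originalText.length < threshold;
}

// A truncated replacement is flagged; a small wording change is not.
console.log(isContentReduction('The Seller shall indemnify the Buyer in full.', 'The Seller'));
console.log(isContentReduction('Business Day means a weekday.', 'Business Day means any weekday.'));
```

A ratio check like this cannot tell deliberate simplification from truncation, which is why the decision to suppress the warning is left to the caller.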
Validation:
The apply command automatically validates edits before applying, including:
- Block ID existence checks
- Required field validation (newText for replace, etc.)
- newText truncation detection - Warns if newText appears truncated or corrupted
- Content corruption detection - Detects patterns that suggest LLM output issues
Use --strict to fail the apply if any warnings are detected.
Merge edit files from multiple sub-agents.
node superdoc-redline.mjs merge edits-a.json edits-b.json -o merged.json
node superdoc-redline.mjs merge *.json -o merged.json --conflict first
node superdoc-redline.mjs merge edits-*.json -o merged.json -v doc.docx
Options:
| Option | Description |
|---|---|
| `<files...>` | Edit files to merge (required, positional) |
| `-o, --output <path>` | Output merged edits file (required) |
| `-c, --conflict <strategy>` | Conflict strategy: `error\|first\|last\|combine` (default: `error`) |
| `-n, --normalize` | Normalize field names from common variants (type→operation, etc.) |
| `-v, --validate <docx>` | Validate merged edits against document |
Conflict Strategies:
- `error` - Fail if any conflicts (safe default)
- `first` - Keep first edit (by file order)
- `last` - Keep last edit (by file order)
- `combine` - Merge comments; use `first` for other operations
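The four strategies can be sketched as a small in-memory merge. This is an illustration under stated assumptions, not the tool's implementation: the function name `mergeEdits` is hypothetical, a conflict is simplified to "two edits with the same `blockId`", and `combine` is interpreted as joining the text of two `comment` edits.

```javascript
// Illustrative merge of several edit lists (one per agent), in file order.
// Assumption: edits conflict when they share a blockId; insert edits are
// keyed separately by their afterBlockId.
function mergeEdits(editLists, strategy = 'error') {
  if (!['error', 'first', 'last', 'combine'].includes(strategy)) {
    throw new Error(`Unknown conflict strategy: ${strategy}`);
  }
  const merged = new Map();
  for (const edits of editLists) {
    for (const edit of edits) {
      const key = edit.blockId ?? `after:${edit.afterBlockId}`;
      const existing = merged.get(key);
      if (!existing) {
        merged.set(key, { ...edit }); // copy so inputs are never mutated
      } else if (strategy === 'error') {
        throw new Error(`Conflicting edits for ${key}`);
      } else if (strategy === 'last') {
        merged.set(key, { ...edit });
      } else if (strategy === 'combine' &&
                 existing.operation === 'comment' && edit.operation === 'comment') {
        existing.comment = `${existing.comment}\n${edit.comment}`;
      }
      // 'first', and 'combine' for non-comment edits, keep the existing edit.
    }
  }
  return [...merged.values()];
}

const agentA = [{ blockId: 'b020', operation: 'comment', comment: 'Needs legal review' }];
const agentB = [{ blockId: 'b020', operation: 'comment', comment: 'Check defined term' }];
console.log(mergeEdits([agentA, agentB], 'combine'));
```

Keeping `error` as the default means overlapping agent assignments surface immediately instead of silently dropping one agent's work.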
Convert markdown edits to JSON format. This enables a more resilient edit format that is easier for LLMs to generate without syntax errors.
node superdoc-redline.mjs parse-edits -i edits.md -o edits.json
node superdoc-redline.mjs parse-edits -i edits.md -o edits.json --validate doc.docx
Options:
| Option | Description |
|---|---|
| `-i, --input <file>` | Input markdown file (.md) (required) |
| `-o, --output <file>` | Output JSON file (.json) (required) |
| `--validate <docx>` | Validate block IDs against document |
Convert JSON edits to markdown format for human review or editing.
node superdoc-redline.mjs to-markdown -i edits.json -o edits.md
Options:
| Option | Description |
|---|---|
| `-i, --input <file>` | Input JSON file (.json) (required) |
| `-o, --output <file>` | Output markdown file (.md) (required) |
Find blocks by text content. Useful for locating block IDs when creating edits.
node superdoc-redline.mjs find-block --input contract.docx --text "VAT"
node superdoc-redline.mjs find-block --input contract-ir.json --regex "VAT|HMRC"
node superdoc-redline.mjs find-block -i contract.docx -t "Business Day" -c 100
node superdoc-redline.mjs find-block -i contract.docx -t "VAT" --limit 100
node superdoc-redline.mjs find-block -i contract.docx -t "VAT" --limit all
Options:
| Option | Description |
|---|---|
| `-i, --input <path>` | Input DOCX file or IR JSON file (required) |
| `-t, --text <text>` | Text to search for (case-insensitive) |
| `-r, --regex <pattern>` | Regex pattern to search for |
| `-c, --context <chars>` | Context characters to show around match (default: 50) |
| `-l, --limit <n>` | Maximum results (default: 20, use "all" for unlimited) |
Note: The default limit of 20 results may miss matches for high-frequency terms, and therefore blocks that need edits. For comprehensive discovery, use --limit all.
Example output:
Found 3 block(s):
[b051] (paragraph)
ID: c9742b66-c68b-4a5b-92aa-b742b141425e
Preview: ...Capital allowances923.Stamp duty and SDLT924.VAT925.Inheritance tax936.[Distraint by HMRC]93Schedu...
[b129] (heading)
ID: 603fbb12-bb15-428c-985a-336afff6b4a5
Preview: 4.VAT92
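For programmatic discovery over an already extracted IR, a search with the same options can be approximated in a few lines. This is a sketch, not the command's implementation: `findBlocks` is a hypothetical name, and only the `seqId` and `text` fields of each block are assumed.

```javascript
// Case-insensitive substring search over IR blocks, with a context window
// around the first match in each block and an optional result limit.
function findBlocks(blocks, query, { context = 50, limit = 20 } = {}) {
  const needle = query.toLowerCase();
  const results = [];
  for (const block of blocks) {
    const at = block.text.toLowerCase().indexOf(needle);
    if (at === -1) continue;
    const start = Math.max(0, at - context);
    const end = Math.min(block.text.length, at + query.length + context);
    results.push({ seqId: block.seqId, preview: block.text.slice(start, end) });
    if (limit !== 'all' && results.length >= limit) break; // remaining matches are dropped
  }
  return results;
}

const sampleBlocks = [
  { seqId: 'b051', text: 'Stamp duty and SDLT. VAT. Inheritance tax.' },
  { seqId: 'b129', text: '4. VAT' },
  { seqId: 'b130', text: 'Capital allowances' },
];
console.log(findBlocks(sampleBlocks, 'vat', { limit: 'all' }));
```

The early `break` shows why a numeric limit can hide later matches in a long contract.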
Recompress a DOCX file to reduce file size. SuperDoc writes uncompressed DOCX files (~6x larger than normal).
node superdoc-redline.mjs recompress --input bloated.docx
node superdoc-redline.mjs recompress -i bloated.docx -o clean.docx
Options:
| Option | Description |
|---|---|
| `-i, --input <path>` | Input DOCX file (required) |
| `-o, --output <path>` | Output DOCX file (default: overwrites input) |
Example output:
Recompressing: bloated.docx
Original size: 2560.0 KB
Compressed size: 384.0 KB
Reduction: 85%
Output: clean.docx
Note: Requires archiver and unzipper packages. Install with: npm install archiver unzipper
AI Agents: See SKILL.md for JSON Schema, common mistakes, and critical constraints.
{
"version": "0.3.0",
"author": {
"name": "AI Counsel",
"email": "ai@firm.com"
},
"edits": [
{
"blockId": "b001",
"operation": "replace",
"newText": "Modified clause text",
"comment": "Optional comment",
"diff": true
},
{
"blockId": "b015",
"operation": "delete",
"comment": "Removing redundant clause"
},
{
"blockId": "b020",
"operation": "comment",
"comment": "Needs legal review"
},
{
"afterBlockId": "b025",
"operation": "insert",
"text": "New paragraph content",
"type": "paragraph"
}
]
}
These operations anchor to specific text within a block using findText for sub-block granularity:
{
"version": "0.3.0",
"author": { "name": "AI Counsel", "email": "ai@firm.com" },
"edits": [
{
"blockId": "b020",
"operation": "comment",
"findText": "Material Adverse Change",
"comment": "Definition is too broad"
},
{
"blockId": "b035",
"operation": "insertAfterText",
"findText": "reasonable endeavours",
"insertText": " (acting in good faith)"
},
{
"blockId": "b042",
"operation": "highlight",
"findText": "unlimited liability",
"color": "#FF6B6B",
"comment": "Flag for partner review"
},
{
"blockId": "b050",
"operation": "commentRange",
"findText": "governing law shall be England",
"comment": "Should this be Singapore?"
},
{
"blockId": "b060",
"operation": "commentHighlight",
"findText": "entire agreement",
"comment": "Check against side letter",
"color": "#FFEB3B"
}
]
}
| Operation | Required Fields | Optional Fields | Description |
|---|---|---|---|
| `replace` | `blockId`, `newText` | `comment`, `diff` | Replace block content |
| `delete` | `blockId` | `comment` | Delete block entirely |
| `comment` | `blockId`, `comment` | `findText` | Add comment to block (or to specific text if `findText` provided) |
| `insert` | `afterBlockId`, `text` | `type`, `level`, `comment` | Insert new block |
| `insertAfterText` | `blockId`, `findText`, `insertText` | `comment` | Insert text immediately after matched text within a block |
| `highlight` | `blockId`, `findText` | `color`, `comment` | Highlight specific text within a block |
| `commentRange` | `blockId`, `findText`, `comment` | - | Add comment anchored to specific text span |
| `commentHighlight` | `blockId`, `findText`, `comment` | `color` | Highlight text and attach a comment (atomic) |
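The required fields per operation can be encoded as a lookup and checked before an edit file is handed to the CLI. This is a minimal pre-flight sketch; `REQUIRED_FIELDS` and `checkRequiredFields` are hypothetical names, and the real `validate` command performs additional checks (block existence, truncation detection) that are not modelled here.

```javascript
// Required fields per operation, copied from the operations table.
const REQUIRED_FIELDS = {
  replace: ['blockId', 'newText'],
  delete: ['blockId'],
  comment: ['blockId', 'comment'],
  insert: ['afterBlockId', 'text'],
  insertAfterText: ['blockId', 'findText', 'insertText'],
  highlight: ['blockId', 'findText'],
  commentRange: ['blockId', 'findText', 'comment'],
  commentHighlight: ['blockId', 'findText', 'comment'],
};

// Hypothetical pre-flight check: returns a list of human-readable issues,
// or an empty list when every required field is a string.
function checkRequiredFields(edit) {
  const required = REQUIRED_FIELDS[edit.operation];
  if (!required) return [`Unknown operation: ${edit.operation}`];
  return required
    .filter((field) => typeof edit[field] !== 'string')
    .map((field) => `${edit.operation} edit is missing "${field}"`);
}

console.log(checkRequiredFields({ operation: 'replace', blockId: 'b001' }));
console.log(checkRequiredFields({ operation: 'insert', afterBlockId: 'b025', text: 'New paragraph content' }));
```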
| Field | Type | Description |
|---|---|---|
| `blockId` | string | Target block (use seqId like "b001"; UUIDs from extract are session-volatile and will fail in apply/validate) |
| `afterBlockId` | string | Insert position (use seqId like "b001"; UUIDs are not portable across CLI commands) |
| `newText` | string | Replacement text for replace operation |
| `text` | string | New block text for insert operation |
| `comment` | string | Comment text to attach |
| `diff` | boolean | Use word-level diff for minimal changes (default: true) |
| `type` | string | Block type for insert: `paragraph\|heading\|listItem` |
| `level` | number | Heading level for insert (1-6) |
| `findText` | string | Exact text to match within a block (for text-span operations) |
| `insertText` | string | Text to insert after matched text (insertAfterText only) |
| `color` | string | Highlight color as hex string (default: "#FFEB3B") |
- `findText` matching uses exact string matching within the block's text content
- `comment` with `findText` gracefully falls back to a full-block comment if the text is not found
- `commentRange` also falls back to full-block comment on match failure
- Sort order: When multiple text-span operations target the same block, they are applied rightmost-first to prevent position corruption
- Conflict detection: Text-span operations use composite keys (`blockId::operation::findText`) so multiple annotations on different text spans within the same block do not conflict
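The rightmost-first ordering and the composite key can be illustrated on a plain string. This sketch works on text only, not on the SuperDoc editor; `conflictKey` and `applyInsertions` are hypothetical helpers, and using the bare `blockId` as the key for block-level edits is an assumption.

```javascript
// Composite key following the blockId::operation::findText pattern.
// Assumption: edits without findText are keyed by blockId alone.
function conflictKey(edit) {
  return edit.findText
    ? `${edit.blockId}::${edit.operation}::${edit.findText}`
    : edit.blockId;
}

// Apply insertAfterText-style edits to a plain string, rightmost match first,
// so an insertion never shifts the offsets of matches still to be processed.
function applyInsertions(text, insertions) {
  const located = insertions
    .map(({ findText, insertText }) => ({ at: text.indexOf(findText), findText, insertText }))
    .filter(({ at }) => at !== -1) // unmatched findText is skipped
    .sort((a, b) => b.at - a.at);  // rightmost first
  let result = text;
  for (const { at, findText, insertText } of located) {
    const end = at + findText.length;
    result = result.slice(0, end) + insertText + result.slice(end);
  }
  return result;
}

const clause = 'The Seller shall use reasonable endeavours to deliver the Goods.';
console.log(applyInsertions(clause, [
  { findText: 'Seller', insertText: ' (as defined)' },
  { findText: 'reasonable endeavours', insertText: ' (acting in good faith)' },
]));
```

All offsets are computed against the original text; processing them left to right would require adjusting every later offset by the length of each earlier insertion.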
For large edit sets, the markdown format is more reliable than JSON because:
- No syntax errors from missing commas or quotes
- Partial output is still parseable (truncation recovery)
- Lower cognitive load during generation
- Human-readable for review
The markdown format supports two table layouts: a 4-column format for basic operations and a 6-column extended format when text-span operations are used.
4-Column Format (block-level operations only):
# Edits: [Document Name]
## Metadata
- **Version**: 0.3.0
- **Author Name**: AI Legal Counsel
- **Author Email**: ai@counsel.sg
## Edits Table
| Block | Op | Diff | Comment |
|-------|-----|------|---------|
| b257 | delete | - | DELETE TULRCA definition |
| b165 | replace | true | Change Business Day to Singapore |
| b500 | comment | - | Review needed |
| b449 | insert | - | Insert new clause |
## Replacement Text
### b165 newText
Business Day: a day other than a Saturday, Sunday or public holiday in Singapore when banks in Singapore are open for business.
### b449 insertText
The Buyer shall offer employment to each Transferring Employee.
6-Column Format (when using text-span operations):
## Edits Table
| Block | Op | FindText | Color | Diff | Comment |
|-------|-----|----------|-------|------|---------|
| b165 | replace | - | - | true | Change Business Day |
| b020 | comment | Material Adverse Change | - | - | Definition too broad |
| b035 | insertAfterText | reasonable endeavours | - | - | (acting in good faith) |
| b042 | highlight | unlimited liability | #FF6B6B | - | Flag for review |
| b050 | commentRange | governing law | - | - | Should be Singapore? |
| b060 | commentHighlight | entire agreement | #FFEB3B | - | Check side letter |
The format is auto-detected: if any edit uses findText, color, or a text-span operation, the 6-column format is used. For insertAfterText, the Comment column contains the text to insert.
4-Column Format:
| Column | Required | Values | Description |
|---|---|---|---|
| Block | Yes | `b###` | Block ID from document IR |
| Op | Yes | `delete`, `replace`, `comment`, `insert` | Operation type |
| Diff | For replace | `true`, `false`, `-` | Word-level diff mode |
| Comment | No | Free text | Rationale for edit |
6-Column Format (extends 4-column):
| Column | Required | Values | Description |
|---|---|---|---|
| Block | Yes | `b###` | Block ID from document IR |
| Op | Yes | All 8 operation types | Operation type |
| FindText | For text-span ops | Exact text string, `-` | Text to match within block |
| Color | For highlight ops | Hex color string, `-` | Highlight color |
| Diff | For replace | `true`, `false`, `-` | Word-level diff mode |
| Comment | No | Free text | Rationale (or insertText for insertAfterText) |
- `### b### newText` - Replacement text for `replace` operations
- `### b### insertText` - New content for `insert` operations
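To show why a row-per-edit table tolerates truncated output, here is a small parser sketch for the 4-column layout. It is not the `parse-edits` implementation: `parseEditRows` is a hypothetical name, it returns raw row records rather than final edit objects, and it assumes cell text contains no `|` characters.

```javascript
// Parse rows of a 4-column edits table (Block | Op | Diff | Comment).
// Lines that are not complete data rows (header, separator, prose, or a
// row cut off mid-way) are skipped instead of failing the whole parse.
function parseEditRows(markdown) {
  const rows = [];
  for (const line of markdown.split('\n')) {
    const cells = line.trim().replace(/^\||\|$/g, '').split('|').map((c) => c.trim());
    if (cells.length !== 4 || !/^b\d+$/.test(cells[0])) continue;
    const [block, op, diff, comment] = cells;
    rows.push({
      block,
      op,
      diff: diff === 'true' ? true : diff === 'false' ? false : null,
      comment: comment === '-' ? '' : comment,
    });
  }
  return rows;
}

const truncatedOutput = [
  '| Block | Op | Diff | Comment |',
  '|-------|-----|------|---------|',
  '| b257 | delete | - | DELETE TULRCA definition |',
  '| b165 | replace | true | Change Business Day to Singapore |',
  '| b500 | comm', // output cut off here
].join('\n');
console.log(parseEditRows(truncatedOutput));
```

With JSON, the same cut-off would leave an unterminated document that a standard parser rejects outright; here the two complete rows are still recovered.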
# Convert markdown to JSON
node superdoc-redline.mjs parse-edits -i edits.md -o edits.json
# Apply directly from markdown (auto-detects format)
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.md
# Convert existing JSON to markdown for review
node superdoc-redline.mjs to-markdown -i edits.json -o edits.md
Track changes is enabled by default. All edits appear as native Word revisions when you open the output file in Microsoft Word:
- Insertions - Shown as underlined additions
- Deletions - Shown as strikethrough removals
- Changes attributed to the author you specify (default: "AI Assistant")
# Track changes ON by default
node superdoc-redline.mjs apply -i doc.docx -o redlined.docx -e edits.json
node superdoc-redline.mjs apply -i doc.docx -o redlined.docx -e edits.json \
--author-name "Legal Review Bot" \
--author-email "review@firm.com"
For direct edits without revision marks:
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --no-track-changes
By default, replace operations use word-level diff to produce minimal tracked changes. Only the words that actually changed are marked as insertions/deletions, not the entire block.
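To make "only the words that actually changed" concrete, here is a self-contained word-level diff built on a longest-common-subsequence table. It is an illustration only: the tool lists diff-match-patch as its diff dependency, and `wordDiff` is a hypothetical stand-in that does not reproduce that library's output format.

```javascript
// Minimal word-level diff using a longest-common-subsequence (LCS) table.
// Returns a list of { type: 'equal' | 'delete' | 'insert', word } operations.
function wordDiff(oldText, newText) {
  const a = oldText.split(/\s+/).filter(Boolean);
  const b = newText.split(/\s+/).filter(Boolean);
  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  // Walk the table from the start, emitting deletions before insertions.
  const ops = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ type: 'equal', word: a[i] });
      i++; j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ type: 'delete', word: a[i++] });
    } else {
      ops.push({ type: 'insert', word: b[j++] });
    }
  }
  while (i < a.length) ops.push({ type: 'delete', word: a[i++] });
  while (j < b.length) ops.push({ type: 'insert', word: b[j++] });
  return ops;
}

const changes = wordDiff('banks in London are open', 'banks in Singapore are open')
  .filter((op) => op.type !== 'equal');
console.log(changes);
```

In a tracked-changes document, only the `delete` and `insert` operations would become revision marks, which is what keeps a one-word change from redlining the whole clause.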
To replace the entire block content (useful for complete rewrites):
{
"blockId": "b025",
"operation": "replace",
"newText": "Completely new text here",
"diff": false
}
The intermediate representation (IR) extracted from a document:
{
"metadata": {
"version": "0.2.0",
"filename": "contract.docx",
"format": "full",
"extractedAt": "2026-02-04T10:00:00.000Z"
},
"outline": [
{
"title": "1. Definitions",
"level": 1,
"blockId": "uuid-here",
"seqId": "b001",
"children": []
}
],
"blocks": [
{
"id": "uuid-here",
"seqId": "b001",
"type": "heading",
"level": 1,
"text": "1. Definitions",
"startPos": 0,
"endPos": 15
}
],
"idMapping": {
"uuid-here": "b001"
},
"definedTerms": {
"Agreement": { "blockId": "b001", "text": "..." }
}
}
Each block has two IDs:
| ID Type | Format | Purpose |
|---|---|---|
| UUID | `550e8400-e29b-41d4-a716-446655440000` | SuperDoc internal ID, regenerated on each document load; not portable across CLI commands |
| seqId | `b001`, `b002`, `b003` | Stable sequential ID for LLM reference; use this in edit files |
Always use seqId in edit files. UUIDs are session-volatile — they are regenerated on each document load and will fail when used across separate CLI invocations (e.g., extract → apply). UUIDs are accepted for backward compatibility within a single programmatic session, but this is deprecated and will emit a warning.
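Because seqIds are derived from document order, the same document always yields the same IDs no matter which UUIDs a particular load produced. A minimal sketch of that derivation follows; `assignSeqIds` is a hypothetical name, and the three-digit zero padding is inferred from the `b001` examples (how the tool numbers blocks beyond `b999` is not stated here).

```javascript
// Assign sequential IDs (b001, b002, ...) by document order and build a
// UUID -> seqId mapping like the idMapping field in the IR.
function assignSeqIds(blocks) {
  const idMapping = {};
  const withSeqIds = blocks.map((block, index) => {
    const seqId = `b${String(index + 1).padStart(3, '0')}`;
    idMapping[block.id] = seqId;
    return { ...block, seqId };
  });
  return { blocks: withSeqIds, idMapping };
}

// Two loads of the same document: different UUIDs, identical seqIds.
const firstLoad = assignSeqIds([{ id: 'uuid-a1', text: '1. Definitions' }, { id: 'uuid-a2', text: 'Agreement means...' }]);
const secondLoad = assignSeqIds([{ id: 'uuid-b1', text: '1. Definitions' }, { id: 'uuid-b2', text: 'Agreement means...' }]);
console.log(firstLoad.blocks.map((b) => b.seqId), secondLoad.blocks.map((b) => b.seqId));
```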
Large documents are automatically split into chunks for LLM context limits.
- Token estimation: ~4 characters per token
- Default chunk size: 100,000 tokens
- Chunks break at heading boundaries when possible
- Every chunk includes the full document outline for context
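The rules above can be sketched as a greedy packer that prefers to close a chunk at a heading. This is an approximation under stated assumptions, not the tool's chunking algorithm: `estimateTokens` and `chunkBlocks` are hypothetical names, blocks are assumed to carry `type` and `text`, and the outline that accompanies every chunk is not modelled.

```javascript
// ~4 characters per token, as stated in the list above.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Greedy chunking: group blocks into sections that start at a heading, then
// pack sections into chunks. A chunk is closed at a section boundary when the
// next section would not fit, and mid-section only when a section alone
// exceeds the budget.
function chunkBlocks(blocks, maxTokens = 100000) {
  const sections = [];
  for (const block of blocks) {
    if (block.type === 'heading' || sections.length === 0) sections.push([]);
    sections[sections.length - 1].push(block);
  }
  const chunks = [];
  let current = [];
  let used = 0;
  const flush = () => {
    if (current.length > 0) {
      chunks.push(current);
      current = [];
      used = 0;
    }
  };
  for (const section of sections) {
    const sectionTokens = section.reduce((sum, b) => sum + estimateTokens(b.text), 0);
    if (used + sectionTokens > maxTokens) flush(); // break at the heading
    for (const block of section) {
      const cost = estimateTokens(block.text);
      if (used + cost > maxTokens) flush(); // oversized section: break mid-section
      current.push(block);
      used += cost;
    }
  }
  flush();
  return chunks;
}

const demoBlocks = [
  { type: 'heading', text: '1. Definitions' },
  { type: 'paragraph', text: 'x'.repeat(400) },
  { type: 'heading', text: '2. Warranties' },
  { type: 'paragraph', text: 'y'.repeat(400) },
];
console.log(chunkBlocks(demoBlocks, 120).map((chunk) => chunk.map((b) => b.type)));
```

A single block larger than the budget still ends up in a chunk of its own, so no content is dropped.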
# Check document size
node superdoc-redline.mjs read --input large.docx --stats-only
# Read first chunk (includes navigation info)
node superdoc-redline.mjs read --input large.docx --chunk 0
# Output includes nextChunkCommand for sequential reading
{
"success": true,
"totalChunks": 3,
"currentChunk": 0,
"hasMore": true,
"nextChunkCommand": "node superdoc-redline.mjs read --input \"large.docx\" --chunk 1",
"document": {
"metadata": { "blockRange": { "start": "b001", "end": "b100" } },
"outline": [...],
"blocks": [...]
}
}
For parallel review by multiple agents:
# 1. Extract IR (shared by all agents)
node superdoc-redline.mjs extract -i contract.docx -o contract-ir.json
# 2. Sub-agents produce edit files
# Agent A -> edits-definitions.json
# Agent B -> edits-warranties.json
# Agent C -> edits-govlaw.json
# 3. Merge all edits (use --normalize if sub-agents use inconsistent field names)
node superdoc-redline.mjs merge \
edits-definitions.json \
edits-warranties.json \
edits-govlaw.json \
-o merged-edits.json \
-c error \
--normalize \
-v contract.docx
# 4. Apply merged edits (use --skip-invalid to continue past bad edits)
node superdoc-redline.mjs apply \
-i contract.docx \
-o redlined.docx \
-e merged-edits.json \
--skip-invalid
⚠️ Important: Don't assign sequential block ranges (b001-b300, b301-b600, etc.) without considering clause type distribution.
Legal documents have clause types scattered throughout - governing law clauses may appear in definitions, main provisions, and schedules. Sequential assignment will miss edits when agents don't have all blocks for their assigned clause types.
Recommended approach:
- During discovery, map clause types to actual block locations
- Assign agents by clause type grouping, not sequential ranges
- Include overlap buffer zones for ambiguous boundaries
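The discovery step in the recommended approach can be sketched as a keyword scan that maps each clause type to every block mentioning it, wherever the block sits. The pattern table and the name `mapClauseTypes` are assumptions for illustration; a real review would tune the patterns per document, and this is not the workflow's own implementation.

```javascript
// Hypothetical clause-type patterns; adjust per document and jurisdiction.
const CLAUSE_PATTERNS = {
  governingLaw: /governing law|jurisdiction/i,
  tax: /\bVAT\b|HMRC|stamp duty/i,
  employment: /employee|TULRCA/i,
};

// Map each clause type to the seqIds of all blocks that mention it.
// A block may appear under several types, which gives natural overlap
// at ambiguous boundaries.
function mapClauseTypes(blocks, patterns = CLAUSE_PATTERNS) {
  const assignments = Object.fromEntries(Object.keys(patterns).map((type) => [type, []]));
  for (const block of blocks) {
    for (const [clauseType, pattern] of Object.entries(patterns)) {
      if (pattern.test(block.text)) assignments[clauseType].push(block.seqId);
    }
  }
  return assignments;
}

const irBlocks = [
  { seqId: 'b010', text: 'This agreement is subject to the governing law of England.' },
  { seqId: 'b200', text: 'All sums are exclusive of VAT.' },
  { seqId: 'b450', text: 'Schedule 3: jurisdiction over disputes with any Transferring Employee.' },
];
console.log(mapClauseTypes(irBlocks));
```

Each agent then receives the block list for its clause types instead of a contiguous range such as b001-b300.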
See CONTRACT-REVIEW-AGENTIC-SKILL.md for detailed guidance on multi-agent orchestration.
For programmatic use:
import { extractDocumentIR, createEditorWithIR } from './src/irExtractor.mjs';
// Extract IR from file
const ir = await extractDocumentIR('contract.docx');
console.log(`${ir.blocks.length} blocks`);
// Get IR with editor for block operations
// `ir` is already declared above, so the second IR is bound to a new name here
const { editor, ir: editorIr, cleanup } = await createEditorWithIR('contract.docx');
// ... use editor and editorIr ...
cleanup();
import {
replaceBlockById,
deleteBlockById,
insertAfterBlock,
addCommentToBlock,
findTextPositionInBlock,
insertTextAfterMatch,
highlightTextInBlock,
addCommentToTextInBlock
} from './src/blockOperations.mjs';
// With word-level diff
await replaceBlockById(editor, 'b001', 'New text');
// Delete a block
await deleteBlockById(editor, 'b005');
// Insert after a block
await insertAfterBlock(editor, 'b010', 'New paragraph');
// Add comment to whole block
await addCommentToBlock(editor, 'b015', 'Needs review');
// Find text position within a block
const pos = findTextPositionInBlock(editor, 'b020', 'Material Adverse Change');
// => { found: true, from: 142, to: 166 }
// Insert text after a match
await insertTextAfterMatch(editor, 'b035', 'reasonable endeavours', ' (acting in good faith)');
// Highlight specific text (default yellow)
await highlightTextInBlock(editor, 'b042', 'unlimited liability');
// Highlight with custom color
await highlightTextInBlock(editor, 'b042', 'unlimited liability', '#FF6B6B');
// Add comment anchored to specific text
await addCommentToTextInBlock(editor, 'b050', 'governing law', 'Should this be Singapore?');
import { applyEdits, validateEdits } from './src/editApplicator.mjs';
// Validate first
const validation = await validateEdits('contract.docx', editConfig);
if (!validation.valid) {
console.error(validation.issues);
}
// Apply edits
const result = await applyEdits('contract.docx', 'redlined.docx', editConfig);
console.log(`Applied: ${result.applied}, Skipped: ${result.skipped.length}`);
import { readDocument, getDocumentStats } from './src/documentReader.mjs';
// Get stats
const stats = await getDocumentStats('contract.docx');
console.log(`Blocks: ${stats.blockCount}, Tokens: ${stats.estimatedTokens}`);
// Read with chunking
const result = await readDocument('contract.docx', { chunkIndex: 0 });
import { mergeEditFiles, normalizeEdit, splitBlocksForAgents } from './src/editMerge.mjs';
// Split work for sub-agents
const ranges = splitBlocksForAgents(ir, 3);
// Merge results (with field name normalization)
const result = await mergeEditFiles(['a.json', 'b.json'], {
conflictStrategy: 'error',
normalize: true // Fixes type→operation, replacement→newText, etc.
});
// Normalize a single edit manually
const normalized = normalizeEdit({
blockId: 'b001',
type: 'replace', // Will become 'operation'
replacement: 'text', // Will become 'newText'
search: 'some text', // Will become 'findText'
highlightColor: '#FF0' // Will become 'color'
});
Package version changed from 1.x to 0.2.0 to signal breaking API changes.
| v1.x Feature | Status | Migration |
|---|---|---|
| Text-based find/replace edits | REMOVED | Use ID-based `blockId` edits |
| `--inline` JSON argument | REMOVED | Use `--edits` file path |
| `--config` option | REMOVED | Use subcommand + `--edits` |
| Fuzzy text matching for edits | INTERNAL | Now used only for IR extraction |
| Clause targeting by text | REMOVED | Use `blockId` from extracted IR |
# OLD (v1.x) - Text-based
node superdoc-redline.mjs --config edits.json
node superdoc-redline.mjs --inline '{"input":"doc.docx","edits":[...]}'
# NEW (v0.2.0) - Subcommands
node superdoc-redline.mjs extract --input doc.docx
node superdoc-redline.mjs read --input doc.docx
node superdoc-redline.mjs validate --input doc.docx --edits edits.json
node superdoc-redline.mjs apply --input doc.docx --output out.docx --edits edits.json
node superdoc-redline.mjs merge edits1.json edits2.json --output merged.json
- Extract IR to get block IDs: `extract --input doc.docx`
- Rewrite edits to use `blockId` instead of `find`
- Use the `apply` subcommand instead of the root command
Issue: Output DOCX files are ~6x larger than expected (~2.5MB instead of ~400KB).
Cause: The JSZip library uses ZIP_STORED (no compression) by default.
Solution: Use the recompress command:
node superdoc-redline.mjs recompress --input bloated.docx --output clean.docx
Alternatively, see CONTRACT-REVIEW-SKILL.md "Step 5: Recompress Output File" for a Python script.
Fixed in v0.2.0: The --stats-only output now respects --max-tokens and provides multiple recommendations:
node superdoc-redline.mjs read --stats-only --input doc.docx --max-tokens 10000
Output includes:
- `recommendedChunks` - Based on your specified `--max-tokens` (or default 100k)
- `recommendedChunksByLimit` - Recommendations for common limits (10k, 25k, 40k, 100k)
- `maxTokensUsed` - The limit used for the primary calculation
Issue: When extracting IR from an amended document (with track changes), both deleted and inserted text appear concatenated.
Cause: Track changes preserves deleted text in the document structure; IR extraction captures all text content.
This is expected behavior. To get only the final text, open the DOCX in Word, accept all changes, save, then re-extract.
Issue: Using a UUID from extract output as a blockId in an edit file causes "Block not found" errors during apply or validate.
Cause: SuperDoc regenerates UUIDs on each document load. The UUID from one extract session will not exist in a later apply session.
Solution: Always use seqId values (e.g., b001, b025) in edit files. SeqIds are derived from document order and are stable across CLI invocations.
Issue: Comment edits (comment, commentRange, commentHighlight) were applied successfully (correct count reported) but the output DOCX had empty comment bodies and missing commentRangeStart/commentRangeEnd anchors — so no visible comment bubbles appeared in Word or LibreOffice.
Cause: The internal commentsStore was using a flat format ({ id, blockId, text, author }) that didn't match SuperDoc's expected export schema. SuperDoc's exportDocx() requires { commentId, creatorName, creatorEmail, createdTime, commentJSON } where commentJSON is an array of ProseMirror paragraph nodes.
Fix: The buildCommentEntry() helper in editApplicator.mjs now transforms comment data to the SuperDoc-compatible format. All 7 commentsStore.push() sites have been updated. See buildCommentEntry() for the transformation logic.
Issue: Editing Table of Contents blocks may fail with errors like:
[CommandService] Dispatch failed: Invalid content for node paragraph: <link(run(link(textStyle(trackDelete("X.")))))
Cause: TOC blocks have deeply nested link structures that ProseMirror cannot handle when track changes are applied.
Solution: Skip TOC blocks in edit files. The apply command now detects and warns about TOC-like blocks during validation. See CONTRACT-REVIEW-SKILL.md "TOC Block Limitations" for details.
npm test
Tests use Node.js built-in test runner (node:test):
| File | Tests | Description |
|---|---|---|
| `irExtractor.test.mjs` | IR extraction, block IDs | |
| `blockOperations.test.mjs` | Replace, delete, insert, comment, text-span operations | |
| `editApplicator.test.mjs` | Validation, application, sorting, text-span ops | |
| `documentReader.test.mjs` | Reading, formats, chunking | |
| `chunking.test.mjs` | Token estimation, chunking algorithm | |
| `editMerge.test.mjs` | Merging, conflicts, validation, composite conflict keys | |
| `markdownEditsParser.test.mjs` | Markdown ↔ JSON conversion, 4/6-column formats | |
| `multiAgent.test.mjs` | Multi-agent workflows | |
| `cli.test.mjs` | CLI command tests | |
| `integration.test.mjs` | End-to-end workflow tests | |
| Package | Purpose |
|---|---|
| `@harbour-enterprises/superdoc` | Word document manipulation |
| `commander` | CLI argument parsing |
| `jsdom` | DOM environment for headless mode |
| `diff-match-patch` | Word-level diff computation |
This tool wouldn't exist without SuperDoc by Harbour — a truly exceptional document editing library that makes programmatic Word document manipulation not just possible, but elegant. Their headless mode is a game-changer for AI-driven workflows.
We also tip our hats to the giants SuperDoc stands on:
- Marijn Haverbeke and the community behind ProseMirror — the foundation that makes rich text editing on the web possible
- Tiptap and the many amazing editors of the web — from which we all draw inspiration
- These wonderful projects that SuperDoc uses: Yjs, FontAwesome, JSZip, and Vite
- diff-match-patch by Google — enabling our word-level diff magic
Apache License 2.0. This library depends on SuperDoc, which is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).