superdoc-redlines

A Node.js CLI tool for applying tracked changes and comments to DOCX files using SuperDoc in headless mode.

Designed for use by AI agents (Claude, GPT, etc.) in IDE environments like Cursor, VS Code, or Claude Code.

Sped-up 30s demo of multi-agentic contract review of a 100+ page contract using Claude Code, this library and the CONTRACT-REVIEW-AGENTIC-SKILL.md workflow — click to see the full-length video.

For AI Agents: See SKILL.md for the concise task-oriented guide with decision flows, constraints, and expected outputs.

Example Skills

Features

Structured IR - Extract stable block IDs from any DOCX document
ID-Based Edits - Deterministic edits that don't depend on fragile text matching
Auto-Chunking - Handles documents of any size with token-aware chunking
Multi-Agent Support - Merge edits from parallel sub-agents with conflict resolution
Track Changes - Word-level diff produces minimal, reviewable changes
Comments - Attach comments to any block for review
Text-Span Annotations - Anchor comments and highlights to specific text within blocks (sub-block granularity)

Installation

git clone https://github.com/yuch85/superdoc-redlines
cd superdoc-redlines
npm install

Quick Start

# 1. Extract document structure (get block IDs)
node superdoc-redline.mjs extract --input contract.docx

# 2. Read document (for LLM analysis)
node superdoc-redline.mjs read --input contract.docx

# 3. Create edits.json referencing block IDs
# 4. Validate edits (optional but recommended)
node superdoc-redline.mjs validate --input contract.docx --edits edits.json

# 5. Apply edits with track changes
node superdoc-redline.mjs apply --input contract.docx --output redlined.docx --edits edits.json

CLI Commands

`extract`

Extract structured intermediate representation (IR) from a DOCX file.

node superdoc-redline.mjs extract --input doc.docx --output ir.json
node superdoc-redline.mjs extract -i doc.docx -o ir.json --max-text 100
node superdoc-redline.mjs extract -i doc.docx --no-defined-terms

Options:

Option	Description
`-i, --input <path>`	Input DOCX file (required)
`-o, --output <path>`	Output JSON file (default: `<input>-ir.json`)
`-f, --format <type>`	Output format: `full\|outline\|blocks` (default: `full`)
`--no-defined-terms`	Exclude defined terms extraction
`--max-text <length>`	Truncate block text to length

`read`

Read document for LLM consumption with automatic chunking.

node superdoc-redline.mjs read --input doc.docx
node superdoc-redline.mjs read -i doc.docx --chunk 1 --max-tokens 50000
node superdoc-redline.mjs read -i doc.docx --stats-only
node superdoc-redline.mjs read -i doc.docx -f outline

Options:

Option	Description
`-i, --input <path>`	Input DOCX file (required)
`-c, --chunk <index>`	Specific chunk index (0-indexed)
`--max-tokens <count>`	Max tokens per chunk (default: 100000)
`-f, --format <type>`	Output format: `full\|outline\|summary` (default: `full`)
`--stats-only`	Only show document statistics
`--no-metadata`	Exclude block IDs and positions

Output (JSON to stdout):

{
  "success": true,
  "totalChunks": 1,
  "currentChunk": 0,
  "hasMore": false,
  "nextChunkCommand": null,
  "document": {
    "metadata": { "filename": "doc.docx", "chunkIndex": 0, "totalChunks": 1 },
    "outline": [...],
    "blocks": [...],
    "idMapping": { "uuid-here": "b001" }
  }
}

`validate`

Validate edit instructions against a document.

node superdoc-redline.mjs validate --input doc.docx --edits edits.json

Options:

Option	Description
`-i, --input <path>`	Input DOCX file (required)
`-e, --edits <path>`	Edits JSON file (required)

Exit code: 0 if valid, 1 if issues found.

`apply`

Apply ID-based edits to a document with track changes.

node superdoc-redline.mjs apply --input doc.docx --output redlined.docx --edits edits.json
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --author-name "Reviewer"
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --no-track-changes
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --verbose  # Debug position mapping
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --strict   # Fail on truncation warnings

Options:

Option	Description
`-i, --input <path>`	Input DOCX file (required)
`-o, --output <path>`	Output DOCX file (required)
`-e, --edits <path>`	Edits JSON file (required)
`--author-name <name>`	Author name for track changes (default: `"AI Assistant"`)
`--author-email <email>`	Author email (default: `"ai@example.com"`)
`--no-track-changes`	Disable track changes mode
`--no-validate`	Skip validation before applying
`--no-sort`	Skip automatic edit sorting
`-v, --verbose`	Enable verbose logging for debugging position mapping
`--strict`	Treat truncation/corruption warnings as errors
`--skip-invalid`	Skip invalid edits instead of failing
`-q, --quiet-warnings`	Suppress content reduction warnings
`--allow-reduction`	Allow intentional content reduction without warnings (for jurisdiction conversions)

Content reduction: A replace edit where newText is significantly shorter than the original block. This can be intentional (simplification) or a sign of truncation/corruption. Do not allow reduction for normal proofreading or minor edits where text length should stay similar.

Validation:

The apply command automatically validates edits before applying, including:

Block ID existence checks
Required field validation (newText for replace, etc.)
newText truncation detection - Warns if newText appears truncated or corrupted
Content corruption detection - Detects patterns that suggest LLM output issues

Use --strict to fail the apply if any warnings are detected.

`merge`

Merge edit files from multiple sub-agents.

node superdoc-redline.mjs merge edits-a.json edits-b.json -o merged.json
node superdoc-redline.mjs merge *.json -o merged.json --conflict first
node superdoc-redline.mjs merge edits-*.json -o merged.json -v doc.docx

Options:

Option	Description
`<files...>`	Edit files to merge (required, positional)
`-o, --output <path>`	Output merged edits file (required)
`-c, --conflict <strategy>`	Conflict strategy: `error\|first\|last\|combine` (default: `error`)
`-n, --normalize`	Normalize field names from common variants (type→operation, etc.)
`-v, --validate <docx>`	Validate merged edits against document

Conflict Strategies:

error - Fail if any conflicts (safe default)
first - Keep first edit (by file order)
last - Keep last edit (by file order)
combine - Merge comments; use first for other operations

`parse-edits`

Convert markdown edits to JSON format. This enables a more resilient edit format that is easier for LLMs to generate without syntax errors.

node superdoc-redline.mjs parse-edits -i edits.md -o edits.json
node superdoc-redline.mjs parse-edits -i edits.md -o edits.json --validate doc.docx

Options:

Option	Description
`-i, --input <file>`	Input markdown file (.md) (required)
`-o, --output <file>`	Output JSON file (.json) (required)
`--validate <docx>`	Validate block IDs against document

`to-markdown`

Convert JSON edits to markdown format for human review or editing.

node superdoc-redline.mjs to-markdown -i edits.json -o edits.md

Options:

Option	Description
`-i, --input <file>`	Input JSON file (.json) (required)
`-o, --output <file>`	Output markdown file (.md) (required)

`find-block`

Find blocks by text content. Useful for locating block IDs when creating edits.

node superdoc-redline.mjs find-block --input contract.docx --text "VAT"
node superdoc-redline.mjs find-block --input contract-ir.json --regex "VAT|HMRC"
node superdoc-redline.mjs find-block -i contract.docx -t "Business Day" -c 100
node superdoc-redline.mjs find-block -i contract.docx -t "VAT" --limit 100
node superdoc-redline.mjs find-block -i contract.docx -t "VAT" --limit all

Options:

Option	Description
`-i, --input <path>`	Input DOCX file or IR JSON file (required)
`-t, --text <text>`	Text to search for (case-insensitive)
`-r, --regex <pattern>`	Regex pattern to search for
`-c, --context <chars>`	Context characters to show around match (default: 50)
`-l, --limit <n>`	Maximum results (default: 20, use "all" for unlimited)

Note: The default limit of 20 results may miss edits for high-frequency terms. For comprehensive discovery, use --limit all.

Example output:

Found 3 block(s):

[b051] (paragraph)
  ID: c9742b66-c68b-4a5b-92aa-b742b141425e
  Preview: ...Capital allowances923.Stamp duty and SDLT924.VAT925.Inheritance tax936.[Distraint by HMRC]93Schedu...

[b129] (heading)
  ID: 603fbb12-bb15-428c-985a-336afff6b4a5
  Preview: 4.VAT92

`recompress`

Recompress a DOCX file to reduce file size. SuperDoc writes uncompressed DOCX files (~6x larger than normal).

node superdoc-redline.mjs recompress --input bloated.docx
node superdoc-redline.mjs recompress -i bloated.docx -o clean.docx

Options:

Option	Description
`-i, --input <path>`	Input DOCX file (required)
`-o, --output <path>`	Output DOCX file (default: overwrites input)

Example output:

Recompressing: bloated.docx
  Original size: 2560.0 KB
  Compressed size: 384.0 KB
  Reduction: 85%
  Output: clean.docx

Note: Requires archiver and unzipper packages. Install with: npm install archiver unzipper

Edit Format (v0.3.0)

AI Agents: See SKILL.md for JSON Schema, common mistakes, and critical constraints.

Block-Level Operations (v0.2.0+)

{
  "version": "0.3.0",
  "author": {
    "name": "AI Counsel",
    "email": "ai@firm.com"
  },
  "edits": [
    {
      "blockId": "b001",
      "operation": "replace",
      "newText": "Modified clause text",
      "comment": "Optional comment",
      "diff": true
    },
    {
      "blockId": "b015",
      "operation": "delete",
      "comment": "Removing redundant clause"
    },
    {
      "blockId": "b020",
      "operation": "comment",
      "comment": "Needs legal review"
    },
    {
      "afterBlockId": "b025",
      "operation": "insert",
      "text": "New paragraph content",
      "type": "paragraph"
    }
  ]
}

Text-Span Operations (v0.3.0)

These operations anchor to specific text within a block using findText for sub-block granularity:

{
  "version": "0.3.0",
  "author": { "name": "AI Counsel", "email": "ai@firm.com" },
  "edits": [
    {
      "blockId": "b020",
      "operation": "comment",
      "findText": "Material Adverse Change",
      "comment": "Definition is too broad"
    },
    {
      "blockId": "b035",
      "operation": "insertAfterText",
      "findText": "reasonable endeavours",
      "insertText": " (acting in good faith)"
    },
    {
      "blockId": "b042",
      "operation": "highlight",
      "findText": "unlimited liability",
      "color": "#FF6B6B",
      "comment": "Flag for partner review"
    },
    {
      "blockId": "b050",
      "operation": "commentRange",
      "findText": "governing law shall be England",
      "comment": "Should this be Singapore?"
    },
    {
      "blockId": "b060",
      "operation": "commentHighlight",
      "findText": "entire agreement",
      "comment": "Check against side letter",
      "color": "#FFEB3B"
    }
  ]
}

Edit Operations

Operation	Required Fields	Optional Fields	Description
`replace`	`blockId`, `newText`	`comment`, `diff`	Replace block content
`delete`	`blockId`	`comment`	Delete block entirely
`comment`	`blockId`, `comment`	`findText`	Add comment to block (or to specific text if `findText` provided)
`insert`	`afterBlockId`, `text`	`type`, `level`, `comment`	Insert new block
`insertAfterText`	`blockId`, `findText`, `insertText`	`comment`	Insert text immediately after matched text within a block
`highlight`	`blockId`, `findText`	`color`, `comment`	Highlight specific text within a block
`commentRange`	`blockId`, `findText`, `comment`	-	Add comment anchored to specific text span
`commentHighlight`	`blockId`, `findText`, `comment`	`color`	Highlight text and attach a comment (atomic)

Field Reference

Field	Type	Description
`blockId`	string	Target block (use seqId like `"b001"`; UUIDs from extract are session-volatile and will fail in apply/validate)
`afterBlockId`	string	Insert position (use seqId like `"b001"`; UUIDs are not portable across CLI commands)
`newText`	string	Replacement text for `replace` operation
`text`	string	New block text for `insert` operation
`comment`	string	Comment text to attach
`diff`	boolean	Use word-level diff for minimal changes (default: `true`)
`type`	string	Block type for insert: `paragraph\|heading\|listItem`
`level`	number	Heading level for insert (1-6)
`findText`	string	Exact text to match within a block (for text-span operations)
`insertText`	string	Text to insert after matched text (`insertAfterText` only)
`color`	string	Highlight color as hex string (default: `"#FFEB3B"`)

Text-Span Behavior

findText matching uses exact string matching within the block's text content
comment with findText gracefully falls back to a full-block comment if the text is not found
commentRange also falls back to full-block comment on match failure
Sort order: When multiple text-span operations target the same block, they are applied rightmost-first to prevent position corruption
Conflict detection: Text-span operations use composite keys (blockId::operation::findText) so multiple annotations on different text spans within the same block do not conflict

Markdown Edit Format (Recommended for LLMs)

For large edit sets, the markdown format is more reliable than JSON because:

No syntax errors from missing commas or quotes
Partial output is still parseable (truncation recovery)
Lower cognitive load during generation
Human-readable for review

Markdown Format Specification

The markdown format supports two table layouts: a 4-column format for basic operations and a 6-column extended format when text-span operations are used.

4-Column Format (block-level operations only):

# Edits: [Document Name]

## Metadata
- **Version**: 0.3.0
- **Author Name**: AI Legal Counsel
- **Author Email**: ai@counsel.sg

## Edits Table

| Block | Op | Diff | Comment |
|-------|-----|------|---------|
| b257 | delete | - | DELETE TULRCA definition |
| b165 | replace | true | Change Business Day to Singapore |
| b500 | comment | - | Review needed |
| b449 | insert | - | Insert new clause |

## Replacement Text

### b165 newText
Business Day: a day other than a Saturday, Sunday or public holiday in Singapore when banks in Singapore are open for business.

### b449 insertText
The Buyer shall offer employment to each Transferring Employee.

6-Column Format (when using text-span operations):

## Edits Table

| Block | Op | FindText | Color | Diff | Comment |
|-------|-----|----------|-------|------|---------|
| b165 | replace | - | - | true | Change Business Day |
| b020 | comment | Material Adverse Change | - | - | Definition too broad |
| b035 | insertAfterText | reasonable endeavours | - | - | (acting in good faith) |
| b042 | highlight | unlimited liability | #FF6B6B | - | Flag for review |
| b050 | commentRange | governing law | - | - | Should be Singapore? |
| b060 | commentHighlight | entire agreement | #FFEB3B | - | Check side letter |

The format is auto-detected: if any edit uses findText, color, or a text-span operation, the 6-column format is used. For insertAfterText, the Comment column contains the text to insert.

Table Columns

4-Column Format:

Column	Required	Values	Description
Block	Yes	`b###`	Block ID from document IR
Op	Yes	`delete`, `replace`, `comment`, `insert`	Operation type
Diff	For replace	`true`, `false`, `-`	Word-level diff mode
Comment	No	Free text	Rationale for edit

6-Column Format (extends 4-column):

Column	Required	Values	Description
Block	Yes	`b###`	Block ID from document IR
Op	Yes	All 8 operation types	Operation type
FindText	For text-span ops	Exact text string, `-`	Text to match within block
Color	For highlight ops	Hex color string, `-`	Highlight color
Diff	For replace	`true`, `false`, `-`	Word-level diff mode
Comment	No	Free text	Rationale (or insertText for `insertAfterText`)

Text Sections

### b### newText - Replacement text for replace operations
### b### insertText - New content for insert operations

Usage

# Convert markdown to JSON
node superdoc-redline.mjs parse-edits -i edits.md -o edits.json

# Apply directly from markdown (auto-detects format)
node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.md

# Convert existing JSON to markdown for review
node superdoc-redline.mjs to-markdown -i edits.json -o edits.md

Track Changes

Track changes is enabled by default. All edits appear as native Word revisions when you open the output file in Microsoft Word:

Insertions - Shown as underlined additions
Deletions - Shown as strikethrough removals
Changes attributed to the author you specify (default: "AI Assistant")

Default Behavior

# Track changes ON by default
node superdoc-redline.mjs apply -i doc.docx -o redlined.docx -e edits.json

Custom Author Attribution

node superdoc-redline.mjs apply -i doc.docx -o redlined.docx -e edits.json \
  --author-name "Legal Review Bot" \
  --author-email "review@firm.com"

Disable Track Changes

For direct edits without revision marks:

node superdoc-redline.mjs apply -i doc.docx -o out.docx -e edits.json --no-track-changes

Word-Level Diff

By default, replace operations use word-level diff to produce minimal tracked changes. Only the words that actually changed are marked as insertions/deletions, not the entire block.

To replace the entire block content (useful for complete rewrites):

{
  "blockId": "b025",
  "operation": "replace",
  "newText": "Completely new text here",
  "diff": false
}

IR Format

The intermediate representation (IR) extracted from a document:

{
  "metadata": {
    "version": "0.2.0",
    "filename": "contract.docx",
    "format": "full",
    "extractedAt": "2026-02-04T10:00:00.000Z"
  },
  "outline": [
    {
      "title": "1. Definitions",
      "level": 1,
      "blockId": "uuid-here",
      "seqId": "b001",
      "children": []
    }
  ],
  "blocks": [
    {
      "id": "uuid-here",
      "seqId": "b001",
      "type": "heading",
      "level": 1,
      "text": "1. Definitions",
      "startPos": 0,
      "endPos": 15
    }
  ],
  "idMapping": {
    "uuid-here": "b001"
  },
  "definedTerms": {
    "Agreement": { "blockId": "b001", "text": "..." }
  }
}

Dual ID System

Each block has two IDs:

ID Type	Format	Purpose
UUID	`550e8400-e29b-41d4-a716-446655440000`	SuperDoc internal ID, regenerated on each document load — not portable across CLI commands
seqId	`b001`, `b002`, `b003`	Stable sequential ID for LLM reference — use this in edit files

Always use seqId in edit files. UUIDs are session-volatile — they are regenerated on each document load and will fail when used across separate CLI invocations (e.g., extract → apply). UUIDs are accepted for backward compatibility within a single programmatic session, but this is deprecated and will emit a warning.

Chunking

Large documents are automatically split into chunks for LLM context limits.

How It Works

Token estimation: ~4 characters per token
Default chunk size: 100,000 tokens
Chunks break at heading boundaries when possible
Every chunk includes the full document outline for context

Reading Large Documents

# Check document size
node superdoc-redline.mjs read --input large.docx --stats-only

# Read first chunk (includes navigation info)
node superdoc-redline.mjs read --input large.docx --chunk 0

# Output includes nextChunkCommand for sequential reading

Chunk Output Structure

{
  "success": true,
  "totalChunks": 3,
  "currentChunk": 0,
  "hasMore": true,
  "nextChunkCommand": "node superdoc-redline.mjs read --input \"large.docx\" --chunk 1",
  "document": {
    "metadata": { "blockRange": { "start": "b001", "end": "b100" } },
    "outline": [...],
    "blocks": [...]
  }
}

Multi-Agent Workflow

For parallel review by multiple agents:

# 1. Extract IR (shared by all agents)
node superdoc-redline.mjs extract -i contract.docx -o contract-ir.json

# 2. Sub-agents produce edit files
# Agent A -> edits-definitions.json
# Agent B -> edits-warranties.json
# Agent C -> edits-govlaw.json

# 3. Merge all edits (use --normalize if sub-agents use inconsistent field names)
node superdoc-redline.mjs merge \
  edits-definitions.json \
  edits-warranties.json \
  edits-govlaw.json \
  -o merged-edits.json \
  -c error \
  --normalize \
  -v contract.docx

# 4. Apply merged edits (use --skip-invalid to continue past bad edits)
node superdoc-redline.mjs apply \
  -i contract.docx \
  -o redlined.docx \
  -e merged-edits.json \
  --skip-invalid

Block Range Assignment Best Practices

⚠️ Important: Don't assign sequential block ranges (b001-b300, b301-b600, etc.) without considering clause type distribution.

Legal documents have clause types scattered throughout - governing law clauses may appear in definitions, main provisions, and schedules. Sequential assignment will miss edits when agents don't have all blocks for their assigned clause types.

Recommended approach:

During discovery, map clause types to actual block locations
Assign agents by clause type grouping, not sequential ranges
Include overlap buffer zones for ambiguous boundaries

See CONTRACT-REVIEW-AGENTIC-SKILL.md for detailed guidance on multi-agent orchestration.

Module API

For programmatic use:

IR Extraction

import { extractDocumentIR, createEditorWithIR } from './src/irExtractor.mjs';

// Extract IR from file
const ir = await extractDocumentIR('contract.docx');
console.log(`${ir.blocks.length} blocks`);

// Get IR with editor for block operations
const { editor, ir, cleanup } = await createEditorWithIR('contract.docx');
// ... use editor and ir ...
cleanup();

Block Operations

import {
  replaceBlockById,
  deleteBlockById,
  insertAfterBlock,
  addCommentToBlock,
  findTextPositionInBlock,
  insertTextAfterMatch,
  highlightTextInBlock,
  addCommentToTextInBlock
} from './src/blockOperations.mjs';

// With word-level diff
await replaceBlockById(editor, 'b001', 'New text');

// Delete a block
await deleteBlockById(editor, 'b005');

// Insert after a block
await insertAfterBlock(editor, 'b010', 'New paragraph');

// Add comment to whole block
await addCommentToBlock(editor, 'b015', 'Needs review');

Text-Span Operations (v0.3.0)

// Find text position within a block
const pos = findTextPositionInBlock(editor, 'b020', 'Material Adverse Change');
// => { found: true, from: 142, to: 166 }

// Insert text after a match
await insertTextAfterMatch(editor, 'b035', 'reasonable endeavours', ' (acting in good faith)');

// Highlight specific text (default yellow)
await highlightTextInBlock(editor, 'b042', 'unlimited liability');

// Highlight with custom color
await highlightTextInBlock(editor, 'b042', 'unlimited liability', '#FF6B6B');

// Add comment anchored to specific text
await addCommentToTextInBlock(editor, 'b050', 'governing law', 'Should this be Singapore?');

Edit Application

import { applyEdits, validateEdits } from './src/editApplicator.mjs';

// Validate first
const validation = await validateEdits('contract.docx', editConfig);
if (!validation.valid) {
  console.error(validation.issues);
}

// Apply edits
const result = await applyEdits('contract.docx', 'redlined.docx', editConfig);
console.log(`Applied: ${result.applied}, Skipped: ${result.skipped.length}`);

Document Reading

import { readDocument, getDocumentStats } from './src/documentReader.mjs';

// Get stats
const stats = await getDocumentStats('contract.docx');
console.log(`Blocks: ${stats.blockCount}, Tokens: ${stats.estimatedTokens}`);

// Read with chunking
const result = await readDocument('contract.docx', { chunkIndex: 0 });

Edit Merging

import { mergeEditFiles, normalizeEdit, splitBlocksForAgents } from './src/editMerge.mjs';

// Split work for sub-agents
const ranges = splitBlocksForAgents(ir, 3);

// Merge results (with field name normalization)
const result = await mergeEditFiles(['a.json', 'b.json'], {
  conflictStrategy: 'error',
  normalize: true  // Fixes type→operation, replacement→newText, etc.
});

// Normalize a single edit manually
const normalized = normalizeEdit({
  blockId: 'b001',
  type: 'replace',         // Will become 'operation'
  replacement: 'text',     // Will become 'newText'
  search: 'some text',     // Will become 'findText'
  highlightColor: '#FF0'   // Will become 'color'
});

Breaking Changes from v1.x

Version Bump

Package version changed from 1.x to 0.2.0 to signal breaking API changes.

Removed Features

v1.x Feature	Status	Migration
Text-based `find`/`replace` edits	REMOVED	Use ID-based `blockId` edits
`--inline` JSON argument	REMOVED	Use `--edits` file path
`--config` option	REMOVED	Use subcommand + `--edits`
Fuzzy text matching for edits	INTERNAL	Now used only for IR extraction
Clause targeting by text	REMOVED	Use `blockId` from extracted IR

New CLI Structure

# OLD (v1.x) - Text-based
node superdoc-redline.mjs --config edits.json
node superdoc-redline.mjs --inline '{"input":"doc.docx","edits":[...]}'

# NEW (v0.2.0) - Subcommands
node superdoc-redline.mjs extract --input doc.docx
node superdoc-redline.mjs read --input doc.docx
node superdoc-redline.mjs validate --input doc.docx --edits edits.json
node superdoc-redline.mjs apply --input doc.docx --output out.docx --edits edits.json
node superdoc-redline.mjs merge edits1.json edits2.json --output merged.json

Migration Path

Extract IR to get block IDs: extract --input doc.docx
Rewrite edits to use blockId instead of find
Use apply subcommand instead of root command

Known Issues and Workarounds

Output File Size (Uncompressed DOCX)

Issue: Output DOCX files are ~6x larger than expected (~2.5MB instead of ~400KB).

Cause: The JSZip library uses ZIP_STORED (no compression) by default.

Solution: Use the recompress command:

node superdoc-redline.mjs recompress --input bloated.docx --output clean.docx

Alternatively, see CONTRACT-REVIEW-SKILL.md "Step 5: Recompress Output File" for a Python script.

recommendedChunks Calculation

Fixed in v0.2.0: The --stats-only output now respects --max-tokens and provides multiple recommendations:

node superdoc-redline.mjs read --stats-only --input doc.docx --max-tokens 10000

Output includes:

recommendedChunks - Based on your specified --max-tokens (or default 100k)
recommendedChunksByLimit - Recommendations for common limits (10k, 25k, 40k, 100k)
maxTokensUsed - The limit used for the primary calculation

Track Changes IR Extraction

Issue: When extracting IR from an amended document (with track changes), both deleted and inserted text appear concatenated.

Cause: Track changes preserves deleted text in the document structure; IR extraction captures all text content.

This is expected behavior. To get only the final text, open the DOCX in Word, accept all changes, save, then re-extract.

UUID Block IDs Not Portable Across CLI Commands

Issue: Using a UUID from extract output as a blockId in an edit file causes "Block not found" errors during apply or validate.

Cause: SuperDoc regenerates UUIDs on each document load. The UUID from one extract session will not exist in a later apply session.

Solution: Always use seqId values (e.g., b001, b025) in edit files. SeqIds are derived from document order and are stable across CLI invocations.

Comment Bubbles Not Appearing in Output DOCX (Fixed)

Issue: Comment edits (comment, commentRange, commentHighlight) were applied successfully (correct count reported) but the output DOCX had empty comment bodies and missing commentRangeStart/commentRangeEnd anchors — so no visible comment bubbles appeared in Word or LibreOffice.

Cause: The internal commentsStore was using a flat format ({ id, blockId, text, author }) that didn't match SuperDoc's expected export schema. SuperDoc's exportDocx() requires { commentId, creatorName, creatorEmail, createdTime, commentJSON } where commentJSON is an array of ProseMirror paragraph nodes.

Fix: The buildCommentEntry() helper in editApplicator.mjs now transforms comment data to the SuperDoc-compatible format. All 7 commentsStore.push() sites have been updated. See buildCommentEntry() for the transformation logic.

TOC Block Editing Failures

Issue: Editing Table of Contents blocks may fail with errors like:

[CommandService] Dispatch failed: Invalid content for node paragraph: <link(run(link(textStyle(trackDelete("X.")))))

Cause: TOC blocks have deeply nested link structures that ProseMirror cannot handle when track changes are applied.

Solution: Skip TOC blocks in edit files. The apply command now detects and warns about TOC-like blocks during validation. See CONTRACT-REVIEW-SKILL.md "TOC Block Limitations" for details.

Testing

npm test

Tests use Node.js built-in test runner (node:test):

File	Tests	Description
`irExtractor.test.mjs`	IR extraction, block IDs
`blockOperations.test.mjs`	Replace, delete, insert, comment, text-span operations
`editApplicator.test.mjs`	Validation, application, sorting, text-span ops
`documentReader.test.mjs`	Reading, formats, chunking
`chunking.test.mjs`	Token estimation, chunking algorithm
`editMerge.test.mjs`	Merging, conflicts, validation, composite conflict keys
`markdownEditsParser.test.mjs`	Markdown ↔ JSON conversion, 4/6-column formats
`multiAgent.test.mjs`	Multi-agent workflows
`cli.test.mjs`	CLI command tests
`integration.test.mjs`	End-to-end workflow tests

Dependencies

Package	Purpose
`@harbour-enterprises/superdoc`	Word document manipulation
`commander`	CLI argument parsing
`jsdom`	DOM environment for headless mode
`diff-match-patch`	Word-level diff computation

Shout-outs

This tool wouldn't exist without SuperDoc by Harbour — a truly exceptional document editing library that makes programmatic Word document manipulation not just possible, but elegant. Their headless mode is a game-changer for AI-driven workflows.

We also tip our hats to the giants SuperDoc stands on:

Marijn Haverbeke and the community behind ProseMirror — the foundation that makes rich text editing on the web possible
Tiptap and the many amazing editors of the web — from which we all draw inspiration
These wonderful projects that SuperDoc uses: Yjs, FontAwesome, JSZip, and Vite
diff-match-patch by Google — enabling our word-level diff magic

License

Apache License 2.0. This library depends on Superdoc, which is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

Name		Name	Last commit message	Last commit date
Latest commit History 50 Commits
.claude/skills/claude-review		.claude/skills/claude-review
skills		skills
src		src
tests_and_others		tests_and_others
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
SKILL.md		SKILL.md
metadata.yml		metadata.yml
package-lock.json		package-lock.json
package.json		package.json
superdoc-redline.mjs		superdoc-redline.mjs

License

yuch85/superdoc-redlines

Folders and files

Latest commit

History

Repository files navigation

superdoc-redlines

Example Skills

Features

Installation

Quick Start

CLI Commands

extract

read

validate

apply

merge

parse-edits

to-markdown

find-block

recompress

Edit Format (v0.3.0)

Block-Level Operations (v0.2.0+)

Text-Span Operations (v0.3.0)

Edit Operations

Field Reference

Text-Span Behavior

Markdown Edit Format (Recommended for LLMs)

Markdown Format Specification

Table Columns

Text Sections

Usage

Track Changes

Default Behavior

Custom Author Attribution

Disable Track Changes

Word-Level Diff

IR Format

Dual ID System

Chunking

How It Works

Reading Large Documents

Chunk Output Structure

Multi-Agent Workflow

Block Range Assignment Best Practices

Module API

IR Extraction

Block Operations

Text-Span Operations (v0.3.0)

Edit Application

Document Reading

Edit Merging

Breaking Changes from v1.x

Version Bump

Removed Features

New CLI Structure

Migration Path

Known Issues and Workarounds

Output File Size (Uncompressed DOCX)

recommendedChunks Calculation

Track Changes IR Extraction

UUID Block IDs Not Portable Across CLI Commands

Comment Bubbles Not Appearing in Output DOCX (Fixed)

TOC Block Editing Failures

Testing

Dependencies

Shout-outs

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

`extract`

`read`

`validate`

`apply`

`merge`

`parse-edits`

`to-markdown`

`find-block`

`recompress`

Packages