fix(core): preserve Maps and buffers when cloning a document for edits#990

Merged
jedrazb merged 1 commit into
eigenpal:mainfrom
aldrinjenson:fix/clone-document-preserves-buffer-and-media
Jun 23, 2026
@aldrinjenson aldrinjenson commented Jun 22, 2026

Closes #992

Summary

cloneDocument runs on every agent edit (executeCommand → cloneDocument) and used JSON.parse(JSON.stringify(doc)). JSON can't represent several fields the document carries, so the clone silently corrupts them:

  • package.headers / package.footers / package.media are Maps → become {}
  • originalBuffer and each MediaFile.data are ArrayBuffers → become {}
  • package.properties.created / modified are Dates → become strings

The damage is load-bearing: a no-edit export works, but after the first edit export breaks and images vanish.
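
The corruption listed above can be reproduced without the library. The following is a minimal sketch, assuming a stand-in object (`sample` and its field values are hypothetical, not the real document type) that carries one field of each affected kind:

```typescript
// Hypothetical stand-in for the document; only the kinds of the fields matter.
const sample = {
  headers: new Map<string, string>([["rId1", "<w:hdr/>"]]),
  originalBuffer: new Uint8Array([0x50, 0x4b, 0x03, 0x04]).buffer,
  created: new Date("2026-06-22T00:00:00.000Z"),
};

// The old cloning strategy: serialize to JSON text, then parse it back.
const jsonClone = JSON.parse(JSON.stringify(sample));

// Map and ArrayBuffer have no enumerable own properties, so both serialize as {}.
const headersSurvived = jsonClone.headers instanceof Map; // false
const bufferSurvived = jsonClone.originalBuffer instanceof ArrayBuffer; // false
// Date serializes through toJSON(), so it comes back as an ISO string.
const createdType = typeof jsonClone.created; // "string"

console.log({ headersSurvived, bufferSurvived, createdType });
```

Nothing throws at clone time, which is why the problem only surfaces later, when export tries to use the fields.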

Repro

import { DocumentAgent } from '@eigenpal/docx-editor-core/headless';

const agent = await DocumentAgent.fromBuffer(bytes);   // any real .docx
await agent.toBuffer();                                 // ✅ works (no edit)

const edited = agent.insertText({ paragraphIndex: 0, offset: 0 }, 'Hello ');
await edited.toBuffer();                                 // ❌ throws

Depending on the document, the throw is either:

  • Can't read the data of 'the loaded zip file': repackDocx → JSZip.loadAsync(originalBuffer) runs on the {} left where the ArrayBuffer was, or
  • map.entries is not a function: repackDocx's collectParts iterates package.headers / footers, which are now {} instead of Maps.
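
The second failure mode can be illustrated in isolation. This is a sketch only: `collectPartNames` is a hypothetical simplification, not the actual `collectParts` from `repackDocx`, and the header path and content are made up.

```typescript
// Hypothetical simplification of a part collector that expects a Map.
function collectPartNames(map: Map<string, string>): string[] {
  return Array.from(map.entries()).map(([name]) => name);
}

const realHeaders = new Map<string, string>([["word/header1.xml", "<w:hdr/>"]]);

// A JSON round-trip yields a plain object, even though the static type still says Map.
const deadHeaders = JSON.parse(JSON.stringify(realHeaders)) as Map<string, string>;

let failure: unknown = null;
try {
  collectPartNames(deadHeaders);
} catch (err) {
  // Plain objects have no entries() method, so the call throws a TypeError.
  failure = err;
}

console.log(collectPartNames(realHeaders), failure instanceof TypeError);
```

The type assertion mirrors the situation in the bug: the compiler has no way to notice that the runtime value stopped being a Map.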

Any document with images also loses them on the first edit (the media Map is gone).

This blocks the headless generate → edit → export workflow entirely (e.g. server-side document generation that never opens the editor).

Fix

Clone with structuredClone, which handles Map, Date, and ArrayBuffer correctly. Two potentially large binary payloads are special-cased so they aren't deep-copied on every edit:

  • originalBuffer — the entire source .docx; read-only on export → shared.
  • package.media — its MediaFile entries are immutable → shallow-copied Map (no image bytes copied; per-clone additions stay isolated).
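
For readers who want the shape of the approach, here is a minimal sketch of the cloning strategy described above. It is not the code from helpers.ts: the `Doc`, `DocPackage`, and `MediaFile` interfaces are trimmed-down assumptions, and `cloneDocumentSketch` is a hypothetical name.

```typescript
// Hypothetical, trimmed-down shapes; the real document type has more fields.
interface MediaFile {
  readonly path: string;
  readonly data: ArrayBuffer;
}
interface DocPackage {
  headers: Map<string, string>;
  footers: Map<string, string>;
  media: Map<string, MediaFile>;
  properties: { created: Date; modified: Date };
}
interface Doc {
  package: DocPackage;
  originalBuffer: ArrayBuffer;
}

function cloneDocumentSketch(doc: Doc): Doc {
  // Set the media Map aside so structuredClone never walks the image bytes.
  const { media, ...restOfPackage } = doc.package;

  // Deep-clone everything else; Map and Date instances survive intact.
  const clonedPackage = structuredClone(restOfPackage);

  return {
    // Shared reference: the source .docx bytes are only read on export.
    originalBuffer: doc.originalBuffer,
    // New Map holding the same MediaFile objects, so no image bytes are copied.
    package: { ...clonedPackage, media: new Map(media) },
  };
}

// Usage with made-up data.
const image: MediaFile = { path: "word/media/image1.png", data: new ArrayBuffer(8) };
const source: Doc = {
  originalBuffer: new Uint8Array([0x50, 0x4b, 0x03, 0x04]).buffer,
  package: {
    headers: new Map([["rId1", "<w:hdr/>"]]),
    footers: new Map(),
    media: new Map([[image.path, image]]),
    properties: { created: new Date(0), modified: new Date(0) },
  },
};
const clone = cloneDocumentSketch(source);

// An addition on the clone does not leak back into the source Map.
clone.package.media.set("word/media/image2.png", {
  path: "word/media/image2.png",
  data: new ArrayBuffer(4),
});
console.log(source.package.media.size, clone.package.media.size);
```

Sharing the `MediaFile` objects is only safe under the assumption stated in the PR, namely that those entries are never mutated after creation.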

Testing

  • New cloneDocument.test.ts — asserts originalBuffer (bytes), media (Map + binary), header/footer Maps, structural deep-clone, and media-copy isolation all survive.
  • New editExportRoundtrip.test.ts: fromBuffer → insertText/insertTable → toBuffer now succeeds and yields a valid .docx (PK zip); this threw before.
  • Full core suite green locally: 1459 pass, 0 fail. typecheck + lint clean.
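
The "PK zip" check mentioned in the round-trip test can be as simple as inspecting the first four bytes of the exported buffer. The helper below is a hypothetical sketch of such a check, not the assertion used in editExportRoundtrip.test.ts:

```typescript
// Hypothetical helper: true when the bytes start with the ZIP local file
// header signature "PK\x03\x04", which every non-empty .docx archive begins with.
function looksLikeZip(bytes: ArrayBuffer | Uint8Array): boolean {
  const view = bytes instanceof Uint8Array ? bytes : new Uint8Array(bytes);
  return (
    view.length >= 4 &&
    view[0] === 0x50 && // 'P'
    view[1] === 0x4b && // 'K'
    view[2] === 0x03 &&
    view[3] === 0x04
  );
}

const zipLike = new Uint8Array([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]);
const notZip = new Uint8Array([0x7b, 0x7d]); // the bytes of "{}"

console.log(looksLikeZip(zipLike), looksLikeZip(notZip));
```

A signature check only shows that the output is a ZIP container; it does not validate the document parts inside it.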

Notes

  • Pure internal change to cloneDocument — no public API surface change.
  • A changeset is included (patch, @eigenpal/docx-editor-core).

cloneDocument runs on every agent edit and used JSON.parse(JSON.stringify()),
which silently drops values JSON can't represent. The document holds several:
package.headers/footers/media are Maps (-> {}), originalBuffer and each
MediaFile.data are ArrayBuffers (-> {}), and properties.created/modified are
Dates (-> strings).

So after the first edit, export broke: repackDocx's collectParts threw
'map.entries is not a function' on the dead headers/footers, or JSZip.loadAsync
threw "Can't read the data of 'the loaded zip file'" on the dead
originalBuffer, and every image was dropped. A no-edit export still worked,
which masked the bug.

Clone with structuredClone, which handles Maps, Dates, and ArrayBuffers.
originalBuffer (the whole source .docx, read-only on export) is shared and
package.media is shallow-copied so the two large binary payloads aren't copied
on every edit; media's MediaFile entries are immutable, so sharing them copies
no image bytes while still isolating per-clone additions.

Adds unit tests for cloneDocument and an edit->toBuffer round-trip regression test.

vercel Bot commented Jun 22, 2026

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docx-editor | Ready | Preview, Comment | Jun 22, 2026 7:45pm |

@eigenpal-release-pal

All contributors have signed the CLA ✍️ ✅

Posted by the CLA bot.

greptile-apps Bot commented Jun 22, 2026

Greptile Summary

This PR fixes a critical bug in cloneDocument where JSON.parse(JSON.stringify(doc)) silently destroyed non-JSON-serializable fields (Maps, ArrayBuffers, Dates) on every agent edit, making any post-edit export throw or drop images. The fix switches to structuredClone and special-cases the two large binary payloads (originalBuffer shared, media shallow-copied) to avoid copying megabytes of zip/image data on each edit.

  • helpers.ts: Single-function change — structuredClone with originalBuffer shared and media shallow-copied as a new Map; headers/footers/relationships/Date fields are now correctly deep-cloned.
  • Tests: Two new test files cover field-level preservation (cloneDocument.test.ts) and the full fromBuffer → edit → toBuffer round-trip (editExportRoundtrip.test.ts), directly targeting the previously broken paths.
  • Changeset: Correct package name and patch bump; summary wording is overly verbose for the CHANGELOG (see inline suggestion).

Confidence Score: 4/5

Safe to merge; the core change is a well-understood one-function fix backed by targeted regression tests.

The implementation is correct and the test coverage directly exercises the broken path. The only finding is a changeset summary that is too long and leads with internals rather than the user-visible outcome, which is a minor style issue that does not affect correctness.

.changeset/clone-document-preserves-buffer-and-media.md — summary should be trimmed to two lines per project convention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/core/src/agent/executor/helpers.ts | Replaces JSON.parse/stringify clone with structuredClone, sharing originalBuffer and shallow-copying the media Map for performance; correct fix for the described corruption bug |
| packages/core/src/agent/executor/tests/cloneDocument.test.ts | New unit tests covering Maps, ArrayBuffers, isolation of the shallow-copied media Map, and structural deep-clone; test cases are well-targeted at the exact fields that were broken |
| packages/core/src/agent/tests/editExportRoundtrip.test.ts | New regression tests confirming fromBuffer → edit → toBuffer succeeds and returns a valid ZIP; validates the previously-broken end-to-end path |
| .changeset/clone-document-preserves-buffer-and-media.md | Changeset summary is correct type and package name, but exceeds the project's two-line length limit and leads with technical internals rather than user-visible outcome |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Agent
    participant cloneDocument
    participant structuredClone
    participant Export as toBuffer / repackDocx

    Agent->>cloneDocument: doc (with originalBuffer, media Map, headers Map)
    Note over cloneDocument: Extract originalBuffer + media<br/>before cloning
    cloneDocument->>structuredClone: "doc minus originalBuffer & media"
    structuredClone-->>cloneDocument: deep clone (Maps, Dates, ArrayBuffers preserved)
    Note over cloneDocument: Re-attach originalBuffer (shared ref)<br/>Re-attach media (new Map, shared MediaFile entries)
    cloneDocument-->>Agent: cloned Document
    Agent->>Export: cloned Document (edit applied)
    Export->>Export: JSZip.loadAsync(originalBuffer) ✅
    Export->>Export: headers/footers.entries() ✅
    Export->>Export: media.get(path).data ✅

Reviews (1): Last reviewed commit: "fix(core): preserve Maps and buffers whe..."

'@eigenpal/docx-editor-core': patch
---

Fix headless agent edits corrupting the document. `cloneDocument` (run on every agent edit) used `JSON.parse(JSON.stringify())`, which silently dropped values JSON can't represent: the `headers`/`footers`/`media` `Map`s became `{}` and `originalBuffer` became `{}`. As a result, the first edit broke export — `repackDocx` threw `Can't read the data of 'the loaded zip file'` (dead `originalBuffer`) or `map.entries is not a function` (dead headers/footers) — and dropped every image. Clone with `structuredClone` instead, sharing the read-only `originalBuffer` and shallow-copying the immutable `media` map so large binary payloads aren't copied on every edit.

P2 The changeset summary is far longer than the project's two-line limit (CLAUDE.md: "Keep it concise (one or two lines), lead with the user-visible change (what changed, not how)"). The current text runs five-plus sentences and front-loads the internal mechanics (JSON.parse/stringify, stack traces) rather than the consumer-visible outcome. It also doesn't include a Fixes #N reference despite the PR describing a clear regression.

Suggested change
Fix headless agent edits corrupting the document. `cloneDocument` (run on every agent edit) used `JSON.parse(JSON.stringify())`, which silently dropped values JSON can't represent: the `headers`/`footers`/`media` `Map`s became `{}` and `originalBuffer` became `{}`. As a result, the first edit broke export — `repackDocx` threw `Can't read the data of 'the loaded zip file'` (dead `originalBuffer`) or `map.entries is not a function` (dead headers/footers) — and dropped every image. Clone with `structuredClone` instead, sharing the read-only `originalBuffer` and shallow-copying the immutable `media` map so large binary payloads aren't copied on every edit.
Fix headless agent export: the first edit no longer corrupts the document or drops images. `cloneDocument` now uses `structuredClone` so Maps and ArrayBuffers survive each edit.

Context Used: CLAUDE.md (source)

@jedrazb jedrazb merged commit 00c015b into eigenpal:main Jun 23, 2026
8 checks passed
Development

Successfully merging this pull request may close these issues.

Headless agent edits corrupt the document: cloneDocument JSON-clones away Maps/ArrayBuffers, breaking export + dropping images

2 participants