fix(core): preserve Maps and buffers when cloning a document for edits #990
cloneDocument runs on every agent edit and used JSON.parse(JSON.stringify()),
which silently drops values JSON can't represent. The document holds several:
package.headers/footers/media are Maps (-> {}), originalBuffer and each
MediaFile.data are ArrayBuffers (-> {}), and properties.created/modified are
Dates (-> strings).
So after the first edit, export broke: repackDocx's collectParts threw
'map.entries is not a function' on the dead headers/footers, or JSZip.loadAsync
threw "Can't read the data of 'the loaded zip file'" on the dead
originalBuffer, and every image was dropped. A no-edit export still worked,
which masked the bug.
Clone with structuredClone, which handles Maps, Dates, and ArrayBuffers.
originalBuffer (the whole source .docx, read-only on export) is shared and
package.media is shallow-copied so the two large binary payloads aren't copied
on every edit; media's MediaFile entries are immutable, so sharing them copies
no image bytes while still isolating per-clone additions.
Adds unit tests for cloneDocument and an edit->toBuffer round-trip regression test.
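The cloning strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code from `helpers.ts`: the `Doc`, `DocPackage`, and `MediaFile` shapes and the name `cloneDocumentSketch` are hypothetical and heavily simplified.

```typescript
// Hypothetical, simplified shapes; the real Document type carries many more fields.
interface MediaFile {
  readonly data: ArrayBuffer;
}
interface DocPackage {
  headers: Map<string, string>;
  media: Map<string, MediaFile>;
}
interface Doc {
  properties: { created: Date };
  package: DocPackage;
  originalBuffer: ArrayBuffer;
}

function cloneDocumentSketch(doc: Doc): Doc {
  // Pull the two large payloads out so structuredClone never sees them.
  const { originalBuffer, package: pkg, ...rest } = doc;
  const { media, ...pkgRest } = pkg;

  // Deep-clone everything else; Maps and Dates keep their types.
  const cloned = structuredClone({ ...rest, package: pkgRest });

  return {
    ...cloned,
    // New Map, same MediaFile entries: no image bytes are copied.
    package: { ...cloned.package, media: new Map(media) },
    // Shared reference: treated as read-only.
    originalBuffer,
  };
}

const original: Doc = {
  properties: { created: new Date("2024-01-01T00:00:00Z") },
  package: {
    headers: new Map([["rId1", "header1.xml"]]),
    media: new Map([["word/media/image1.png", { data: new ArrayBuffer(8) }]]),
  },
  originalBuffer: new ArrayBuffer(16),
};

const copy = cloneDocumentSketch(original);
copy.package.media.set("word/media/image2.png", { data: new ArrayBuffer(2) });

console.log(copy.originalBuffer === original.originalBuffer); // true (shared)
console.log(copy.package.headers === original.package.headers); // false (deep-cloned)
console.log(original.package.media.size); // 1 (the addition stayed in the clone)
```

Sharing is only safe under the assumptions the commit message states: nothing mutates `originalBuffer` or an existing `MediaFile` after the clone is made.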
All contributors have signed the CLA ✍️ ✅ Posted by the CLA bot.
Greptile Summary
This PR fixes a critical bug in
Confidence Score: 4/5
Safe to merge; the core change is a well-understood one-function fix backed by targeted regression tests.
The implementation is correct and the test coverage directly exercises the broken path. The only finding is a changeset summary that is too long and leads with internals rather than the user-visible outcome, which is a minor style issue that does not affect correctness.
.changeset/clone-document-preserves-buffer-and-media.md: summary should be trimmed to two lines per project convention.

| Filename | Overview |
|---|---|
| packages/core/src/agent/executor/helpers.ts | Replaces JSON.parse/stringify clone with structuredClone, sharing originalBuffer and shallow-copying the media Map for performance; correct fix for the described corruption bug |
| packages/core/src/agent/executor/tests/cloneDocument.test.ts | New unit tests covering Maps, ArrayBuffers, isolation of the shallow-copied media Map, and structural deep-clone; test cases are well-targeted at the exact fields that were broken |
| packages/core/src/agent/tests/editExportRoundtrip.test.ts | New regression tests confirming fromBuffer → edit → toBuffer succeeds and returns a valid ZIP; validates the previously-broken end-to-end path |
| .changeset/clone-document-preserves-buffer-and-media.md | Changeset has the correct type and package name, but its summary exceeds the project's two-line length limit and leads with technical internals rather than the user-visible outcome |
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Agent
participant cloneDocument
participant structuredClone
participant Export as toBuffer / repackDocx
Agent->>cloneDocument: doc (with originalBuffer, media Map, headers Map)
Note over cloneDocument: Extract originalBuffer + media<br/>before cloning
cloneDocument->>structuredClone: "doc minus originalBuffer & media"
structuredClone-->>cloneDocument: deep clone (Maps, Dates, ArrayBuffers preserved)
Note over cloneDocument: Re-attach originalBuffer (shared ref)<br/>Re-attach media (new Map, shared MediaFile entries)
cloneDocument-->>Agent: cloned Document
Agent->>Export: cloned Document (edit applied)
Export->>Export: JSZip.loadAsync(originalBuffer) ✅
Export->>Export: headers/footers.entries() ✅
Export->>Export: media.get(path).data ✅
| '@eigenpal/docx-editor-core': patch | ||
| --- | ||
| Fix headless agent edits corrupting the document. `cloneDocument` (run on every agent edit) used `JSON.parse(JSON.stringify())`, which silently dropped values JSON can't represent: the `headers`/`footers`/`media` `Map`s became `{}` and `originalBuffer` became `{}`. As a result, the first edit broke export — `repackDocx` threw `Can't read the data of 'the loaded zip file'` (dead `originalBuffer`) or `map.entries is not a function` (dead headers/footers) — and dropped every image. Clone with `structuredClone` instead, sharing the read-only `originalBuffer` and shallow-copying the immutable `media` map so large binary payloads aren't copied on every edit. |
The changeset summary is far longer than the project's two-line limit (CLAUDE.md: "Keep it concise (one or two lines), lead with the user-visible change (what changed, not how)"). The current text runs five-plus sentences and front-loads the internal mechanics (`JSON.parse/stringify`, stack traces) rather than the consumer-visible outcome. It also doesn't include a `Fixes #N` reference despite the PR describing a clear regression.
Suggested change:
Fix headless agent export: the first edit no longer corrupts the document or drops images. `cloneDocument` now uses `structuredClone` so Maps and ArrayBuffers survive each edit.
Context Used: CLAUDE.md (source)
Closes #992
Summary
`cloneDocument` runs on every agent edit (`executeCommand` → `cloneDocument`) and used `JSON.parse(JSON.stringify(doc))`. JSON can't represent several fields the document carries, so the clone silently corrupts them:

- `package.headers` / `package.footers` / `package.media` are `Map`s → become `{}`
- `originalBuffer` and each `MediaFile.data` are `ArrayBuffer`s → become `{}`
- `package.properties.created` / `modified` are `Date`s → become strings

The damage is load-bearing: a no-edit export works, but after the first edit export breaks and images vanish.
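The three conversions listed above can be reproduced without the library. The snippet below is a minimal sketch: `SampleDoc` is a hypothetical stand-in for the affected fields, not the real document type.

```typescript
// Hypothetical stand-in for the fields named above; not the library's Document type.
interface SampleDoc {
  headers: Map<string, string>;
  originalBuffer: ArrayBuffer;
  created: Date;
}

const rawBytes = new ArrayBuffer(4);
new Uint8Array(rawBytes).set([0x50, 0x4b, 0x03, 0x04]);

const sample: SampleDoc = {
  headers: new Map([["rId1", "header1.xml"]]),
  originalBuffer: rawBytes,
  created: new Date("2024-01-01T00:00:00Z"),
};

// JSON round-trip: Map and ArrayBuffer have no enumerable own properties,
// so both serialize as "{}"; Date serializes to an ISO string via toJSON().
const viaJson = JSON.parse(JSON.stringify(sample));

// structuredClone keeps all three types and their contents.
const viaStructured = structuredClone(sample);

console.log(viaJson.headers instanceof Map); // false
console.log(Object.keys(viaJson.originalBuffer).length); // 0
console.log(typeof viaJson.created); // "string"
console.log(viaStructured.headers.get("rId1")); // "header1.xml"
console.log(viaStructured.originalBuffer.byteLength); // 4
console.log(viaStructured.created instanceof Date); // true
```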
Repro
Depending on the document, the throw is either:
- `Can't read the data of 'the loaded zip file'`: `repackDocx` → `JSZip.loadAsync(originalBuffer)` on the `{}` left where the `ArrayBuffer` was, or
- `map.entries is not a function`: `repackDocx`'s `collectParts` iterating `package.headers`/`footers`, now `{}` instead of `Map`s.

Any document with images also loses them on the first edit (the `media` `Map` is gone).

This blocks the headless generate → edit → export workflow entirely (e.g. server-side document generation that never opens the editor).
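The second failure mode is easy to trigger in isolation. In this sketch, `listParts` is a hypothetical stand-in for a function that iterates a `Map` the way `collectParts` is described as doing; the real implementation is not shown here.

```typescript
// Hypothetical stand-in for code that iterates a Map of package parts.
function listParts(map: Map<string, string>): string[] {
  return Array.from(map.entries(), ([path]) => path);
}

const liveHeaders = new Map([["word/header1.xml", "<w:hdr/>"]]);

// After a JSON round-trip the Map is a plain {} (typed as any, so TypeScript
// cannot flag the mismatch at compile time).
const deadHeaders = JSON.parse(JSON.stringify(liveHeaders));

let caught: unknown;
try {
  listParts(deadHeaders);
} catch (err) {
  caught = err;
}

console.log(listParts(liveHeaders)); // [ 'word/header1.xml' ]
console.log(caught instanceof TypeError); // true
```

The exact message text depends on the JavaScript engine; the point is that the error surfaces at export time, far from the clone that caused it.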
Fix
Clone with `structuredClone`, which handles `Map`, `Date`, and `ArrayBuffer` correctly. Two potentially large binary payloads are special-cased so they aren't deep-copied on every edit:

- `originalBuffer`: the entire source `.docx`; read-only on export → shared.
- `package.media`: its `MediaFile` entries are immutable → shallow-copied `Map` (no image bytes copied; per-clone additions stay isolated).

Testing
- `cloneDocument.test.ts`: asserts `originalBuffer` (bytes), `media` (`Map` + binary), header/footer `Map`s, structural deep-clone, and media-copy isolation all survive.
- `editExportRoundtrip.test.ts`: `fromBuffer → insertText/insertTable → toBuffer` now succeeds and yields a valid `.docx` (PK zip); this threw before.
- 1459 pass, 0 fail. `typecheck` + `lint` clean.

Notes
- `cloneDocument`: no public API surface change.
- Changeset: `patch`, `@eigenpal/docx-editor-core`.
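The "PK zip" check mentioned under Testing could look something like the sketch below. `looksLikeZip` is a hypothetical helper, not taken from the test file; it only inspects the four-byte ZIP local-file-header signature and assumes a non-empty archive.

```typescript
// Hypothetical helper: checks for the ZIP local-file-header signature "PK\x03\x04".
function looksLikeZip(buffer: ArrayBuffer): boolean {
  const bytes = new Uint8Array(buffer);
  return (
    bytes.length >= 4 &&
    bytes[0] === 0x50 && // 'P'
    bytes[1] === 0x4b && // 'K'
    bytes[2] === 0x03 &&
    bytes[3] === 0x04
  );
}

const zipLike = new ArrayBuffer(6);
new Uint8Array(zipLike).set([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]);

console.log(looksLikeZip(zipLike)); // true
console.log(looksLikeZip(new ArrayBuffer(0))); // false
```

A signature check is a cheap smoke test only; a stricter test would reopen the exported buffer and inspect its parts.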