Docx reader: bookmark in header not working when last parPart was a bookmark #9626

Closed
@mbrackeantidot

Explain the problem.
When converting the attached document bookmark_simplified.docx by running pandoc bookmark_simplified.docx, the link at the start of the file is not converted correctly: its target remains the raw bookmark name instead of the identifier of the header it points to.

Expected output:

<p><a href="#referenced-title">A link to the bookmark</a></p>
<p><strong>Accomplices:</strong></p>
<p>There are bookmarks around here.</p>
<h1 id="referenced-title">Referenced title</h1>
<p>Content</p>

Actual output:

<p><a href="#_APPENDIX_5:_FINANCIAL">A link to the bookmark</a></p>
<p><strong>Accomplices:</strong></p>
<p>There are bookmarks around here.</p>
<h1 id="referenced-title">Referenced title</h1>
<p>Content</p>

I found that this happens when the last paragraph part before the header is a bookmark. In that case docxImmedPrevAnchor is still set when the header's own bookmark is processed, so parPartToInlines' treats the header bookmark as immediately following another bookmark and doesn't create a span for it. It also doesn't add the bookmark to the anchor map docxAnchorMap, because the bookmark is in a header and headers are handled by makeHeaderAnchor'. But makeHeaderAnchor' assumes that parPartToInlines' added a span with an anchor, so the header bookmark is never mapped to the header's identifier and the link keeps its original target.
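To make the interaction described above easier to follow, here is a small toy model in Python. It is not pandoc's Haskell code and not the change in the linked commit. It only mimics the behaviour as described in this issue. The names immed_prev_anchor and anchor_map stand in for docxImmedPrevAnchor and docxAnchorMap, the bookmark name _GoBack is a made-up example of a bookmark ending the previous paragraph, and the reset_before_header switch is a hypothetical remedy used purely for comparison.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class State:
    # Stands in for docxImmedPrevAnchor: the anchor of the bookmark seen last.
    immed_prev_anchor: Optional[str] = None
    # Stands in for docxAnchorMap: bookmark name -> identifier used in output.
    anchor_map: Dict[str, str] = field(default_factory=dict)


def bookmark_to_spans(state: State, anchor: str, in_header: bool) -> List[str]:
    """Toy version of the bookmark case; returns the anchor spans it emits."""
    if state.immed_prev_anchor is not None:
        # A bookmark came immediately before: emit no span for this one.
        if not in_header:
            state.anchor_map[anchor] = state.immed_prev_anchor
        return []
    if not in_header:
        state.anchor_map[anchor] = anchor
    state.immed_prev_anchor = anchor
    return [anchor]


def make_header_anchor(state: State, spans: List[str], header_id: str) -> None:
    """Toy version of the header step: maps each emitted span to the header id."""
    for anchor in spans:
        state.anchor_map[anchor] = header_id


def resolve(state: State, target: str) -> str:
    """Rewrite a link target through the anchor map, if it has an entry."""
    return state.anchor_map.get(target, target)


def run(reset_before_header: bool) -> str:
    state = State()
    # The previous paragraph ends with a bookmark (hypothetical name).
    bookmark_to_spans(state, "_GoBack", in_header=False)
    if reset_before_header:
        # Hypothetical remedy: forget the previous anchor at the block boundary.
        state.immed_prev_anchor = None
    # The header carries the bookmark that the link points to.
    spans = bookmark_to_spans(state, "_APPENDIX_5:_FINANCIAL", in_header=True)
    make_header_anchor(state, spans, "referenced-title")
    return resolve(state, "_APPENDIX_5:_FINANCIAL")


if __name__ == "__main__":
    print(run(reset_before_header=False))  # link target is left unresolved
    print(run(reset_before_header=True))   # link target becomes the header id
```

In this model the stale immed_prev_anchor is the only thing that differs between the two runs, which matches the explanation above: the header step has no span to work with, so the bookmark never gets an entry in the map.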

To verify my understanding of the problem, I implemented a fix and tested it manually. If you're curious, the code can be found here: 09fc204

I would like to create a proper commit containing that fix and make a pull request. Do I have your blessing?

Pandoc version?
Latest development version, compiled from source on macOS. Older versions on Linux and macOS had the same bug.
