
The character sequence "]]" is not written inside cdata section. One "]" is discarded. #100

Open
@lepokle


Problem

Assume you have a CDATA section with the following content:

  This is a ]] > test.

(This is how Atlassian Confluence "escapes" the end-of-CDATA marker within a CDATA section.)
If you write this sequence using XMLStreamWriter2.writeCData, you'll get the following section in the output file:

This is a ] > test.

However, the original sequence is fine from an XML point of view: only the exact sequence ]]> ends a CDATA section, so it should be written unchanged.
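
To back up the claim that the original content is legal inside CDATA, here is a minimal sketch that parses it with the JDK's built-in DOM parser. The class name CDataValiditySketch, the method parseText, and the root wrapper element are made up for illustration and are not part of Aalto or Stax2.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only (not part of Aalto or Stax2).
final class CDataValiditySketch {

    // Parses the given XML and returns the text content of the root element.
    // Any parse failure is rethrown unchecked to keep the sketch short.
    static String parseText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML was rejected by the parser", e);
        }
    }

    public static void main(String[] args) {
        // "]] >" (with a space) does not terminate the CDATA section.
        String xml = "<root><![CDATA[This is a ]] > test.]]></root>";
        System.out.println(parseText(xml)); // prints: This is a ]] > test.
    }
}
```

The parser accepts the document and returns both brackets, so a writer that is handed this text should be able to emit it as-is.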

Reason

Inside ByteXmlWriter.writeCDataContents each character is checked and written to the output. When it encounters a ] character, it checks whether the next character is another ] and whether that one is finally followed by a >. In that case the end sequence is handled properly.
If any other character follows the second ], it continues writing characters, but without writing the first ] to the output buffer. So that character gets lost.
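
The behaviour described above can be reproduced without Aalto in a simplified model. This is only a sketch under assumptions: it writes to a StringBuilder instead of the real byte buffer, the class name CDataLossSketch and the method writeContentsBuggy are invented for this example, and the control flow follows the description in this issue rather than the actual source.

```java
// Simplified, hypothetical model of the loop described above (not Aalto code).
final class CDataLossSketch {

    static String writeContentsBuggy(String text) {
        StringBuilder out = new StringBuilder("<![CDATA[");
        char[] cbuf = text.toCharArray();
        int offset = 0;
        int len = cbuf.length;

        main_loop:
        while (offset < len) {
            char ch = cbuf[offset++];
            if (ch == ']') {
                if (offset < len && cbuf[offset] == ']') {
                    if (offset + 1 < len && cbuf[offset + 1] == '>') {
                        // "]]>" found: write "]]", close the section,
                        // reopen it, and write the '>' into the new section.
                        offset += 2;
                        out.append("]]").append("]]>").append("<![CDATA[").append('>');
                    }
                    // Problem: when no '>' follows, the first ']' held in ch
                    // is skipped here and never reaches the output.
                    continue main_loop;
                }
            }
            out.append(ch);
        }
        return out.append("]]>").toString();
    }

    public static void main(String[] args) {
        System.out.println(writeContentsBuggy("This is a ]] > test."));
        // prints: <![CDATA[This is a ] > test.]]>
    }
}
```

In this model a real "]]>" is still split correctly across two sections; only a "]]" that is followed by something other than > loses a bracket.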

Solution

The other case (no > character follows the double ]) must be handled properly by also writing the first detected ] to the output:
(line 851)

if (offset < len && cbuf[offset] == ']') {
    if ((offset+1) < len && cbuf[offset+1] == '>') {
        // Ok, need to output ']]' first, then end
        offset += 2;
        writeRaw(BYTE_RBRACKET, BYTE_RBRACKET);
        writeCDataEnd();
        // Then new start, and '>'
        writeCDataStart();
        writeRaw(BYTE_GT);
    } else {
        // no end found, write first bracket
        if (_outputPtr >= _outputBufferLen) {
            flushBuffer();
        }
        _outputBuffer[_outputPtr++] = (byte) ch;
    }
    continue main_loop;
}
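
For a quick check of the idea outside of Aalto, the same branch structure can be modelled on a StringBuilder. This is a sketch under assumptions, not the real writer: the class name CDataFixSketch and the method writeContentsFixed are hypothetical, and the buffer handling (flushBuffer, _outputPtr) is replaced by a plain append.

```java
// Simplified, hypothetical model of the proposed fix (not Aalto code).
final class CDataFixSketch {

    static String writeContentsFixed(String text) {
        StringBuilder out = new StringBuilder("<![CDATA[");
        char[] cbuf = text.toCharArray();
        int offset = 0;
        int len = cbuf.length;

        main_loop:
        while (offset < len) {
            char ch = cbuf[offset++];
            if (ch == ']') {
                if (offset < len && cbuf[offset] == ']') {
                    if (offset + 1 < len && cbuf[offset + 1] == '>') {
                        // "]]>" found: write "]]", close the section,
                        // reopen it, and write the '>' into the new section.
                        offset += 2;
                        out.append("]]").append("]]>").append("<![CDATA[").append('>');
                    } else {
                        // No end marker: write the first bracket. The second
                        // one is handled by the next loop iteration.
                        out.append(ch);
                    }
                    continue main_loop;
                }
            }
            out.append(ch);
        }
        return out.append("]]>").toString();
    }

    public static void main(String[] args) {
        System.out.println(writeContentsFixed("This is a ]] > test."));
        // prints: <![CDATA[This is a ]] > test.]]>
    }
}
```

With the else branch in place, both brackets survive, and the handling of a real "]]>" is unchanged.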
