Reading CSV file returns incorrect line break content

When multiple consecutive newline characters appear inside a quoted CSV field, Pandoc coalesces them into a single `SoftBreak` in the resulting AST. According to [RFC 4180][rfc], this appears to be incorrect behavior: the RFC's grammar treats CR and LF like any other character inside a quoted field, so each one is part of the field's data.

Shouldn't the CSV reader return an individual `LineBreak` for each line break in `\r\n\r\n\r\n`, rather than a single `SoftBreak` for the whole run?

At minimum, I would think there should be no information loss during the read, which means encoding the original number of line breaks in some way. Currently, it's not possible to reconstruct the input data accurately from the AST.
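
For comparison, here is a minimal sketch of the lossless behavior I have in mind, using Python's standard `csv` module rather than Pandoc. The sample data (`raw`) and the column names are made up for illustration; the point is only that a reader can hand back the quoted field with all three CRLF pairs intact.

```python
import csv
import io

# Hypothetical sample: one quoted field containing three consecutive CRLF pairs.
raw = 'id,note\r\n1,"first\r\n\r\n\r\nsecond"\r\n'

# newline="" disables newline translation, so the csv module sees the raw CR/LF.
rows = list(csv.reader(io.StringIO(raw, newline="")))

field = rows[1][1]
crlf_count = field.count("\r\n")

print(repr(field))   # the embedded line breaks are still present in the field
print(crlf_count)    # number of CRLF pairs that survived the read
```

With the original count preserved like this, the input can be reconstructed from the parsed data, which is not currently possible from Pandoc's AST.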


Tested with Pandoc 3.1.13.

[rfc]: https://tools.ietf.org/html/rfc4180

Reading CSV file returns incorrect line break content #9797
