Duplicated last line in CSV.foreach #279

@GabrielNagy

I found a bug that only reproduces with a very specific set of prerequisites (all of the following must be true):

  • file with CRLF endings
  • file with no EOL/trailing newline (removed using a hex editor, since vim always adds it back)
  • file larger than 32768 bytes
  • CSV.foreach
  • option strip: true
  • option skip_lines: /\A,+\n?\z/

The following example (where original.csv is a file containing the line AAAA1234567890 ~2500 times):

CSV.foreach('original.csv', strip: true, skip_lines: /\A,+\n?\z/) do |data|
  puts data[0]
end

will print the last line duplicated (the final value appears twice, concatenated into one field):

AAAA1234567890
AAAA1234567890
AAAA1234567890
AAAA1234567890
AAAA1234567890AAAA1234567890
...
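
The prerequisites listed above can be set up without a hex editor. The following is a minimal sketch, not the reporter's original file: the helper names `write_repro_csv` and `first_columns` are hypothetical, and the row count of 2500 is taken from the "~2500 times" in the description. Whether the duplicated value actually appears depends on the installed csv version, so the script only prints what it observes.

```ruby
require "csv"
require "tmpdir"

# Hypothetical helper: writes `rows` copies of `line` joined by CRLF,
# with no end-of-line after the last row. Binary mode keeps "\r\n" as-is.
def write_repro_csv(path, line: "AAAA1234567890", rows: 2500)
  File.binwrite(path, Array.new(rows, line).join("\r\n"))
end

# Hypothetical helper: collects the first column of every row using the
# same CSV.foreach call and options as in the report.
def first_columns(path)
  values = []
  CSV.foreach(path, strip: true, skip_lines: /\A,+\n?\z/) do |data|
    values << data[0]
  end
  values
end

path = File.join(Dir.tmpdir, "original.csv")
write_repro_csv(path)
values = first_columns(path)

puts "bytes: #{File.size(path)}"   # must exceed 32768 to match the report
puts "rows:  #{values.size}"
puts "last:  #{values.last}"
```

With the defaults the file is larger than 32768 bytes, so all the conditions from the list are met. On an affected csv version, one of the printed values would be expected to show the doubled `AAAA1234567890AAAA1234567890`.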

As a workaround I used CSV.parse(File.read('original.csv'), ...) with the same options, but I still wanted to flag this issue.
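
The workaround can be sketched as follows. This is an illustration under assumptions: the helper name `first_columns_via_parse` and the sample file are hypothetical, and reading the whole file into memory is only reasonable for files of modest size.

```ruby
require "csv"
require "tmpdir"

# Hypothetical helper: reads the whole file into a String and parses it
# with CSV.parse, using the same options as the CSV.foreach call.
def first_columns_via_parse(path)
  rows = CSV.parse(File.read(path), strip: true, skip_lines: /\A,+\n?\z/)
  rows.map { |row| row[0] }
end

# Sample input: CRLF line endings, no trailing EOL, larger than 32768 bytes.
path = File.join(Dir.tmpdir, "original_workaround.csv")
File.binwrite(path, Array.new(2500, "AAAA1234567890").join("\r\n"))

values = first_columns_via_parse(path)
puts values.size
puts values.uniq.inspect
```

Because the parser receives the complete input as one String, it does not read the file in chunks, which is presumably why the duplication does not show up on this path.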
