Description
cosban opened BATCH-2792 and commented
If the first character of the line-ending delimiter appears in a line fewer than N characters before the actual delimiter begins, where N is the length of the delimiter, the reader created by SimpleBinaryBufferedReaderFactory misses the actual line ending and continues reading.
In the example below, a '#' character appears within 5 characters of the actual line ending. As a result, candidateEnding fills up with `#3#@#` before the reader determines that it is not the line ending, appends all of those characters to the line buffer, and continues.
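The behavior described above can be modeled in a few lines. This is a simplified reconstruction based only on this report, not the Spring Batch source; the class `CandidateWindowModel` and the method `firstLine` are hypothetical names used for illustration.

```java
final class CandidateWindowModel {

    // Simplified model of the behavior described in this report (NOT the Spring Batch source).
    // A candidate starts at the first character of the ending, grows to the ending's length,
    // and is flushed into the line buffer as a whole if it does not equal the ending.
    // Assumes a non-empty ending.
    static String firstLine(String input, String ending) {
        StringBuilder buffer = new StringBuilder();
        StringBuilder candidate = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (candidate.length() == 0 && c != ending.charAt(0)) {
                buffer.append(c); // ordinary character, no candidate in progress
                continue;
            }
            candidate.append(c);
            if (ending.contentEquals(candidate)) {
                return buffer.toString(); // full ending matched, line is complete
            }
            if (candidate.length() >= ending.length()) {
                // Window is full but wrong: everything is flushed, including
                // characters that belong to the real ending.
                buffer.append(candidate);
                candidate.setLength(0);
            }
        }
        // End of input reached without seeing the ending.
        return buffer.append(candidate).toString();
    }
}
```

Under this model, the input from the example in this report is returned as one unsplit line, because the flushed window `#3#@#` swallows the first three characters of the real ending.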
I suspect it would be better for the method to fail fast the moment the current candidate no longer matches the provided template, i.e. as soon as `#3` is determined not to be a possible prefix of the template.
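One way to avoid the missed ending is sketched below. It is not the reporter's fail-fast proposal and not the fix adopted in Spring Batch: instead of tracking a separate candidate, it appends every character to the line buffer and checks whether the buffer now ends with the delimiter. The class `DelimitedLineSketch` and its methods are hypothetical names, and the sketch assumes a non-empty delimiter.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class DelimitedLineSketch {

    // Hypothetical helper, not part of Spring Batch. Reads up to the next
    // occurrence of 'ending' and returns the text before it, or null at end of input.
    static String readLine(Reader reader, String ending) throws IOException {
        StringBuilder buffer = new StringBuilder();
        int next;
        while ((next = reader.read()) != -1) {
            buffer.append((char) next);
            int start = buffer.length() - ending.length();
            // The buffer ends with the delimiter exactly when the delimiter
            // is found at position 'start'.
            if (start >= 0 && buffer.indexOf(ending, start) == start) {
                buffer.setLength(start); // strip the delimiter
                return buffer.toString();
            }
        }
        // End of input: return a trailing unterminated line, if any.
        return buffer.length() > 0 ? buffer.toString() : null;
    }

    // Convenience wrapper that splits a whole string into lines.
    static List<String> readAll(String input, String ending) {
        List<String> lines = new ArrayList<>();
        try (Reader reader = new StringReader(input)) {
            String line;
            while ((line = readLine(reader, ending)) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

Because the check looks at the tail of everything read so far, a stray `#` shortly before the real delimiter cannot consume part of it. The cost is one suffix comparison per character, which is acceptable for short delimiters; a prefix-function (KMP-style) matcher would be the usual choice if that mattered.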
I have provided a full example of a situation which causes this to fail below:
EXAMPLE OF BUG: lineEnding = `#@#@#`, lines = `Value_1,Value_2,Value#3#@#@#Value_4,Value_5,Value_6`
BAD OUTPUT: `Value_1` `Value_2` `Value#3#@#@#Value_4` `Value_5` `Value_6`
CORRECT OUTPUT: `Value_1` `Value_2` `Value#3`
`Value_4` `Value_5` `Value_6`
No further details from BATCH-2792