Basic text record format returns 0 records #444

@yruslan

Describe the bug

The basic text record format ('D2'), which uses the underlying Spark RDDs to split text files efficiently, does not produce correct results: it returns 0 records for a file that the 'D' format reads successfully.

To Reproduce

      val df = spark
        .read
        .format("cobol")
        .option("copybook_contents", copybook)
        .option("record_format", "D2")
        .load(path)

      df.count

Returns 0.

      val df = spark
        .read
        .format("cobol")
        .option("copybook_contents", copybook)
        .option("record_format", "D")
        .load(path)

      df.count

Returns a non-zero value.

Expected behaviour

The 'D' and 'D2' formats should behave identically, and return the same number of records, when the input file is in basic ASCII format (e.g. 7-bit English text).
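
A minimal sketch of a regression check for this expectation, using only the options from the reproduction steps. `countRecords` is a hypothetical helper introduced for illustration, and `spark`, `copybook` and `path` are assumed to be defined as in the snippets above (for example in a `spark-shell` session with the data source on the classpath).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not part of the library): counts the records
// read from `path` with the given record format.
def countRecords(spark: SparkSession,
                 copybook: String,
                 path: String,
                 recordFormat: String): Long =
  spark.read
    .format("cobol")
    .option("copybook_contents", copybook)
    .option("record_format", recordFormat)
    .load(path)
    .count()

// `spark`, `copybook` and `path` are assumed to be in scope.
val countD  = countRecords(spark, copybook, path, "D")
val countD2 = countRecords(spark, copybook, path, "D2")

// For a 7-bit ASCII input file both formats are expected to agree.
assert(countD == countD2, s"Record counts differ: D=$countD, D2=$countD2")
```

With the bug described here, the assertion would be expected to fail, since 'D2' returns 0 while 'D' returns a non-zero count.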

Labels: bug (Something isn't working)
