Describe the bug
This occurs only under a very specific combination of options passed to Cobrix.
The error is:
Error while encoding: java.lang.RuntimeException:
org.apache.spark.sql.catalyst.expressions.GenericRow is not a valid external type for schema of string
Code snippet that caused the issue
The following code triggers the error:
val df = spark
.read
.format("cobol")
.option("copybook_contents", copybook)
.option("record_format", "F")
.option("segment_field", "IND")
.option("segment_id_level0", "A")
.option("segment_id_prefix", "ID")
.option("redefine-segment-id-map:0", "SEGMENT1 => A")
.option("redefine-segment-id-map:1", "SEGMENT2 => B")
.option("redefine-segment-id-map:2", "SEGMENT3 => C")
.option("pedantic", "true")
.load("/data/file/location")
(the copybook is provided below)
Expected behavior
spark-cobol should choose the variable-record-length reader with a fixed-record-length record extractor when the user requests segment id generation.
Context
- Cobrix version: 2.7.1
- Spark version: 3.3.4
- Scala version: 2.12
Copybook (if possible)
       01  R.
           05  IND                      PIC X(1).
           05  SEGMENT1.
               10  FIELD1               PIC X(1).
           05  SEGMENT2 REDEFINES SEGMENT1.
               10  FIELD2               PIC X(2).
           05  SEGMENT3 REDEFINES SEGMENT1.
               10  FIELD3               PIC X(3).