Description
I'm using the Cobrix library to read an EBCDIC binary file obtained as a table extract from DB2. The output of the read is written as a CSV file on HDFS.
To parse the binary file I'm using a copybook file stored on HDFS.
Here is the Spark read command:
spark.read
  .format(sourceFileFormat)
  .option("copybook", copybookStagingHDFSPath)
  .option("schema_retention_policy", "collapse_root")
  .option("string_trimming_policy", "none")
  .load(sourceFileStagingHDFSPath + file)
The resulting CSV output does not distinguish between BLANK ("") and NULL cells: both are written as empty strings.
Is there a way to treat BLANK ("") and NULL values differently, so that the output matches the data in the source DB2 database?
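
To make the distinction concrete, here is a minimal sketch of the write side only. It assumes Spark 2.4 or later, where the CSV writer accepts both the `nullValue` and `emptyValue` options. The sample DataFrame, the `NULL` marker, and the output path are hypothetical and not taken from my job. This would only help if the DataFrame loaded by Cobrix already holds `null` for NULL cells and `""` for BLANK cells, which is not confirmed here.

```scala
import org.apache.spark.sql.SparkSession

object NullVsBlankCsvSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("null-vs-blank-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: row "a" holds a BLANK string, row "b" holds a NULL.
    val df = Seq(("a", ""), ("b", null: String)).toDF("id", "value")

    df.write
      .mode("overwrite")
      .option("header", "true")
      .option("nullValue", "NULL")    // marker written for null cells (assumed choice)
      .option("emptyValue", "\"\"")   // text written for empty-string cells
      .csv("/tmp/null_vs_blank_sketch") // hypothetical output path

    spark.stop()
  }
}
```

With distinct markers on the write side, any remaining ambiguity would have to come from the read step, that is, from how the EBCDIC fields are decoded into strings.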