
Null vs Blank Issue #372

Closed
ilaria-ia opened this issue Jan 26, 2021 · 35 comments · Fixed by #400
Assignees: yruslan
Labels: accepted (Accepted for implementation) · enhancement (New feature or request) · question (Further information is requested)

Comments

@ilaria-ia commented Jan 26, 2021

I'm using the Cobrix library to read an EBCDIC binary file, obtained as a table extract from DB2. The output of the read operation is written as a CSV file on HDFS.
To parse the binary file I'm using a copybook file stored on HDFS.
Here is the Spark read command:
spark.read
  .format(sourceFileFormat)
  .option("copybook", copybookStagingHDFSPath)
  .option("schema_retention_policy", "collapse_root")
  .option("string_trimming_policy", "none")
  .load(sourceFileStagingHDFSPath + file)
The resulting CSV output does not distinguish between BLANK ("") and NULL cells; both are treated as empty strings.

Is there a way to treat BLANK ("") values and NULL values differently, so that the output matches the data in the source DB2 database?

ilaria-ia added the question (Further information is requested) label on Jan 26, 2021
@yruslan commented Jan 27, 2021

Hi,

Cobrix does distinguish between nulls and empty values.
It's up to the CSV writer to write different values for nulls and empty values.

You can achieve this by specifying a value that should be written to a CSV field when the input value is null:

df.write
  .option("nullValue", "NULL")
  .csv("/some/path")

@ilaria-ia commented:

Hi,

Thanks for the reply. I tested the write mode you recommended, but the problem did not go away. Running a show on the DataFrame created by the read operation and comparing it with the source table on DB2, I verified that cells that are null at the source are read as blanks into the DataFrame. So the error is in the read, not the write. In fact, since no null cells are created in the DataFrame, the option .option("nullValue", "NULL") does not produce null values in the CSV written to HDFS.
For completeness, I attach an example that shows the difference between DB2 and the DataFrame after reading.

Examble_DB2vsDataframePostRead.xlsx

This is the read operation:
val sourceFileDF: Try[DataFrame] = Try {
  spark.read
    .format(sourceFileFormat)
    .option("copybook", copybookStagingHDFSPath)
    .option("schema_retention_policy", "collapse_root")
    .option("string_trimming_policy", "left")
    .option("inferSchema", false)
    .load(sourceFileStagingHDFSPath + file)
}

Is it possible that I need to add special options to fix the problem?

@yruslan commented Jan 29, 2021

I see. What are the data types of the problematic columns? What do their PICs look like in the copybook?

I have a suspicion that this might be a limitation of the mainframe format itself. It might be that empty strings are encoded exactly the same as nulls in the export from DB2. If that is the case, there is nothing we can do.

@ilaria-ia commented:

The data types and PICs of the columns are:
campo1 PIC X(20). CHAR(20) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo2 PIC X(4). CHAR(4) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo3 PIC X(102). VARCHAR(100) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo4 PIC X(8). CHAR(8) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo5 PIC X(2). CHAR(2) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo6 PIC X(3). CHAR(3) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo7 PIC X(1). CHAR(1) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo8 PIC X(10). DATE WITH DEFAULT NULL
FILLER PIC X(1).
campo9 PIC X(20). CHAR(20) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).
campo10 PIC X(102). VARCHAR(100) FOR SBCS DATA WITH DEFAULT NULL
FILLER PIC X(1).

@yruslan commented Jan 29, 2021

That's what I suspected. Fields encoded as 'PIC X' don't support the notion of NULLs. So when you export a table from DB2 to a mainframe file, both empty strings and nulls are encoded as "40404040" (40 is the space character in EBCDIC). If the encoded values are different, it might be possible to differentiate them.
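
A minimal illustration using the plain JVM charset API (not Cobrix itself), assuming the JDK's extended EBCDIC charsets (Cp1047) are available: a space is 0x40 in EBCDIC, so an "empty" CHAR(4) field and a NULL that DB2 exported as spaces are byte-for-byte identical.

val bytes = "    ".getBytes("Cp1047")          // four EBCDIC spaces
println(bytes.map(b => f"$b%02X").mkString)    // prints 40404040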

Could I ask you to run with .option("debug", "true") to see how empty strings and nulls are encoded?
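
A sketch of such a run, reusing the paths from your snippet (with debug enabled, Cobrix adds a hex _debug column next to each field):

val df = spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook", copybookStagingHDFSPath)
  .option("schema_retention_policy", "collapse_root")
  .option("debug", "true")
  .load(sourceFileStagingHDFSPath + file)

df.show(false)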

@ilaria-ia commented Jan 29, 2021

Is this option valid for spark-cobol version 0.2.5? I have Spark 2.2.0.

@yruslan commented Jan 29, 2021

No, this option has been introduced quite recently. But you can try spark-cobol 2.2.0 with Spark 2.2.0.

@ilaria-ia commented:

Unfortunately, with version 2.2.0 I always get this error, which I cannot resolve:
Exception in thread "main" java.lang.NoClassDefFoundError: org/antlr/v4/runtime/CharStreams

@yruslan commented Jan 29, 2021

This might be easy to solve. :)
Get this jar: antlr4-runtime-4.7.2.jar and add it to spark-submit / spark-shell with --jars
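
For example (a hypothetical invocation; jar names and paths depend on your environment):

spark-shell --jars antlr4-runtime-4.7.2.jar,spark-cobol_2.11-2.2.0.jar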

@ilaria-ia commented:

I did this, but the error is:
INFO source.DefaultSource: Cobrix 'spark-cobol' build 2.2.0 (2020-12-30T10:25:41)
ANTLR Tool version 4.7.2 used for code generation does not match the current runtime version 4.5.3
ANTLR Runtime version 4.7.2 used for parser compilation does not match the current runtime version 4.5.3
Exception in thread "main" java.lang.ExceptionInInitializerError

@yruslan commented Jan 29, 2021

It's strange. What's your command line?

@yruslan commented Jan 29, 2021

I checked. It seems older Spark versions used a different ANTLR and provide it at runtime.
You can solve this by compiling the latest Cobrix from sources, replacing the Spark version.

But that won't solve the nulls issue. Alternatively, you can use a HEX editor to look at the data file and check whether nulls are encoded differently from empty strings.

@ilaria-ia commented:

If I replace the FILLERs with this copybook format:
FIELD PIC X(20).
FIELD-NULL REDEFINES FIELD PIC X(20).
do I have to set a particular option?

@yruslan commented Jan 30, 2021

No additional options are necessary here. But what are you trying to achieve?
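
For reference, Cobrix decodes both halves of a REDEFINES pair as separate columns over the same bytes. A sketch of a well-formed pair (level numbers and field names are my assumption, not your actual copybook):

01 RECORD.
   10 FIELD      PIC X(20).
   10 FIELD-NULL REDEFINES FIELD PIC X(20).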

@ilaria-ia commented:

I created the copybook file starting from the DDL and LOAD files, which indicate the data type, the number of occupied positions, and the NULLIF position of each field. I translated NULLIF into FILLER in the copybook, and I would like to try using REDEFINES instead of FILLER, to understand whether the problem is the copybook file itself.

@ilaria-ia commented Feb 2, 2021

Hi,
I tried to use the attached copybook file. The Cobrix library gives me the following error:
scala.MatchError: 5 (of class java.lang.Integer)
at za.co.absa.cobrix.cobol.parser.encoding.EBCDIC.codec(EBCDIC.scala:34)
at za.co.absa.cobrix.cobol.parser.ast.Primitive.getBinarySizeBits(Primitive.scala:90)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.za$co$absa$cobrix$cobol$parser$CopybookParser$$calculatePrimitiveSize$1(CopybookParser.scala:203)
at za.co.absa.cobrix.cobol.parser.CopybookParser$$anonfun$calculateSchemaSizes$1.apply$mcVI$sp(CopybookParser.scala:233)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.calculateSchemaSizes(CopybookParser.scala:214)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.za$co$absa$cobrix$cobol$parser$CopybookParser$$calculateGroupSize$1(CopybookParser.scala:195)
at za.co.absa.cobrix.cobol.parser.CopybookParser$$anonfun$calculateSchemaSizes$1.apply$mcVI$sp(CopybookParser.scala:232)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.calculateSchemaSizes(CopybookParser.scala:214)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.za$co$absa$cobrix$cobol$parser$CopybookParser$$calculateGroupSize$1(CopybookParser.scala:195)
at za.co.absa.cobrix.cobol.parser.CopybookParser$$anonfun$calculateSchemaSizes$1.apply$mcVI$sp(CopybookParser.scala:232)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.calculateSchemaSizes(CopybookParser.scala:214)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.calculateBinaryProperties(CopybookParser.scala:181)
at za.co.absa.cobrix.cobol.parser.CopybookParser$.parseTree(CopybookParser.scala:153)
at za.co.absa.cobrix.spark.cobol.reader.fixedlen.FixedLenNestedReader.loadCopyBook(FixedLenNestedReader.scala:74)
at za.co.absa.cobrix.spark.cobol.reader.fixedlen.FixedLenNestedReader.<init>(FixedLenNestedReader.scala:41)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createFixedLengthReader(DefaultSource.scala:82)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.buildEitherReader(DefaultSource.scala:69)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createRelation(DefaultSource.scala:50)
at za.co.absa.cobrix.spark.cobol.source.DefaultSource.createRelation(DefaultSource.scala:45)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:307)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:156)
... 48 elided

Should I enter additional options for this copybook?

table.txt

@yruslan commented Feb 7, 2021

Hi, sorry for the late reply. What version of Cobrix are you using, and what exact options are you passing to Spark?

This is a very strange error.

@ilaria-ia commented:

These are the dependencies:

<dependency>
    <groupId>za.co.absa.cobrix</groupId>
    <artifactId>spark-cobol</artifactId>
    <version>0.2.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.scodec/scodec-core -->
<dependency>
    <groupId>org.scodec</groupId>
    <artifactId>scodec-core_2.11</artifactId>
    <version>1.10.3</version>
</dependency>

<!-- https://mvnrepository.com/artifact/org.scodec/scodec-bits -->
<dependency>
    <groupId>org.scodec</groupId>
    <artifactId>scodec-bits_2.11</artifactId>
    <version>1.1.4</version>
</dependency>

<!-- https://mvnrepository.com/artifact/za.co.absa.cobrix/cobol-parser -->
<dependency>
    <groupId>za.co.absa.cobrix</groupId>
    <artifactId>cobol-parser</artifactId>
    <version>0.2.5</version>
</dependency>

The Spark read command is:
spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook", "file:///.../file.copy")
  .option("schema_retention_policy", "collapse_root")
  .load("/path/file_20210119140000")

@yruslan commented Feb 9, 2021

Before I take a look: why do you use 0.2.5? The latest version that does not use the ANTLR runtime is 0.5.6. Please try 0.5.6 first.

Also, you don't need to include the scodec and cobol-parser dependencies. They are transitive and will be added automatically.
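
A minimal sketch of the dependency section under that assumption (the exact coordinates for 0.5.6 are an assumption on my part; please verify on Maven Central):

<dependency>
    <groupId>za.co.absa.cobrix</groupId>
    <artifactId>spark-cobol</artifactId>
    <version>0.5.6</version>
</dependency>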

@ilaria-ia commented:

Thanks a lot for the tip. I changed the version and the error changed. The error is now:

There are some files in hdfs://path/fileBin that are NOT DIVISIBLE by the RECORD SIZE calculated from the copybook (594 bytes per record).

I used this command:
spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook", copybookStagingHDFSPath)
  .option("schema_retention_policy", "collapse_root")
  .load(sourceFileStagingHDFSPath + file)

The copybook file I use is the official one for the DB2 table, so it is strange that it does not match the binary file.
Do you have any suggestions?

@yruslan commented Feb 10, 2021

You can try

.option("debug_ignore_file_size", "true")

to suppress that message and fetch just the first several records (e.g. df.show()). But it won't work when reading the full file.

It's possible that tools that transfer mainframe files to a PC add additional headers to records (an RDW, for instance). So without looking at the actual file it is hard to tell why the data doesn't match the copybook. Maybe the above workaround will give you a clue. It's strange that you didn't get the same error message when loading via 0.2.5.

In addition, you can compare the layout of the fields with the copybook or a similar layout produced on the mainframe to ensure they match.
Also, you can validate that the record size calculated by Cobrix (594 bytes) is correct by opening the file in a HEX editor and checking the data there.
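
Putting it together, a sketch of the suggested diagnostic run, reusing the paths from your snippet:

val df = spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook", copybookStagingHDFSPath)
  .option("schema_retention_policy", "collapse_root")
  .option("debug_ignore_file_size", "true")  // suppress the divisibility check
  .load(sourceFileStagingHDFSPath + file)

df.show(5, false)  // fetch only the first few records; a full scan may still fail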

@ilaria-ia commented:

In the hex editor the record size is 626. In fact, with a different copybook (using FILLERs instead of REDEFINES) I read the binary file correctly, and the size calculated by Cobrix is 626.
I thought I could solve my initial problem (blank vs null) by using another copybook format, i.e. the one with REDEFINES, but this only left me unable to read the binary file at all.

@yruslan commented Feb 10, 2021

We might include a shaded version of the ANTLR runtime in a future version of Cobrix so people can use the latest spark-cobol with older Spark versions.

But back to the original issue. Can you tell me whether nulls are encoded differently from blanks? Maybe you can send a record (or a HEX dump screenshot) and I'll take a look. I need a HEX dump of a record and a description of which field should be null and which should be blank.

@ilaria-ia commented:

Hi, this is a binary file with two records that contain blanks and nulls:
TABLE.zip

This file contains the data types of the columns in DB2:
ddl_table.txt

This is the copybook file with FILLER:
copybook_filler_table.COPY.txt

and this is the copybook file with REDEFINES:
table.txt

Thanks for the support.

@yruslan commented Feb 11, 2021

Great. I've tried to load it using the latest Cobrix:

+------+------------------+------------------+------+------+------+------+------+------+-------+-------+--------------------------------------------------------------------------------------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|FIELD1|FIELD2            |FIELD3            |FIELD4|FIELD5|FIELD6|FIELD7|FIELD8|FIELD9|FIELD10|FIELD11|FIELD12                                                                                     |FIELD13|FIELD14|FIELD15|FIELD16|FIELD17|FIELD18|FIELD19|FIELD20|FIELD21|FIELD22|FIELD23|FIELD24|FIELD25|FIELD26|FIELD27|
+------+------------------+------------------+------+------+------+------+------+------+-------+-------+--------------------------------------------------------------------------------------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|51    |200811251121143139|201111181005012870|null  |      |?     |TEST  |null  |      |?      |?      |XXX  YYY           ?   ?     ?25.11.2008           ?   K  ZZZ  a         ABCDEFG    a &    S|       |       |?      |null   |null   |null   |null   |null   |null   |null   |null   |null   |null   |null   |null   |
|51    |201111181005012870|999912312359599999|null  |      |?     |TEST  |null  |      |?      |       |XXX  YYY           ?   ?     ?25.11.2008           ?   K  ZZZ    a &  *  ABCDEFG    a &  * S|       |       |?      |null   |null   |null   |null   |null   |null   |null   |null   |null   |null   |null   |null   |
+------+------------------+------------------+------+------+------+------+------+------+-------+-------+--------------------------------------------------------------------------------------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+

Is this the expected output or are there unexpected blanks?

I used the following options:

    val df = spark
      .read
      .format("cobol")
      .option("copybook", "copybook_filler_table.COPY.txt")
      .option("schema_retention_policy", "collapse_root") 
      .option("record_length", "395")
      .load("TABLE.HELP")

Here is the same with debugging:

+------+------------+------------------+--------------------+------------------+--------------------+------+------------+------+------------+------+----------------------------------------+------+------------+------+------------+------+------------+-------+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+
|FIELD1|FIELD1_debug|FIELD2            |FIELD2_debug        |FIELD3            |FIELD3_debug        |FIELD4|FIELD4_debug|FIELD5|FIELD5_debug|FIELD6|FIELD6_debug                            |FIELD7|FIELD7_debug|FIELD8|FIELD8_debug|FIELD9|FIELD9_debug|FIELD10|FIELD10_debug|FIELD11|FIELD11_debug                                                                                                                                                                                               |FIELD12                                                                                     |FIELD12_debug                                                                                                                                                                                               |FIELD13|FIELD13_debug                                                                                                                                                                                               |FIELD14|FIELD14_debug   |FIELD15|FIELD15_debug|FIELD16|FIELD16_debug|FIELD17|FIELD17_debug|FIELD18|FIELD18_debug|FIELD19|FIELD19_debug|FIELD20|FIELD20_debug|FIELD21|FIELD21_debug|FIELD22|FIELD22_debug|FIELD23|FIELD23_debug|FIELD24|FIELD24_debug|FIELD25|FIELD25_debug|FIELD26|FIELD26_debug|FIELD27|FIELD27_debug|
+------+------------+------------------+--------------------+------------------+--------------------+------+------------+------+------------+------+----------------------------------------+------+------------+------+------------+------+------------+-------+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+
|51    |000000051C  |200811251121143139|0200811251121143139C|201111181005012870|0201111181005012870C|null  |00000000F5  |      |40400000    |?     |0000000000000000000000000000000000006F00|TEST  |E3C5E2E3    |null  |000CF0      |      |40404000    |?      |0000006F     |?      |00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000006F|XXX  YYY           ?   ?     ?25.11.2008           ?   K  ZZZ  a         ABCDEFG    a &    S|03E7E7E70003E8E8E840404040404040400000006F0000006F00000C00006FF2F54BF1F14BF2F0F0F800000000000000000000006F000004D20003E9E9E90200811251121143690C0007C1C2C3C4C5C6C70201111181005013072C00E2404040404040404040|       |404040404040404040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000|       |0000000000000000|?      |6F           |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |
|51    |000000051C  |201111181005012870|0201111181005012870C|999912312359599999|0999912312359599999C|null  |00000000F5  |      |40400000    |?     |0000000000000000000000000000000000006F00|TEST  |E3C5E2E3    |null  |000CF0      |      |40404000    |?      |0000006F     |       |000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000|XXX  YYY           ?   ?     ?25.11.2008           ?   K  ZZZ    a &  *  ABCDEFG    a &  * S|03E7E7E70003E8E8E840404040404040400000006F0000006F00000C00006FF2F54BF1F14BF2F0F0F800000000000000000000006F000004D20003E9E9E90201111181005013095C0007C1C2C3C4C5C6C70201111181005013095C00E2404040404040404040|       |404040404040404040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000|       |0000000000000000|?      |6F           |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |null   |null         |
+------+------------+------------------+--------------------+------------------+--------------------+------+------------+------+------------+------+----------------------------------------+------+------------+------+------------+------+------------+-------+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+-------+-------------+

@yruslan commented Feb 11, 2021

Here is Cobrix 2.2.0 that embeds ANTLR and can be used with Spark 2.2.0 :)
cobrix_2_2_0_spark_2_2_0.jar.zip

@ilaria-ia commented:

Hi,

thanks for the jar.

The result is not correct. Here is the DB2 data corresponding to the dump file sent previously.

tableInsertDB2.zip

How did you calculate the value of record_length?

@yruslan commented Feb 12, 2021

I calculated the record size by dividing the file size by 2, since I knew the file has 2 records.

Looking at your SQL statement, it seems your copybook doesn't match the data.
For instance, 00 00 00 00 00 F5 is not a valid COMP-3 number. It is in DISPLAY format (or no 'COMP' usage).
Screenshot 2021-02-12 at 14 00 27
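
For intuition, here is a toy encoder of my own (an illustration, not Cobrix code) showing why those bytes cannot be COMP-3: packed decimal stores two digits per byte with a trailing sign nibble (C for positive, D for negative), so 125 packs to 12 5C, while DISPLAY stores one EBCDIC digit per byte (the digit 5 becomes F5).

// Toy COMP-3 (packed decimal) encoder, for illustration only
def toComp3(n: Long): String = {
  val digits = math.abs(n).toString
  val sign   = if (n < 0) "D" else "C"           // trailing sign nibble
  val nibs   = digits + sign
  val padded = if (nibs.length % 2 == 0) nibs else "0" + nibs
  padded.grouped(2).mkString(" ")                // two nibbles per byte
}
println(toComp3(125))  // 12 5C
println(toComp3(-7))   // 7D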

After I changed the copybook to:

      *    10  FILLER PIC X(1).
      *    10 FIELD4 PIC S9(9) COMP-3.
          10 FIELD4 PIC S9(6) DISPLAY.

(removed the FILLER and changed the type of FIELD4)

The field looks OK:
Screenshot 2021-02-12 at 14 02 51

You need to go through your fields one by one and make sure the size and type of each field are defined properly. Otherwise, there is not much Cobrix can do.

If you have difficulties with particular fields, let me know and I can help. But it's up to you to make the copybook match the data.

@ilaria-ia commented:

Hi,

The previously submitted dump was inconsistent with the submitted copybook; for this reason the resulting data was incorrect.
I was able to debug every single column of the dump file with the option debug=true, and I found that fields containing blank and null have different encodings (null --> 0x00, blank --> 0x40). But unfortunately, in the DataFrame after reading, I only see blanks.
The fields that contain blanks and nulls are all of PIC X(n) type in the copybook, each followed by a FILLER. You said in a previous post that there may be a way to tell them apart when their encodings differ.

yruslan self-assigned this on Feb 18, 2021
yruslan added the accepted (Accepted for implementation) and enhancement (New feature or request) labels on Feb 18, 2021
@yruslan commented Feb 18, 2021

Great, that's the difference we can use to distinguish between blanks and nulls.
We'll implement an option, say:

.option("improved_null_detection", "true")

I can send you a snapshot version to test before releasing a new version.

@ilaria-ia commented:

Thank you very much. If you can send it, it will be useful for testing against my data.

@yruslan commented Feb 19, 2021

Cool, will send as soon as it is ready

@yruslan commented Feb 19, 2021

Here is the snapshot:
cobrix_2_2_0_spark_2_2_1-SNAPSHOT.zip

Use

      .option("pedantic", "true")
      .option("improved_null_detection", "true")

Use 'pedantic' to make Cobrix complain if there are redundant options.
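
A sketch of a full read with the snapshot, reusing the paths from earlier in the thread:

val df = spark.read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook", copybookStagingHDFSPath)
  .option("schema_retention_policy", "collapse_root")
  .option("pedantic", "true")                  // fail fast on unrecognized or redundant options
  .option("improved_null_detection", "true")   // decode all-0x00 PIC X fields as null
  .load(sourceFileStagingHDFSPath + file)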

@yruslan commented Feb 24, 2021

Have you had a chance to check the new option?

@yruslan commented Jul 8, 2021

Sorry, it seems the fix wasn't included in 2.2.1. It will be included in 2.2.3.
