[BUG] NULL value filtering not working correctly #23282

Closed

Description

Hi everybody!

When I import data from Cosmos DB into Databricks and create a temporary view from it, the filter "someField IS NOT NULL" does not work correctly. The only way to make it work is to add one more condition, "LOWER(CAST(someField AS STRING)) <> 'null'", but that should not be needed: this field never holds a string value; it contains either a JSON object or NULL.

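`# Connection settings for the Cosmos DB Spark connector (placeholder values)
connectionConfig = {
    "spark.cosmos.accountEndpoint": "endpoint",
    "spark.cosmos.accountKey": "key",
    "spark.cosmos.database": "database",
    "spark.cosmos.container": "container",
    "spark.cosmos.read.inferSchema.enabled": "true",
}

# Read the container and register it as a temporary view.
# The parentheses let the method chain span several lines.
(
    spark.read
    .format("cosmos.oltp")
    .options(**connectionConfig)
    .load()
    .createOrReplaceTempView("tmp_cosmos_data")
)`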
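To make the comparison between the two filters concrete, here is a minimal sketch that counts the rows returned by each variant on the view. It assumes a Databricks notebook where `spark` is already defined; `null_check_queries` and `compare_counts` are hypothetical helper names, and `someField` is the placeholder column name used in this report.

```python
def null_check_queries(view: str, field: str) -> dict:
    """Build the COUNT queries to compare (view and field names are placeholders)."""
    base = f"SELECT COUNT(*) AS n FROM {view}"
    return {
        # Total number of rows in the view.
        "all_rows": base,
        # The filter that is expected to be sufficient.
        "is_not_null": f"{base} WHERE {field} IS NOT NULL",
        # The same filter plus the extra condition from the workaround.
        "workaround": (
            f"{base} WHERE {field} IS NOT NULL "
            f"AND LOWER(CAST({field} AS STRING)) <> 'null'"
        ),
    }


def compare_counts(spark, view: str, field: str) -> dict:
    """Run each query through spark.sql and return the row count per variant."""
    return {
        name: spark.sql(sql).first()["n"]
        for name, sql in null_check_queries(view, field).items()
    }


# In a notebook, where `spark` is the active SparkSession:
# compare_counts(spark, "tmp_cosmos_data", "someField")
```

If the "is_not_null" and "workaround" counts differ, the difference is the number of rows that pass IS NOT NULL even though the field renders as 'null' when cast to a string.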

Apache Spark 3.1.1
Library: com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.2.0
Operating System: Ubuntu 18.04.5 LTS
Java: Zulu 8.52.0.23-CA-linux64 (build 1.8.0_282-b08)

If you can't reproduce this on your side by simply running the steps above, please let me know. To provide everything this bug report template requires while still meeting privacy requirements, I would need to create a separate Cosmos DB instance and so on. Please let me know if that is necessary.

Thanks in advance.

Regards,
Daniil.

Labels

Client, Cosmos, cosmos:spark3, customer-reported, question
