Skip to content

Conversation

@ayushtkn
Copy link
Member

What changes were proposed in this pull request?

For invalid entries get_variant() doesn't return null

Why are the changes needed?

To correct the behaviour of to_variant()

Does this PR introduce any user-facing change?

Yes, get_variant() throws instead of returning null

How was this patch tested?

UT

@sonarqubecloud
Copy link

@@ -0,0 +1 @@
SELECT variant_get(parse_json('{"a": 1}'), '$[0]'); No newline at end of file
Copy link
Member

@deniskuzZ deniskuzZ Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it better to do this in itest?

@deniskuzZ
Copy link
Member

deniskuzZ commented Nov 17, 2025

not sure about this, similar change in get_json_object implemented by https://issues.apache.org/jira/browse/HIVE-29277 returns NULL
cc @wecharyu

variant_get(parse_json('{"a": 1}'), '$[0]')

✔ Expected result

NULL (not an error) — because the path '$[0]' does not exist in the JSON structure.

This is consistent with typical JSON-path behavior:
• navigating an invalid path returns NULL

there is a difference between the path that is valid, but matches nothing and invalid syntax

@ayushtkn
Copy link
Member Author

ayushtkn commented Nov 17, 2025

in our case: try_variant_get should return null, in case someone wants null he should use try_variant_get

    name = "try_variant_get",
    value = "_FUNC_(variant, path[, type]) - Extracts a sub-variant from variant according to path, and casts it to type. Returns null on error.",
    extended = """

@deniskuzZ
Copy link
Member

deniskuzZ commented Nov 17, 2025

in our case: try_variant_get should return null, in case someone wants null he should use try_variant_get

    name = "try_variant_get",
    value = "_FUNC_(variant, path[, type]) - Extracts a sub-variant from variant according to path, and casts it to type. Returns null on error.",
    extended = """

I imagined it the following way:

✅ variant_get

Throws an error if the extraction fails

Errors happen when:

  • JSONPath is invalid (syntax error)

    variant_get(col, '$.a[') -- broken bracket
    variant_get(col, '$..a') -- unsupported recursive descent
    variant_get(col, '$[abc]') -- invalid array index
    variant_get(col, '$.a..b') -- illegal step

  • The value cannot be converted to the requested type

    variant_get(parse_json('{"city": "Budapest"}'), '$.city', 'int') -- can't cast String to int

  • The variant structure is malformed

    variant_get(parse_json('{"city": "Buda'), '$.city') -- malformed json

If the path is valid but does not match anything, Iceberg returns NULL (same as normal JSONPath behavior)

✔️ Safe for normal use
✖️ Not safe if the data may be malformed

✅ try_variant_get

Never throws an error

If anything goes wrong, it returns NULL

It swallows:

  • Invalid JSONPath syntax

  • Type conversion failures

  • Unexpected data types

  • Corrupted or partial JSON/VARIANT values

✔️ Safe for messy/unknown data
✔️ Useful for schema evolution
✔️ Good in SELECT queries where one bad row would otherwise fail the job

@ayushtkn
Copy link
Member Author

I am confused. Isn't this what we fix here: variant_get was returning null instead of an error

@deniskuzZ
Copy link
Member

I am confused. Isn't this what we fix here: variant_get was returning null instead of an error

variant_get(parse_json('{"a": 1}'), '$[0]') should return NULL, not error. I've added examples, please take a look

@ayushtkn
Copy link
Member Author

ohhk, you mean to say the current behaviour is correct, we need not to changed anything?

@ayushtkn ayushtkn closed this Nov 17, 2025
@deniskuzZ
Copy link
Member

deniskuzZ commented Nov 17, 2025

ohhk, you mean to say the current behaviour is correct, we need not to changed anything?

it's probably correct for the empty path case, but not for the rest of scenarios (JSONPath is invalid, unsupported type cast, etc) where we return NULL.
Could we check the same in Spark or ask Spark folks to run these scenarios for us?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants