
Commit a87015e

[SPARK-47125][SQL] Return null if Univocity never triggers parsing
### What changes were proposed in this pull request?

This PR proposes to guard against `tokenizer.getContext` returning `null`. This is similar to #28029.

In the univocity library, `getContext` can seemingly return null if `beginParsing` is not invoked (https://github.com/uniVocity/univocity-parsers/blob/master/src/main/java/com/univocity/parsers/common/AbstractParser.java#L53). This can happen when `parseLine` is not invoked at https://github.com/apache/spark/blob/e081f06ea401a2b6b8c214a36126583d35eaf55f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala#L300, because it is `parseLine` that invokes `beginParsing`.

### Why are the changes needed?

To fix a bug.

### Does this PR introduce _any_ user-facing change?

Yes. In a very rare case, when `CsvToStructs` is used as a sole predicate against an empty row, it might trigger an NPE. This PR fixes it.

### How was this patch tested?

Manually tested; a test case will be added in a separate PR. We should backport this to all branches.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #45210 from HyukjinKwon/SPARK-47125.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent 65fe9ef commit a87015e

1 file changed: +1 -0 lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala

Lines changed: 1 addition & 0 deletions
@@ -136,6 +136,7 @@ class UnivocityParser(

   // Retrieve the raw record string.
   private def getCurrentInput: UTF8String = {
+    if (tokenizer.getContext == null) return null
     val currentContent = tokenizer.getContext.currentParsedContent()
     if (currentContent == null) null else UTF8String.fromString(currentContent.stripLineEnd)
   }
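
The effect of the added guard can be illustrated with a minimal, self-contained sketch. `FakeContext`, `FakeTokenizer`, and `NullGuardSketch` are hypothetical stand-ins invented for this illustration; they are not univocity or Spark classes, and the sketch returns a plain `String` instead of `UTF8String`. The only behavior assumed from the commit message is that the context stays null until `parseLine` has been called.

```scala
// Hypothetical stand-in for the parsing context; not the univocity API.
final class FakeContext(content: String) {
  def currentParsedContent(): String = content
}

// Hypothetical stand-in for the tokenizer: the context is only
// created once parseLine has been invoked, as the commit describes.
final class FakeTokenizer {
  private var context: FakeContext = null

  def getContext: FakeContext = context

  def parseLine(line: String): Array[String] = {
    context = new FakeContext(line)
    line.stripLineEnd.split(",", -1)
  }
}

object NullGuardSketch {
  // Same shape as the patched getCurrentInput: bail out early when
  // parsing never started, instead of dereferencing a null context.
  def currentInput(tokenizer: FakeTokenizer): String = {
    val ctx = tokenizer.getContext
    if (ctx == null) return null
    val content = ctx.currentParsedContent()
    if (content == null) null else content.stripLineEnd
  }
}
```

Without the early return, calling `currentInput` on a tokenizer that has never parsed a line would dereference the null context and throw a `NullPointerException`, which is the failure mode the commit message reports.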
