Commit 976ac5e
[SPARK-26303][SQL] Return partial results for bad JSON records
## What changes were proposed in this pull request?
In the PR, I propose to return partial results from JSON datasource and JSON functions in the PERMISSIVE mode if some of JSON fields are parsed and converted to desired types successfully. The changes are made only for `StructType`. Whole bad JSON records are placed into the corrupt column specified by the `columnNameOfCorruptRecord` option or SQL config.
Partial results are not returned for malformed JSON input.
## How was this patch tested?
Added new UT which checks converting JSON strings with one invalid and one valid field at the end of the string.
Closes apache#23253 from MaxGekk/json-bad-record.
Lead-authored-by: Maxim Gekk <max.gekk@gmail.com>
Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>1 parent d0a8b20 commit 976ac5e
File tree
8 files changed
+67
-26
lines changed- docs
- python/pyspark/sql
- sql
- catalyst/src/main/scala/org/apache/spark/sql/catalyst
- json
- util
- core/src
- main/scala/org/apache/spark/sql
- streaming
- test/scala/org/apache/spark/sql/execution/datasources/json
8 files changed
+67
-26
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
40 | | - | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
41 | 43 | | |
42 | 44 | | |
43 | 45 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
211 | 211 | | |
212 | 212 | | |
213 | 213 | | |
214 | | - | |
| 214 | + | |
215 | 215 | | |
216 | 216 | | |
217 | 217 | | |
| |||
424 | 424 | | |
425 | 425 | | |
426 | 426 | | |
427 | | - | |
| 427 | + | |
428 | 428 | | |
429 | 429 | | |
430 | 430 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
441 | 441 | | |
442 | 442 | | |
443 | 443 | | |
444 | | - | |
| 444 | + | |
445 | 445 | | |
446 | 446 | | |
447 | 447 | | |
| |||
648 | 648 | | |
649 | 649 | | |
650 | 650 | | |
651 | | - | |
| 651 | + | |
652 | 652 | | |
653 | 653 | | |
654 | 654 | | |
| |||
Lines changed: 20 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| 25 | + | |
25 | 26 | | |
26 | 27 | | |
27 | 28 | | |
28 | 29 | | |
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
32 | | - | |
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
| |||
347 | 347 | | |
348 | 348 | | |
349 | 349 | | |
| 350 | + | |
| 351 | + | |
350 | 352 | | |
351 | 353 | | |
352 | 354 | | |
353 | | - | |
354 | | - | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
| 358 | + | |
| 359 | + | |
| 360 | + | |
| 361 | + | |
355 | 362 | | |
356 | 363 | | |
357 | 364 | | |
358 | 365 | | |
359 | 366 | | |
360 | | - | |
| 367 | + | |
| 368 | + | |
| 369 | + | |
| 370 | + | |
| 371 | + | |
361 | 372 | | |
362 | 373 | | |
363 | 374 | | |
| |||
428 | 439 | | |
429 | 440 | | |
430 | 441 | | |
| 442 | + | |
| 443 | + | |
| 444 | + | |
| 445 | + | |
| 446 | + | |
431 | 447 | | |
432 | 448 | | |
433 | 449 | | |
Lines changed: 10 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
23 | 33 | | |
24 | 34 | | |
25 | 35 | | |
| |||
Lines changed: 8 additions & 8 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
362 | 362 | | |
363 | 363 | | |
364 | 364 | | |
365 | | - | |
| 365 | + | |
366 | 366 | | |
367 | 367 | | |
368 | 368 | | |
| |||
598 | 598 | | |
599 | 599 | | |
600 | 600 | | |
601 | | - | |
602 | | - | |
603 | | - | |
604 | | - | |
605 | | - | |
606 | | - | |
607 | | - | |
| 601 | + | |
| 602 | + | |
| 603 | + | |
| 604 | + | |
| 605 | + | |
| 606 | + | |
| 607 | + | |
608 | 608 | | |
609 | 609 | | |
610 | 610 | | |
| |||
Lines changed: 8 additions & 8 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
273 | 273 | | |
274 | 274 | | |
275 | 275 | | |
276 | | - | |
| 276 | + | |
277 | 277 | | |
278 | 278 | | |
279 | 279 | | |
| |||
360 | 360 | | |
361 | 361 | | |
362 | 362 | | |
363 | | - | |
364 | | - | |
365 | | - | |
366 | | - | |
367 | | - | |
368 | | - | |
369 | | - | |
| 363 | + | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
| 367 | + | |
| 368 | + | |
| 369 | + | |
370 | 370 | | |
371 | 371 | | |
372 | 372 | | |
| |||
Lines changed: 14 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
248 | 248 | | |
249 | 249 | | |
250 | 250 | | |
251 | | - | |
| 251 | + | |
252 | 252 | | |
253 | 253 | | |
254 | 254 | | |
| |||
2563 | 2563 | | |
2564 | 2564 | | |
2565 | 2565 | | |
| 2566 | + | |
| 2567 | + | |
| 2568 | + | |
| 2569 | + | |
| 2570 | + | |
| 2571 | + | |
| 2572 | + | |
| 2573 | + | |
| 2574 | + | |
| 2575 | + | |
| 2576 | + | |
| 2577 | + | |
| 2578 | + | |
2566 | 2579 | | |
0 commit comments