Commit 4bf9be3
PARQUET-136: NPE thrown in StatisticsFilter when all values in a string/binary column chunk are null
When every value in a binary column is null, the statistics object read from the file metadata is empty, and the all-nulls check for that column should return true. Even if the column has no values, it can be ignored.
The alternative is to fix this behaviour in the writer, but is that what we want?
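The reader-side situation described above can be sketched roughly as follows. This is a minimal illustration, not the parquet-mr code: `EmptyStatsSketch`, `ChunkStats`, `canDropForEq`, and all field names are hypothetical, introduced only for this example. It assumes that an empty statistics object carries neither min/max nor a null count, so a filter has to keep the chunk instead of dereferencing a missing minimum (which is where a NullPointerException would otherwise come from).

```java
import java.util.Objects;

final class EmptyStatsSketch {
    /** Hypothetical stand-in for per-chunk column statistics; not the parquet-mr class. */
    static final class ChunkStats {
        final String min;       // null when no non-null value was recorded
        final String max;       // null when no non-null value was recorded
        final long numNulls;    // number of nulls recorded in the statistics
        final long valueCount;  // total values in the chunk, nulls included

        ChunkStats(String min, String max, long numNulls, long valueCount) {
            this.min = min;
            this.max = max;
            this.numNulls = numNulls;
            this.valueCount = valueCount;
        }

        /** Assumption: "empty" means nothing at all was recorded. */
        boolean isEmpty() {
            return min == null && max == null && numNulls == 0;
        }

        boolean isAllNulls() {
            return valueCount > 0 && numNulls == valueCount;
        }
    }

    /** Returns true only when the chunk provably contains no row equal to {@code value}. */
    static boolean canDropForEq(ChunkStats stats, String value) {
        Objects.requireNonNull(value, "value");
        if (stats.isEmpty()) {
            return false; // nothing is known about the chunk, so it must be read
        }
        if (stats.isAllNulls()) {
            return true;  // only nulls: no row can equal a non-null value
        }
        if (stats.min == null || stats.max == null) {
            return false; // inconsistent statistics: stay conservative
        }
        return value.compareTo(stats.min) < 0 || value.compareTo(stats.max) > 0;
    }

    public static void main(String[] args) {
        ChunkStats empty = new ChunkStats(null, null, 0, 10);
        ChunkStats allNulls = new ChunkStats(null, null, 10, 10);
        System.out.println("drop chunk with empty stats: " + canDropForEq(empty, "a"));
        System.out.println("drop chunk with all nulls:   " + canDropForEq(allNulls, "a"));
    }
}
```

The ordering of the checks is the point of the sketch: the empty case is tested before anything touches `min` or `max`, and it answers "keep" because missing statistics prove nothing about the chunk.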
Author: Yash Datta <Yash.Datta@guavus.com>
Author: Alex Levenson <alexlevenson@twitter.com>
Author: Yash Datta <saucam@gmail.com>
Closes apache#99 from saucam/npe and squashes the following commits:
5138e44 [Yash Datta] PARQUET-136: Remove unreachable block
b17cd38 [Yash Datta] Revert "PARQUET-161: Trigger tests"
82209e6 [Yash Datta] PARQUET-161: Trigger tests
aab2f81 [Yash Datta] PARQUET-161: Review comments for the test case
2217ee2 [Yash Datta] PARQUET-161: Add a test case for checking the correct statistics info is recorded in case of all nulls in a column
c2f8d6f [Yash Datta] PARQUET-161: Fix the write path to write statistics object in case of only nulls in the column
97bb517 [Yash Datta] Revert "revert TestStatisticsFilter.java"
a06f0d0 [Yash Datta] Merge pull request apache#1 from isnotinvain/alexlevenson/PARQUET-161-136
b1001eb [Alex Levenson] Fix statistics isEmpty, handle more edge cases in statistics filter
0c88be0 [Alex Levenson] revert TestStatisticsFilter.java
1ac9192 [Yash Datta] PARQUET-136: It's better not to filter chunks for which an empty statistics object is returned. Empty statistics can be read from 1. pre-statistics files, 2. files written by the current writer, which has a bug: it does not write the statistics if a column has all nulls
e5e924e [Yash Datta] Revert "PARQUET-136: In case of all nulls in a binary column, statistics object read from file metadata is empty, and should return true for all nulls check for the column"
8cc5106 [Yash Datta] Revert "PARQUET-136: fix hasNulls to cater to the case where all values are nulls"
c7c126f [Yash Datta] PARQUET-136: fix hasNulls to cater to the case where all values are nulls
974a22b [Yash Datta] PARQUET-136: In case of all nulls in a binary column, statistics object read from file metadata is empty, and should return true for all nulls check for the column
1 parent d70fdbc commit 4bf9be3
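The write-path fix (c2f8d6f) and the `isEmpty` fix (b1001eb) in the squashed commits above can be pictured with a small accumulator. This is a hedged sketch under stated assumptions, not the parquet-mr implementation: `StatsAccumulator` and its methods are hypothetical names. It assumes statistics count as empty only when neither a non-null value nor a null has been recorded, so a column containing only nulls still yields a non-empty statistics object, just without min/max.

```java
final class StatsAccumulator {
    private String min;               // stays null until a non-null value arrives
    private String max;               // stays null until a non-null value arrives
    private long numNulls;            // nulls seen so far
    private boolean hasNonNullValue;  // true once min/max are meaningful

    /** Records one value; nulls only bump the null counter. */
    void update(String value) {
        if (value == null) {
            numNulls++;
            return;
        }
        if (!hasNonNullValue) {
            min = value;
            max = value;
            hasNonNullValue = true;
            return;
        }
        if (value.compareTo(min) < 0) {
            min = value;
        }
        if (value.compareTo(max) > 0) {
            max = value;
        }
    }

    /** Assumption: empty means nothing was recorded, neither a value nor a null. */
    boolean isEmpty() {
        return !hasNonNullValue && numNulls == 0;
    }

    boolean hasNonNullValue() {
        return hasNonNullValue;
    }

    long getNumNulls() {
        return numNulls;
    }

    String getMin() {
        return min;
    }

    String getMax() {
        return max;
    }
}
```

With this definition a reader can tell "no statistics recorded" (a pre-statistics file or the writer bug) apart from "statistics recorded, all values null", which is the distinction the filter needs.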
File tree
11 files changed, +152 -47 lines changed
- parquet-column/src/main/java/parquet/column/statistics
- parquet-hadoop/src
  - main/java/parquet
    - filter2/statisticslevel
    - format/converter
  - test/java/parquet
    - filter2/statisticslevel
    - hadoop
Lines changed: 7 additions & 5 deletions
Lines changed: 6 additions & 4 deletions
Lines changed: 6 additions & 4 deletions
Lines changed: 5 additions & 3 deletions
Lines changed: 5 additions & 3 deletions
Lines changed: 6 additions & 4 deletions
Lines changed: 19 additions & 5 deletions