Commit 31c4fab
[SPARK-26081][SQL] Prevent empty files for empty partitions in Text datasources
## What changes were proposed in this pull request?
In the PR, I propose to postpone creation of `OutputStream`/`Univocity`/`JacksonGenerator` till the first row should be written. This prevents creation of empty files for empty partitions. So, no need to open and to read such files back while loading data from the location.
## How was this patch tested?
Added tests for Text, JSON and CSV datasource where empty dataset is written but should not produce any files.
Closes #23052 from MaxGekk/text-empty-files.
Lead-authored-by: Maxim Gekk <max.gekk@gmail.com>
Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>1 parent 9a09e91 commit 31c4fab
File tree
6 files changed
+64
-22
lines changed- sql/core/src
- main/scala/org/apache/spark/sql/execution/datasources
- csv
- json
- text
- test/scala/org/apache/spark/sql/execution/datasources
- csv
- json
- text
6 files changed
+64
-22
lines changedLines changed: 13 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
169 | 169 | | |
170 | 170 | | |
171 | 171 | | |
172 | | - | |
173 | | - | |
174 | | - | |
175 | | - | |
176 | | - | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
177 | 182 | | |
178 | | - | |
| 183 | + | |
| 184 | + | |
179 | 185 | | |
180 | | - | |
| 186 | + | |
181 | 187 | | |
Lines changed: 10 additions & 9 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
175 | 175 | | |
176 | 176 | | |
177 | 177 | | |
178 | | - | |
179 | | - | |
180 | | - | |
181 | | - | |
182 | | - | |
| 178 | + | |
183 | 179 | | |
184 | 180 | | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
185 | 189 | | |
186 | 190 | | |
187 | 191 | | |
188 | 192 | | |
189 | | - | |
190 | | - | |
191 | | - | |
192 | | - | |
| 193 | + | |
193 | 194 | | |
Lines changed: 12 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
| 20 | + | |
| 21 | + | |
20 | 22 | | |
21 | 23 | | |
22 | 24 | | |
| |||
148 | 150 | | |
149 | 151 | | |
150 | 152 | | |
151 | | - | |
| 153 | + | |
152 | 154 | | |
153 | 155 | | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
154 | 162 | | |
155 | 163 | | |
156 | | - | |
| 164 | + | |
157 | 165 | | |
158 | | - | |
| 166 | + | |
159 | 167 | | |
160 | 168 | | |
161 | 169 | | |
162 | | - | |
| 170 | + | |
163 | 171 | | |
164 | 172 | | |
Lines changed: 9 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1986 | 1986 | | |
1987 | 1987 | | |
1988 | 1988 | | |
| 1989 | + | |
| 1990 | + | |
| 1991 | + | |
| 1992 | + | |
| 1993 | + | |
| 1994 | + | |
| 1995 | + | |
| 1996 | + | |
| 1997 | + | |
1989 | 1998 | | |
Lines changed: 11 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1897 | 1897 | | |
1898 | 1898 | | |
1899 | 1899 | | |
1900 | | - | |
| 1900 | + | |
1901 | 1901 | | |
1902 | 1902 | | |
1903 | 1903 | | |
| |||
1910 | 1910 | | |
1911 | 1911 | | |
1912 | 1912 | | |
1913 | | - | |
| 1913 | + | |
1914 | 1914 | | |
1915 | 1915 | | |
1916 | 1916 | | |
| |||
2555 | 2555 | | |
2556 | 2556 | | |
2557 | 2557 | | |
| 2558 | + | |
| 2559 | + | |
| 2560 | + | |
| 2561 | + | |
| 2562 | + | |
| 2563 | + | |
| 2564 | + | |
| 2565 | + | |
| 2566 | + | |
2558 | 2567 | | |
Lines changed: 9 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
233 | 233 | | |
234 | 234 | | |
235 | 235 | | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
236 | 245 | | |
0 commit comments