Commit 530efe3
[SPARK-7911] [MLLIB] A workaround for VectorUDT serialize (or deserialize) being called multiple times
~~A PythonUDT shouldn't be serialized into external Scala types in PythonRDD. I'm not sure whether this should fix one of the bugs related to SQL UDT/UDF in PySpark.~~
The fix above didn't work, so I added a workaround instead. If a Python UDF is applied to a Python UDT, the workaround passes the Python SQL types as inputs. This is still incorrect, but at least it doesn't throw exceptions on the Scala side. davies harsha2010
Author: Xiangrui Meng <meng@databricks.com>
Closes apache#6442 from mengxr/SPARK-7903 and squashes the following commits:
c257d2a [Xiangrui Meng] add a workaround for VectorUDT
1 parent 000df2f commit 530efe3
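The diff content is not reproduced on this page, so the following is only a minimal sketch of the general idea the title suggests: making `serialize` and `deserialize` tolerate being called more than once by passing already-converted values through unchanged. It is written in plain Python with toy types. `ToyVector` and `ToyVectorUDT` are hypothetical names, not Spark's API, and this is not the code from the commit.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyVector:
    """Hypothetical user-facing type (stands in for a vector class)."""
    values: Tuple[float, ...]


class ToyVectorUDT:
    """Hypothetical UDT whose conversions tolerate repeated calls.

    A plain tuple ("dense", values) stands in for the serialized SQL row.
    """

    def serialize(self, obj):
        if isinstance(obj, ToyVector):
            return ("dense", obj.values)
        if isinstance(obj, tuple):
            # Already serialized by an earlier call: pass it through.
            return obj
        raise TypeError(f"cannot serialize {type(obj).__name__}")

    def deserialize(self, datum):
        if isinstance(datum, tuple):
            kind, values = datum
            if kind != "dense":
                raise ValueError(f"unknown vector kind: {kind}")
            return ToyVector(tuple(values))
        if isinstance(datum, ToyVector):
            # Already deserialized by an earlier call: pass it through.
            return datum
        raise TypeError(f"cannot deserialize {type(datum).__name__}")


udt = ToyVectorUDT()
v = ToyVector((1.0, 2.0))
row = udt.serialize(v)

# Repeated calls no longer raise; they return the value unchanged.
assert udt.serialize(row) == row
assert udt.deserialize(udt.deserialize(row)) == v
```

The trade-off in a pass-through like this matches the caveat in the commit message: it avoids the exception but hides the underlying type confusion rather than resolving it.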
1 file changed, 14 additions and 5 deletions, in mllib/src/main/scala/org/apache/spark/mllib/linalg