[SPARK-16764][SQL] Recommend disabling vectorized parquet reader on OutOfMemoryError
## What changes were proposed in this pull request?
We currently don't bound or manage the size of the data arrays used by column vectors in the vectorized reader (they are bounded only by INT.MAX), which may lead to OOMs while reading data. As a short-term fix, this patch intercepts the OutOfMemoryError and suggests that the user disable the vectorized parquet reader.
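
The interception described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the class `ReserveSketch`, the helper `allocateWithHint`, and the message wording are hypothetical names chosen for this example. The only real Spark identifier assumed here is the configuration key `spark.sql.parquet.enableVectorizedReader`.

```java
import java.util.function.IntFunction;

// Hypothetical sketch; the class, method, and message text are not taken from the patch.
final class ReserveSketch {
    // Spark SQL setting that toggles the vectorized parquet reader.
    static final String VECTORIZED_READER_KEY = "spark.sql.parquet.enableVectorizedReader";

    // Runs the given allocation and, if it fails with OutOfMemoryError,
    // rethrows with a message that points the user at the workaround.
    static <T> T allocateWithHint(int requestedCapacity, IntFunction<T> allocator) {
        try {
            return allocator.apply(requestedCapacity);
        } catch (OutOfMemoryError e) {
            String message = "Cannot reserve " + requestedCapacity
                + " elements in the column vector. As a workaround, you can disable"
                + " the vectorized reader by setting " + VECTORIZED_READER_KEY + " to false.";
            // Keep the original error as the cause so the stack trace is preserved.
            throw new RuntimeException(message, e);
        }
    }

    public static void main(String[] args) {
        // Normal case: the allocation succeeds and the array is returned unchanged.
        int[] data = allocateWithHint(1024, int[]::new);
        System.out.println("allocated " + data.length + " ints");

        // Simulated failure: the allocator throws, and the hint is attached.
        try {
            ReserveSketch.<int[]>allocateWithHint(1024, n -> {
                throw new OutOfMemoryError("simulated");
            });
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a wrapper like this, a failed allocation surfaces a message that names the setting, so a user who hits the error can set `spark.sql.parquet.enableVectorizedReader` to `false` and fall back to the non-vectorized reader.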
## How was this patch tested?
Existing Tests
Author: Sameer Agarwal <sameerag@cs.berkeley.edu>
Closes #14387 from sameeragarwal/oom.
(cherry picked from commit 3fd39b8)
Signed-off-by: Reynold Xin <rxin@databricks.com>