[BUG] Loading a missing column from a Parquet file results in ArrayIndexOutOfBoundsException #11278

Open
@jlowe

Description

Describe the bug
Attempting to load a non-existent column from a Parquet file throws ArrayIndexOutOfBoundsException instead of a more specific error.

Steps/Code to reproduce bug
Try to load a single column whose name does not exist in the Parquet file. For example:

Table.readParquet(ParquetOptions.builder().includeColumn("doesnotexist").build(), new java.io.File("data.parquet"))

throws the following exception:

java.lang.ArrayIndexOutOfBoundsException: 0
  at ai.rapids.cudf.Table.<init>(Table.java:96)
  at ai.rapids.cudf.Table.readParquet(Table.java:974)
  ... 49 elided

Expected behavior
A more descriptive exception and message should be raised instead of an array index error.
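
A minimal sketch of the kind of check that could produce such an error, assuming the list of column names in the file is available before the table is constructed. The class `ColumnCheck` and the method `requireColumnsExist` are hypothetical names used only for illustration; they are not part of the cuDF Java API, and this is not necessarily how a fix would be implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of ai.rapids.cudf.
public class ColumnCheck {

    // Throws IllegalArgumentException naming every requested column
    // that is absent from the available column names.
    public static void requireColumnsExist(List<String> requested, List<String> available) {
        Set<String> known = new LinkedHashSet<>(available);
        List<String> missing = new ArrayList<>();
        for (String name : requested) {
            if (!known.contains(name)) {
                missing.add(name);
            }
        }
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                "Column(s) not found in Parquet file: " + missing
                + "; available columns: " + available);
        }
    }

    public static void main(String[] args) {
        // Assumed schema for illustration; a real check would need the
        // column names from the file being read.
        List<String> schema = Arrays.asList("id", "name");
        try {
            requireColumnsExist(Arrays.asList("doesnotexist"), schema);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this in front of the read, the caller would see which column was missing rather than a bare array index.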

Metadata

Assignees

No one assigned

    Labels

    Java: Affects Java cuDF API.
    Spark: Functionality that helps Spark RAPIDS
    bug: Something isn't working
    good first issue: Good for newcomers
