-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
It is a common code pattern in DataFusion to downcast an ArrayRef to a concrete arrow array type.
There are actually at least three patterns to do so in the datafusion codebase https://github.com/apache/arrow-datafusion/search?l=Rust&q=downcast
I think this leads to
- Poor user experience (as a DataFusion bug might panic and quit the process, which is impolite)
- less readable code
- less efficient feature development and review (as people have to figure out which of the three patterns to use and they can't just look at what convention is used in the rest of the codebase
Pattern 1: downcast_ref()
let array = array.as_any().downcast_ref::<Date32Array>().unwrap();panics on error
Pattern 2: use as_XXX_array functions from arrow
let array = as_boolean_array(&array);panics on error
Pattern 3: check type and return an error:
let array = array.as_any().downcast_ref::<Date32Array>().ok_or(
DataFusionError::Execution("Cast Date32Array error".to_string()),
)?;Returns an error to the user (a better behavior I think)
Describe the solution you'd like
I would like to use one consistent pattern throughout the codebase. Specifically I propose functions that do the error checking / error message generation:
/// Return `array` cast to `Int32Array` otherwise an error
pub fn as_int32_array(array: &dyn Array) -> Result<&Int32Array, DataFusionError> {
array.as_any().downcast_ref::<Int32Array>().ok_or(
DataFusionError::Execution(format!("Expected a Int32Array, got: ", array.data_type()),
)
}
/// And similar functions for the remaining array types
And then change all dynamic casts in the codebase:
- If they are fallable, to use the new functions that make a nice error message
- If they are infallible / somewhere were passing out the result is hard, use arrow functions
Describe alternatives you've considered
Leave the status quo
Additional context
Inspired by @avantgardnerio 's suggestion https://github.com/apache/arrow-datafusion/pull/3110/files#r945847871 (but this is not the first time it has come up)