Description
Describe the bug
A simple sum over an Int8 column panics with "attempt to add with overflow".
To Reproduce
#[tokio::test]
async fn csv_query_array_agg_simple() -> Result<()> {
    let ctx = SessionContext::new();
    register_aggregate_csv(&ctx).await?;
    let sql =
        "select c2, sum(c3) sum_c3";
    let actual = execute_to_batches(&ctx, sql).await;
    let expected = vec![
        "+--------+",
        "| sum_c3 |",
        "+--------+",
        "| TBD    |",
        "+--------+",
    ];
    assert_batches_eq!(expected, &actual);
    Ok(())
}
Expected behavior
On overflow, the sum should either return null or wrap around at the type boundary instead of panicking.
Additional context
Apache Spark provides two versions of sum: one named Sum that wraps around the boundary, and TrySum that returns null on overflow.
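To make the two overflow behaviors concrete, here is a minimal sketch in plain Rust over `i8` values. The function names `wrapping_sum` and `try_sum` are hypothetical and chosen for illustration only. This is neither DataFusion's nor Spark's implementation; it only uses the standard library's `wrapping_add` and `checked_add`.

```rust
/// Hypothetical helper: sums i8 values, wrapping around at the type boundary
/// (the behavior the issue attributes to Spark's `Sum`).
fn wrapping_sum(values: &[i8]) -> i8 {
    values.iter().fold(0i8, |acc, &v| acc.wrapping_add(v))
}

/// Hypothetical helper: sums i8 values, returning `None` as soon as an
/// addition overflows (the behavior the issue attributes to `TrySum`,
/// with `None` standing in for SQL null).
fn try_sum(values: &[i8]) -> Option<i8> {
    values.iter().try_fold(0i8, |acc, &v| acc.checked_add(v))
}
```

With the input `[100, 100, 100]`, `wrapping_sum` yields `44` (200 wraps to -56, then -56 + 100 = 44) and `try_sum` yields `None`; neither variant panics.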
Spark also widens the result type of sum based on the input type, so that small integer inputs accumulate into a larger type:
protected lazy val resultType = child.dataType match {
  case DecimalType.Fixed(precision, scale) =>
    DecimalType.bounded(precision + 10, scale)
  case _: IntegralType => LongType
  case it: YearMonthIntervalType => it
  case it: DayTimeIntervalType => it
  case _ => DoubleType
}
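The `IntegralType => LongType` case above can be sketched in Rust as accumulating `i8` inputs into an `i64`. The name `widened_sum` is hypothetical, and this is only an illustration of the widening idea, not how DataFusion computes sums.

```rust
/// Hypothetical helper: sums i8 values into a 64-bit accumulator,
/// mirroring the idea of mapping integral inputs to a long result type.
fn widened_sum(values: &[i8]) -> i64 {
    // Each value is converted to i64 before being added, so the
    // intermediate sums are never limited to the i8 range.
    values.iter().map(|&v| i64::from(v)).sum()
}
```

Widening makes overflow far less likely for an Int8 column, but it does not remove the question entirely: a 64-bit accumulator can still overflow on sufficiently large inputs, so a wrap-or-null policy is still needed for the widened type.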