constant array metadata for small scalars#6363
Conversation
Signed-off-by: Onur Satici <onur@spiraldb.com>
Merging this PR will improve performance by 10.65%
Performance Changes
Comparing Footnotes
|
Benchmarks: PolarSignals ProfilingSummary
Detailed Results Table
|
Benchmarks: FineWeb NVMeSummary
Detailed Results Table
|
Benchmarks: TPC-H SF=1 on NVMESummary
Detailed Results Table
|
Benchmarks: TPC-H SF=1 on S3Summary
Detailed Results Table
|
Benchmarks: TPC-DS SF=1 on NVMESummary
Detailed Results Table
|
Benchmarks: TPC-H SF=10 on NVMESummary
Detailed Results Table
|
Benchmarks: FineWeb S3Summary
Detailed Results Table
|
Benchmarks: Statistical and Population GeneticsSummary
Detailed Results Table
|
Benchmarks: TPC-H SF=10 on S3Summary
Detailed Results Table
|
| // Prefer reading the scalar from inlined metadata to avoid device-to-host copies. | ||
| let sv = if let Some(ref proto_bytes) = metadata.scalar_value { | ||
| ScalarValue::from_protobytes(proto_bytes)? | ||
| } else { | ||
| if buffers.len() != 1 { | ||
| vortex_bail!("Expected 1 buffer, got {}", buffers.len()); | ||
| } | ||
| let buffer = buffers[0].clone().try_to_host_sync()?; | ||
| ScalarValue::from_protobytes(&buffer)? |
There was a problem hiding this comment.
so we duplicate them?
There was a problem hiding this comment.
for small constants yea, to keep forward compat
There was a problem hiding this comment.
also if not and we need the scalar in the device we would need to issue a h2d copy anyway, these values are small enough to just duplicate I think
There was a problem hiding this comment.
does this ever actually get triggered?
There was a problem hiding this comment.
I don't think it does for clickbench or tpch
There was a problem hiding this comment.
but we do have constant arrays and they can't be read to the gpu without an extra host roundtrip, so constantarray::build is definitely triggered on a clickbench scan
There was a problem hiding this comment.
can we find out what scalar this is
Benchmarks: Clickbench on NVMESummary
Detailed Results Table
|
Changes the metadata for `ConstantArray` to `Option<Scalar>`. #6363 added functionality for inlining the constant scalar if it was small enough in the array metadata. This change adds to that by using a scalar directly for this. It is optional since we do not always want to do this. Tests incoming! Also, I think it is worth considering if we should _not_ write the scalars to the buffers if we make this optimization. As is, this is doing double work. --------- Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
scalar value would be stored in the metadata for small scalars as well as in the buffer, this would help decoding a constant array in the gpu without forcing the buffer value to host, and gpu kernels can still use the constant value from the buffer