feat[gpu]: dict take values prim #6101
77cf407 to 14cb0dd
Merging this PR will degrade performance by 29.75%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 🆕 | WallTime | u32_values_u8_codes[10M] | N/A | 128.1 µs | N/A |
| 🆕 | WallTime | u32_values_u16_codes[10M] | N/A | 144.6 µs | N/A |
| 🆕 | WallTime | u64_values_u32_codes[10M] | N/A | 274.9 µs | N/A |
| 🆕 | WallTime | u64_values_u8_codes[10M] | N/A | 213.7 µs | N/A |
| ❌ | Simulation | into_canonical_non_nullable[(10000, 100, 0.1)] | 3.8 ms | 4.6 ms | -17.7% |
| ❌ | Simulation | into_canonical_non_nullable[(10000, 100, 0.01)] | 2.2 ms | 3 ms | -27.05% |
| ❌ | Simulation | into_canonical_non_nullable[(10000, 100, 0.0)] | 1.9 ms | 2.7 ms | -29.42% |
| ❌ | Simulation | into_canonical_nullable[(10000, 100, 0.0)] | 4.4 ms | 5.2 ms | -15.62% |
| ❌ | Simulation | canonical_into_non_nullable[(10000, 100, 0.0)] | 1.9 ms | 2.7 ms | -29.75% |
| ❌ | Simulation | canonical_into_non_nullable[(10000, 100, 0.01)] | 2.1 ms | 2.9 ms | -27.39% |
| ⚡ | Simulation | canonical_into_nullable[(10000, 10, 0.0)] | 528.5 µs | 444.1 µs | +19.03% |
| ⚡ | Simulation | canonical_into_nullable[(10000, 100, 0.0)] | 4.9 ms | 4.1 ms | +19.79% |
| ❌ | Simulation | canonical_into_non_nullable[(10000, 100, 0.1)] | 3.7 ms | 4.5 ms | -18.15% |
Comparing ji/dict-array-2 (0c69415) with develop (46708ba)
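The Efficiency column above appears consistent with a relative change of `BASE / HEAD - 1`, so a positive value means HEAD is faster. This is inferred from the reported numbers, not from the benchmarking tool's documentation, and the function name below is hypothetical. A minimal sketch for re-deriving the percentages:

```rust
/// Relative change in percent, computed as (base / head - 1) * 100.
/// Positive: HEAD is faster than BASE. Negative: HEAD is slower.
/// Assumption: formula inferred from the table, not confirmed by the tool.
/// `base` and `head` must be in the same unit (both µs or both ms).
fn efficiency_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}
```

Because the table shows rounded timings, values recomputed this way differ slightly from the reported ones (for example, 528.5 µs against 444.1 µs gives roughly +19.0% rather than +19.03%).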
Footnotes
- 1290 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
Pull request was converted to draft
@claude review
a
vortex-cuda/benches/dict_cuda.rs
Outdated
const BENCH_ARGS: &[(usize, &str)] = &[
    (10_000_000, "10M"),
    (100_000_000, "100M"),
    (1_000_000_000, "1B"),
1B is a very slow run. I'd rather keep 1M and 100K. Expecting 1B inputs seems rather "synthetic"?
How slow was it?
Do you see any throughput difference between 10M, 100M, and 1000M?
I did when we ran it before.
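To compare input sizes on equal footing, wall time can be normalised to elements per second. The helper below is a hypothetical sketch, not code from this PR, and it assumes a reported wall time covers one full pass over all elements of the input:

```rust
/// Elements processed per second for one benchmark run.
/// `elements`: number of decoded elements (e.g. 10_000_000 for "10M").
/// `wall_time_us`: measured wall time in microseconds.
/// Assumption: the wall time covers a single pass over every element.
fn elements_per_second(elements: u64, wall_time_us: f64) -> f64 {
    elements as f64 / (wall_time_us * 1e-6)
}
```

Running this for each entry in `BENCH_ARGS` would show whether throughput is flat across sizes, in which case the largest input adds run time without adding information, or whether it still changes between 10M and 1B.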
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Support CUDA dict decode for primitive types
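For reference, dictionary decoding of primitive values amounts to a gather: each output element is `values[codes[i]]`. The function below is a CPU-side sketch of that operation, useful for checking a GPU result against; it is not the kernel added in this PR, and the name `dict_take` is hypothetical.

```rust
/// CPU reference for dictionary decode: out[i] = values[codes[i]].
/// `V` is the primitive value type (e.g. u32, u64).
/// `C` is an unsigned code type that widens to u64 (u8, u16, u32, u64).
/// Panics if a code is out of range for `values`.
fn dict_take<V: Copy, C: Copy + Into<u64>>(values: &[V], codes: &[C]) -> Vec<V> {
    codes
        .iter()
        .map(|&code| {
            let index: u64 = code.into();
            values[index as usize]
        })
        .collect()
}
```

The generic bounds mirror the value and code type pairs named in the benchmarks, such as `u32` values with `u8` codes or `u64` values with `u32` codes.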