Commit 5388bbb
feat: Add percentile_cont aggregate function (apache#17988)
## Summary
Adds an exact `percentile_cont` aggregate function as the counterpart to
the existing `approx_percentile_cont` function.
## What changes were made?
### New Implementation
- Created `percentile_cont.rs` with full implementation
  - `PercentileCont` struct implementing `AggregateUDFImpl`
  - `PercentileContAccumulator` for standard aggregation
  - `DistinctPercentileContAccumulator` for DISTINCT mode
  - `PercentileContGroupsAccumulator` for efficient grouped aggregation
  - `calculate_percentile` function with linear interpolation
### Features
- **Exact calculation**: Stores all values in memory for precise results
- **WITHIN GROUP syntax**: Supports `WITHIN GROUP (ORDER BY ...)`
- **Interpolation**: Uses linear interpolation between values
- **All numeric types**: Works with integers, floats, and decimals
- **Ordered-set aggregate**: Reports itself as an ordered-set aggregate
  via `is_ordered_set_aggregate()`
- **GROUP BY support**: Efficient grouped aggregation via
  `GroupsAccumulator`
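To make the interpolation concrete, here is a minimal standalone sketch of a continuous percentile over `f64` values. It is not the `calculate_percentile` function from this commit: the name `percentile_cont_sketch`, its signature, and the choice to return `None` for empty input or an out-of-range percentile are assumptions made for illustration.

```rust
/// Hypothetical helper, not the commit's `calculate_percentile`.
/// Returns `None` for an empty input or a percentile outside [0, 1]
/// (both behaviours are assumptions of this sketch).
fn percentile_cont_sketch(values: &[f64], percentile: f64) -> Option<f64> {
    if values.is_empty() || !(0.0..=1.0).contains(&percentile) {
        return None;
    }

    // An exact percentile needs every value, in sorted order.
    let mut sorted = values.to_vec();
    // `total_cmp` is a total order on f64, so the sort cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));

    // Fractional 0-based position of the requested percentile.
    let pos = percentile * (sorted.len() - 1) as f64;
    let lower = pos.floor() as usize;
    let upper = pos.ceil() as usize;
    let frac = pos - lower as f64;

    // Linear interpolation between the two neighbouring values.
    Some(sorted[lower] + frac * (sorted[upper] - sorted[lower]))
}
```

For the values 1, 2, 3, 4 and a percentile of 0.75, the position is 0.75 * 3 = 2.25, so the sketch returns 3 + 0.25 * (4 - 3) = 3.25. With a percentile of 0.5 it returns 2.5, the same value a median would give.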
### Tests
Added comprehensive tests in `aggregate.slt`:
- Error conditions validation
- Basic percentile calculations (0.0, 0.25, 0.5, 0.75, 1.0)
- Comparison with `median` function
- Ascending and descending order
- GROUP BY aggregation
- NULL handling
- Edge cases (empty sets, single values)
- Float interpolation
- Various numeric data types
## Example Usage
```sql
-- Basic usage with WITHIN GROUP syntax
SELECT percentile_cont(0.75) WITHIN GROUP (ORDER BY column_name)
FROM table_name;
-- With GROUP BY
SELECT category, percentile_cont(0.95) WITHIN GROUP (ORDER BY value)
FROM sales
GROUP BY category;
-- Compare with median (percentile_cont(0.5) == median)
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY price) FROM products;
```
## Performance Considerations
Like `median`, this function stores all values in memory before
computing results. For large datasets or when approximation is
acceptable, use `approx_percentile_cont` instead.
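The memory cost can be illustrated with a small buffering accumulator. This is a sketch under stated assumptions, not the commit's `PercentileContAccumulator`: the type `BufferingPercentile` and its methods are hypothetical and do not mirror DataFusion's `Accumulator` trait, and skipping NULLs (modelled as `None`) and returning `None` for an empty set are assumed behaviours.

```rust
/// Hypothetical accumulator, for illustration only.
struct BufferingPercentile {
    percentile: f64,
    /// Grows by one entry per non-NULL input row: O(n) memory.
    values: Vec<f64>,
}

impl BufferingPercentile {
    fn new(percentile: f64) -> Self {
        assert!((0.0..=1.0).contains(&percentile), "percentile must be in [0, 1]");
        Self { percentile, values: Vec::new() }
    }

    /// Buffers one batch. `None` models SQL NULL and is skipped (assumption).
    fn update_batch(&mut self, batch: &[Option<f64>]) {
        self.values.extend(batch.iter().flatten().copied());
    }

    /// Number of values currently held in memory.
    fn buffered(&self) -> usize {
        self.values.len()
    }

    /// Sorts a copy of the buffered values and interpolates linearly.
    /// Returns `None` when nothing was buffered (assumption).
    fn evaluate(&self) -> Option<f64> {
        if self.values.is_empty() {
            return None;
        }
        let mut sorted = self.values.clone();
        sorted.sort_by(|a, b| a.total_cmp(b));
        let pos = self.percentile * (sorted.len() - 1) as f64;
        let lower = pos.floor() as usize;
        let upper = pos.ceil() as usize;
        let frac = pos - lower as f64;
        Some(sorted[lower] + frac * (sorted[upper] - sorted[lower]))
    }
}
```

Nothing can be discarded before `evaluate` runs, because any buffered value may end up next to the requested position once the data is sorted. An approximate function can instead keep a bounded summary of the input, which is the trade-off behind the recommendation above.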
## Related Issues
Closes apache#6714
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
1 parent fd07e73 commit 5388bbb
File tree
7 files changed, +1294 -50 lines changed
- datafusion
- functions-aggregate/src
- sqllogictest/test_files
- docs/source/user-guide/sql
Lines changed: 19 additions & 46 deletions