PERCENTILE_CONT and PERCENTILE_DISC functions #8807
|
The current implementation has one drawback. According to the description, the <percent> expression must be constant within each aggregate group. I was able to implement a check for the constancy of the <percent> expression. For example:
WITH
T(N, P) AS (
SELECT 1, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 2, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 3, 1 FROM RDB$DATABASE
)
SELECT
PERCENTILE_DISC(P) WITHIN GROUP(ORDER BY N)
FROM T;
This query fails with an error.
But this query is correct because the percentile is included in the GROUP BY clause:
WITH
T(N, P) AS (
SELECT 1, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 2, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 3, 1 FROM RDB$DATABASE
)
SELECT
PERCENTILE_DISC(P) WITHIN GROUP(ORDER BY N)
FROM T
GROUP BY P;
The same applies to using these functions as window functions. However, this check doesn't currently work there. Incorrect example:
WITH
T(N, P) AS (
SELECT 1, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 2, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 3, 1 FROM RDB$DATABASE
)
SELECT
N, P,
PERCENTILE_DISC(P) WITHIN GROUP(ORDER BY N) OVER()
FROM T;
Correct example:
WITH
T(N, P) AS (
SELECT 1, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 2, 0.5 FROM RDB$DATABASE
UNION ALL
SELECT 3, 1 FROM RDB$DATABASE
)
SELECT
N, P,
PERCENTILE_DISC(P) WITHIN GROUP(ORDER BY N) OVER(PARTITION BY P)
FROM T;
I couldn't find a way to check if a certain field is inside the PARTITION BY clause. |
… This prevents crashes for files >= 1KB but < PAGE_SIZE.
…L#8816) * More accurate (methinks) calculation of the average page fill factor * Correction
…o the new datatypes (FirebirdSQL#8815) * Add new 128-bit types to the record layout optimization attempted by gbak * Given the backup file already contains fields in the optimized order, insist on it and prevent the engine from generating field IDs in a different order. This restores the original record layout optimization accidentally broken by my commit #2ed48a6.
…gh-8817 This should fix bug FirebirdSQL#8817 : Fatal lock manager error: invalid lock id
…nded up number. This is how it works on Linux since v3. And this fixes validation of database file < one full page in size.
…tendable statistics (FirebirdSQL#8808) * Add support of grouping page-level I/O counters per pagespace * Add support for per-pagespace I/O statistics. Deprecate non-extendable PerformanceInfo struct in favor of the new PerformanceCounters/PerformanceStats interfaces. Adjust the trace implementation to the new API. * Better names for interface methods. Add the basic docs. Get rid of the separate global counters. Misc renaming. * Add the docs weirdly escaped from the last commit * Follow Dimitry Sibiryakov's suggestion to unify get*Counters methods. * Rename the method
…ebirdSQL#8812) * Fixes a loop in the GENERATE_SERIES function on boundary values. * Correction according to dyemanov * A more complete solution for taking into account boundary values * Refactoring calculation with different types. ArithmeticNode::add is now used as agreed upon by @asfernandes. * Fixed non-ASCII character in variable name * Adjusting indents
…ARAMETER command on functions in packages
Regression of FirebirdSQL#8082 reported in 2025-05-16
…ntax conflicts
…ion are not restored in Firebird 6.0 (FirebirdSQL#8839) * Fix FirebirdSQL#8822: Some procedures containing LIST aggregate function are not restored in Firebird 6.0 * Add comment as requested by Adriano
|
Please adjust the patch using the new BLR verb introduced in #8839. |
…to percentile-functions
|
I seem to have messed up the rebase a bit. Now I can't figure out how to make only my commits visible. |
done |
PERCENTILE_DISC and PERCENTILE_CONT functions
The PERCENTILE_CONT and PERCENTILE_DISC functions are known as inverse distribution functions.
These functions operate on an ordered set. Both functions can be used as aggregate or window functions.
PERCENTILE_DISC
PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model.
It takes a percentile value and a sort specification and returns an element from the set.
Nulls are ignored in the calculation.
Syntax for the PERCENTILE_DISC function as an aggregate function.
Syntax for the PERCENTILE_DISC function as a window function.
The first argument <percent> must evaluate to a numeric value between 0 and 1, because it is a percentile value.
This expression must be constant within each aggregate group.
The ORDER BY clause takes a single expression that can be of any type that can be sorted.
The function PERCENTILE_DISC returns a value of the same type as the argument in ORDER BY.
For a given percentile value P, PERCENTILE_DISC sorts the values of the expression in the ORDER BY clause and returns the value with the smallest CUME_DIST value (with respect to the same sort specification) that is greater than or equal to P.
Analytic Example
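The selection rule described above can be modelled outside the engine. The following is a minimal Python sketch of that rule, not the Firebird implementation: the names `cume_dist` and `percentile_disc` are hypothetical helpers, Python's `None` stands in for SQL NULL, and returning `None` for an empty set is an assumption.

```python
from bisect import bisect_right


def cume_dist(ordered, value):
    """Fraction of rows in the sorted list whose value is <= `value`."""
    return bisect_right(ordered, value) / len(ordered)


def percentile_disc(values, p):
    """Model of PERCENTILE_DISC: the first value in sort order whose
    CUME_DIST is greater than or equal to the percentile p."""
    if not 0 <= p <= 1:
        raise ValueError("percentile must be between 0 and 1")
    # Nulls are ignored in the calculation.
    ordered = sorted(v for v in values if v is not None)
    if not ordered:
        return None  # assumption: an empty set yields NULL
    for v in ordered:
        if cume_dist(ordered, v) >= p:
            return v
    return ordered[-1]  # not reached: the last row always has CUME_DIST = 1


print(percentile_disc([1, 2, 3], 0.5))
```

For the values 1, 2, 3 the cumulative distributions are 1/3, 2/3 and 1, so under this model a percentile of 0.5 selects 2.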
PERCENTILE_CONT
PERCENTILE_CONT is an inverse distribution function that assumes a continuous distribution model.
It takes a percentile value and a sort specification and returns a value interpolated between elements of the set.
Nulls are ignored in the calculation.
Syntax for the PERCENTILE_CONT function as an aggregate function.
Syntax for the PERCENTILE_CONT function as a window function.
The first argument <percent> must evaluate to a numeric value between 0 and 1, because it is a percentile value.
This expression must be constant within each aggregate group.
The ORDER BY clause takes a single expression, which must be of numeric type to perform interpolation.
The PERCENTILE_CONT function returns a value of type DOUBLE PRECISION or DECFLOAT(34) depending on the type of the argument in the ORDER BY clause. A value of type DECFLOAT(34) is returned if ORDER BY contains an expression of one of the types INT128, NUMERIC(38, x) or DECFLOAT(16 | 34); otherwise the result is DOUBLE PRECISION.
The result of PERCENTILE_CONT is computed by linear interpolation between values after ordering them.
Using the percentile value (P) and the number of rows (N) in the aggregation group, you can compute the row number you are interested in after ordering the rows with respect to the sort specification.
This row number (RN) is computed according to the formula RN = 1 + P * (N - 1).
The final result of the aggregate function is computed by linear interpolation between the values from rows at row numbers CRN = CEILING(RN) and FRN = FLOOR(RN).
Analytic Example
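The row-number formula and the interpolation step can be checked by hand with a small model. The following is a minimal Python sketch, not the Firebird implementation: `percentile_cont` is a hypothetical helper, the interpolation weights `(CRN - RN)` and `(RN - FRN)` are the usual linear-interpolation form and are an assumption here, and results are plain Python floats rather than DOUBLE PRECISION or DECFLOAT(34).

```python
from math import ceil, floor


def percentile_cont(values, p):
    """Model of PERCENTILE_CONT using RN = 1 + P * (N - 1)."""
    if not 0 <= p <= 1:
        raise ValueError("percentile must be between 0 and 1")
    # Nulls are ignored in the calculation.
    ordered = sorted(v for v in values if v is not None)
    if not ordered:
        return None  # assumption: an empty set yields NULL
    n = len(ordered)
    rn = 1 + p * (n - 1)            # 1-based, possibly fractional row number
    frn, crn = floor(rn), ceil(rn)  # neighbouring whole row numbers
    low, high = ordered[frn - 1], ordered[crn - 1]
    if frn == crn:
        return float(low)           # RN falls exactly on a row
    # Linear interpolation between the rows at FRN and CRN.
    return (crn - rn) * low + (rn - frn) * high


print(percentile_cont([1, 2, 3, 4], 0.5))
```

With four rows and P = 0.5, RN = 1 + 0.5 * 3 = 2.5, so the model averages the second and third values and yields 2.5.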
An example of using both aggregate functions