planner: NDV estimation formula is too naive #54812
Labels
affects-5.4
This bug affects 5.4.x versions.
affects-6.1
affects-6.5
affects-7.1
affects-7.5
affects-8.1
epic/cardinality-estimation
the optimizer cardinality estimation
report/customer
Customers have encountered this bug.
sig/planner
SIG: Planner
type/enhancement
The issue or PR belongs to an enhancement.
Enhancement
Now our NDV estimation formula is too naive, for example, for the query
select distinct(a) from t where b=1
, we need to estimateNDV(a)
, and now our formula isNDV(a) = NDV_a_from_stats * Selectivity(b=1)
, which might cause a large estimation error.Below is a concrete case:
Then for the query above, to estimate
NDV(a)
, we useNDV_a_from_stats * Selectivity(b=1) = 101 * 100/1100 = 9.18
, but the actual result is100
:Another case is for Join Estimation:
select * from t1, t2 where t1.a=t2.a and t2.b=1
, for this case we might get a wrong estimation result ofNDV(t2.a)
and then finally get a wrong Join Plan.The text was updated successfully, but these errors were encountered: