planner: regression index merge UNION case from v7.3.0 to v7.4.0 and master #48588

AilinKid · 2023-11-14T08:53:19Z

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE `t` (
  `id` int(11) NOT NULL,
  `aid` bigint(20) DEFAULT NULL,
  `c1` varchar(255) DEFAULT NULL,
  `c2` varchar(255) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `aid_c1` (`aid`,`c1`),
  KEY `aid_c2` (`aid`,`c2`)
)

2. What did you expect to see? (Required)

MySQL [test]> explain select /*+ USE_INDEX_MERGE(t, aid_c1, aid_c2) */ * from t where (aid = 1 and c1='aaa') or (aid = 1 and c2='bbb') limit 1;
+--------------------------------+---------+-----------+--------------------------------+---------------------------------------------------------+
| id                             | estRows | task      | access object                  | operator info                                           |
+--------------------------------+---------+-----------+--------------------------------+---------------------------------------------------------+
| IndexMerge_20                  | 1.00    | root      |                                | type: union, limit embedded(offset:0, count:1)          |
| ├─Limit_18(Build)              | 0.01    | cop[tikv] |                                | offset:0, count:1                                       |
| │ └─IndexRangeScan_11          | 0.01    | cop[tikv] | table:t, index:aid_c1(aid, c1) | range:[1 "aaa",1 "aaa"], keep order:false, stats:pseudo |
| ├─Limit_19(Build)              | 0.01    | cop[tikv] |                                | offset:0, count:1                                       |
| │ └─IndexRangeScan_12          | 0.01    | cop[tikv] | table:t, index:aid_c2(aid, c2) | range:[1 "bbb",1 "bbb"], keep order:false, stats:pseudo |
| └─TableRowIDScan_13(Probe)     | 1.00    | cop[tikv] | table:t                        | keep order:false, stats:pseudo                          |
+--------------------------------+---------+-----------+--------------------------------+---------------------------------------------------------+
6 rows in set (0.001 sec)

3. What did you see instead (Required)

MySQL [test]> explain select /*+ USE_INDEX_MERGE(t, aid_c1, aid_c2) */ * from t where (aid = 1 and c1='aaa') or (aid = 1 and c2='bbb') limit 1;
+----------------------------------+---------+-----------+--------------------------------+---------------------------------------------------------+
| id                               | estRows | task      | access object                  | operator info                                           |
+----------------------------------+---------+-----------+--------------------------------+---------------------------------------------------------+
| Limit_10                         | 1.00    | root      |                                | offset:0, count:1                                       |
| └─IndexMerge_23                  | 1.00    | root      |                                | type: union                                             |
|   ├─IndexRangeScan_20(Build)     | 0.01    | cop[tikv] | table:t, index:aid_c1(aid, c1) | range:[1 "aaa",1 "aaa"], keep order:false, stats:pseudo |
|   ├─IndexRangeScan_21(Build)     | 0.01    | cop[tikv] | table:t, index:aid_c2(aid, c2) | range:[1 "bbb",1 "bbb"], keep order:false, stats:pseudo |
|   └─TableRowIDScan_22(Probe)     | 1.00    | cop[tikv] | table:t                        | keep order:false, stats:pseudo                          |
+----------------------------------+---------+-----------+--------------------------------+---------------------------------------------------------+
5 rows in set (0.001 sec)

4. What is your TiDB version? (Required)

The text was updated successfully, but these errors were encountered:

AilinKid · 2023-11-14T09:37:06Z

#46619 in this pr we lift the limitation of generate valid cop task for intersection index merge case.

Explaination:

First:

condition (aid = 1 and c1='aaa') or (aid = 1 and c2='bbb') can be seen as AND condition combination after we fillIndexPath is done:
index(aid, c2) with access condition: (aid=1) or (aid=1 and c2='bbb') , with entire OR tree as its table filter
index(aid, c1) with access condition: (aid=1 and c1='aaa') or (aid), with entire OR tree as its table filter

this kind of condition matching and converging into different index path is done automatically by fillIndexPath, not from manually cnf and dnf convertion

Note that, the final Index Merge is AND/Intersection mode. The alternative path for OR/UNION is also generated.
Let's identify them and COP-Intersection and COP-Union for short here.

Second:

the cost compare problem, in the function convertToIndexMergeScan in previous pr #46619, we lift the limitation of generating COP-TYPE index merge, its is usefully for we to sink limit into it before encapsulating it as a root task. While in the cost compare function getTaskPlanCost of compareTaskCost, seems we lost the logic of handling COP-TYPE index merge operator's cost collection. for root index merge reader, we actually did.

func getTaskPlanCost(t task, op *physicalOptimizeOp) (float64, bool, error) {
	if t.invalid() {
		return math.MaxFloat64, true, nil
	}

	// use the new cost interface
	var taskType property.TaskType
	switch t.(type) {
	case *rootTask:
		taskType = property.RootTaskType
	case *copTask: // no need to know whether the task is single-read or double-read, so both CopSingleReadTaskType and CopDoubleReadTaskType are OK
		cop := t.(*copTask)
		if cop.indexPlan != nil && cop.tablePlan != nil { // handle IndexLookup specially
			taskType = property.CopMultiReadTaskType
			// keep compatible with the old cost interface, for CopMultiReadTask, the cost is idxCost + tblCost.
			if !cop.indexPlanFinished { // only consider index cost in this case
				idxCost, err := getPlanCost(cop.indexPlan, taskType, NewDefaultPlanCostOption().WithOptimizeTracer(op))
				return idxCost, false, err
			}
			// consider both sides
			idxCost, err := getPlanCost(cop.indexPlan, taskType, NewDefaultPlanCostOption().WithOptimizeTracer(op))
			if err != nil {
				return 0, false, err
			}
			tblCost, err := getPlanCost(cop.tablePlan, taskType, NewDefaultPlanCostOption().WithOptimizeTracer(op))
			if err != nil {
				return 0, false, err
			}
			return idxCost + tblCost, false, nil
		}

		taskType = property.CopSingleReadTaskType

		// TiFlash can run cop task as well, check whether this cop task will run on TiKV or TiFlash.
		if cop.tablePlan != nil {
			leafNode := cop.tablePlan
			for len(leafNode.Children()) > 0 {
				leafNode = leafNode.Children()[0]
			}
			if tblScan, isScan := leafNode.(*PhysicalTableScan); isScan && tblScan.StoreType == kv.TiFlash {
				taskType = property.MppTaskType
			}
		}
	case *mppTask:
		taskType = property.MppTaskType
	default:
		return 0, false, errors.New("unknown task type")
	}
///////////////////////////// BUG1 ////////////////////////////////////////////
	if t.plan() == nil {
		// It's a very special case for index merge case.
		cost := 0.0
		copTsk := t.(*copTask)
		for _, partialScan := range copTsk.idxMergePartPlans {
			partialCost, err := getPlanCost(partialScan, taskType, NewDefaultPlanCostOption().WithOptimizeTracer(op))
			if err != nil {
				return 0, false, err
			}
			cost += partialCost
		}
		return cost, false, nil
	}
///////////////////////////// BUG2 ////////////////////////////////////////////
	cost, err := getPlanCost(t.plan(), taskType, NewDefaultPlanCostOption().WithOptimizeTracer(op))
	return cost, false, err
}

COP-Intersection's table side is not pure table scan, it has selection of entire OR tree, so we mark indexPlanFinished as true in convertToIndexMergeScan, kind of tricky code here. And in the comment the second bug line above, it only collected the cost from t.plan(). When the indexPlanFinished is true in current cop task, t.plan() only return table plan, leading missing the collection of index merge partial plan.

On the contrary, COP-Union's table side is pure table scan, so we mark indexPlanFinished as false in convertToIndexMergeScan, And in the comment the first bug line above, it only collected the cost from t.plan(). When the indexPlanFinished is true in current cop task, t.plan() only return index plan which is nil in index merge, consequently walking through the branch code, and only collecting the partial index plans cost.

The direct result will cause COP-Intersection win the COP-UNION in current ds.convertToIndexMergeScan()

Third:

so for COP type, the datasource only memorize the COP-Intersection as it's best, for ROOT type, the datasource give the plan which has limit append on the index merge reader which is seen in the non-expected plan shown in the pr description above.

…rge reader (#48590) close #48588

…rge reader (#48590) (#48632) close #48588

AilinKid added the type/bug The issue is confirmed as a bug. label Nov 14, 2023

AilinKid self-assigned this Nov 14, 2023

AilinKid mentioned this issue Nov 14, 2023

planner: fix index merge union case couldn't sink limit into index merge reader #48590

Merged

13 tasks

AilinKid changed the title ~~planner: downgrade index merge UNION case from v7.3.0 to v7.4.0 and master~~ planner: regression index merge UNION case from v7.3.0 to v7.4.0 and master Nov 14, 2023

AilinKid added sig/planner SIG: Planner affects-7.3 affects-7.5 affects-7.4 and removed affects-7.3 labels Nov 14, 2023

winoros added the severity/moderate label Nov 14, 2023

ti-chi-bot bot closed this as completed in #48590 Nov 16, 2023

ti-chi-bot bot pushed a commit that referenced this issue Nov 16, 2023

planner: fix index merge union case couldn't sink limit into index me…

5960d0d

…rge reader (#48590) close #48588

ti-chi-bot mentioned this issue Nov 16, 2023

planner: fix index merge union case couldn't sink limit into index merge reader (#48590) #48632

Merged

13 tasks

ti-chi-bot bot pushed a commit that referenced this issue May 28, 2024

planner: fix index merge union case couldn't sink limit into index me…

bcb49ae

…rge reader (#48590) (#48632) close #48588

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

planner: regression index merge UNION case from v7.3.0 to v7.4.0 and master #48588

planner: regression index merge UNION case from v7.3.0 to v7.4.0 and master #48588

AilinKid commented Nov 14, 2023 •

edited

Loading

AilinKid commented Nov 14, 2023 •

edited

Loading

planner: regression index merge UNION case from v7.3.0 to v7.4.0 and master #48588

planner: regression index merge UNION case from v7.3.0 to v7.4.0 and master #48588

Comments

AilinKid commented Nov 14, 2023 • edited Loading

Bug Report

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

AilinKid commented Nov 14, 2023 • edited Loading

First:

Second:

Third:

AilinKid commented Nov 14, 2023 •

edited

Loading

AilinKid commented Nov 14, 2023 •

edited

Loading