Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizer choose full table scan when there is a better execution path #56321

Open
mzhang77 opened this issue Sep 25, 2024 · 7 comments · May be fixed by #56483
Open

Optimizer choose full table scan when there is a better execution path #56321

mzhang77 opened this issue Sep 25, 2024 · 7 comments · May be fixed by #56483
Labels
severity/major sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@mzhang77
Copy link

mzhang77 commented Sep 25, 2024

Bug Report

1. Minimal reproduce step (Required)

mysql> show create table d\G
*************************** 1. row ***************************
       Table: d
Create Table: CREATE TABLE `d` (
  `mid` bigint(20) unsigned NOT NULL,
  `object_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `ov` longblob DEFAULT NULL,
  `version` int(11) unsigned NOT NULL,
  PRIMARY KEY (`object_id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci AUTO_INCREMENT=23924932 /*T![auto_id_cache] AUTO_ID_CACHE=1 */
1 row in set (0.00 sec)
mysql> show create table i\G
*************************** 1. row ***************************
       Table: i
Create Table: CREATE TABLE `i` (
  `object_id` bigint(20) unsigned NOT NULL,
  `lid` bigint(20) DEFAULT NULL,
  `sid` varbinary(767) DEFAULT NULL,
  `cid` bigint(20) DEFAULT NULL,
  `version` int(11) unsigned NOT NULL,
  PRIMARY KEY (`object_id`) /*T![clustered_index] CLUSTERED */,
  KEY `i_1` (`cid`,`sid`,`object_id`),
  KEY `i_2` (`lid`,`object_id`),
  KEY `i_3` (`cid`,`object_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
1 row in set (0.00 sec)

mysql> 
explain SELECT
       `d`.*
     FROM
       `i`
       LEFT JOIN `d` ON `i`.`object_id` = `d`.`object_id`
     WHERE
       `i`.`cid` = 249
     ORDER BY
       `i`.`object_id`
     LIMIT 1000 OFFSET 18000;

2. What did you expect to see? (Required)

The execution plan should use table i as driving table to access table d

3. What did you see instead (Required)

The problem is captured by slow log, it is not reproducible any more.
Screenshot 2024-09-25 at 10 55 59 AM

The issue with this plan is, there is no need to full table scan d. Optimizer should offset 18000 rows from i, take the next 1000 rows, and join them to table d. So maybe full table scan i is necessary, but full table scan d to do a merge join is unnecessary.

4. What is your TiDB version? (Required)

v7.5.1

@mzhang77 mzhang77 added the type/bug The issue is confirmed as a bug. label Sep 25, 2024
@mzhang77
Copy link
Author

I'm able to reproduce the problem using this script to generate test data: https://gist.github.com/mzhang77/a7cb08949d8473edbde4a27bde0d3530

@winoros
Copy link
Member

winoros commented Sep 27, 2024

Cost for default choice MergeJoin

mysql>  explain format='cost_trace' SELECT  `d`.* FROM   `i`   LEFT JOIN `d` ON `i`.`object_id` = `d`.`object_id` WHERE   `i`.`cid` = 249 ORDER BY   `i`.`object_id` LIMIT 1000 OFFSET 18000;
+-----------------------------------+----------+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+------------------------------------------------------------------------+
| id                                | estRows  | estCost    | costFormula                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | task      | access object                      | operator info                                                          |
+-----------------------------------+----------+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+------------------------------------------------------------------------+
| Projection_12                     | 1000.00  | 1417546.02 | (((((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00) + ((cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (cpu(30000*filters(0)*tidb_cpu_factor(49.9)))) + ((group(19000*cols(0.01)*tidb_cpu_factor(49.9))) + (group(30000*cols(0.01)*tidb_cpu_factor(49.9)))))) + ((cpu(1000*filters(0.04)*tidb_cpu_factor(49.9)))/5.00) | root      |                                    | test.d.mid, test.d.object_id, test.d.ov, test.d.version                |
| └─Limit_19                        | 1000.00  | 1417146.82 | ((((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00) + ((cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (cpu(30000*filters(0)*tidb_cpu_factor(49.9)))) + ((group(19000*cols(0.01)*tidb_cpu_factor(49.9))) + (group(30000*cols(0.01)*tidb_cpu_factor(49.9)))))                                                            | root      |                                    | offset:18000, count:1000                                               |
|   └─MergeJoin_67                  | 19000.00 | 1417146.82 | (((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00) + ((cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (cpu(30000*filters(0)*tidb_cpu_factor(49.9)))) + ((group(19000*cols(0.01)*tidb_cpu_factor(49.9))) + (group(30000*cols(0.01)*tidb_cpu_factor(49.9))))                                                              | root      |                                    | left outer join, left key:test.i.object_id, right key:test.d.object_id |
|     ├─TableReader_53(Build)       | 30000.00 | 1024516.39 | ((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                | root      |                                    | data:TableFullScan_52                                                  |
|     │ └─TableFullScan_52          | 30000.00 | 7406957.80 | scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | cop[tikv] | table:d                            | keep order:true                                                        |
|     └─Limit_36(Probe)             | 19000.00 | 368179.43  | ((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)                                                                                                                                                                                                                                                                                                                                                                                                                          | root      |                                    | offset:0, count:19000                                                  |
|       └─IndexReader_49            | 19000.00 | 368179.43  | (((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                                            | root      |                                    | index:Limit_48                                                         |
|         └─Limit_48                | 19000.00 | 4318851.50 | (scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | cop[tikv] |                                    | offset:0, count:19000                                                  |
|           └─IndexRangeScan_47     | 19000.00 | 4318851.50 | scan(19000*logrowsize(48)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | cop[tikv] | table:i, index:i_3(cid, object_id) | range:[249,249], keep order:true                                       |
+-----------------------------------+----------+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+------------------------------------------------------------------------+

Cost for index join choice

mysql>  explain format='cost_trace' SELECT /*+ INL_JOIN(d) */   `d`.* FROM   `i`   LEFT JOIN `d` ON `i`.`object_id` = `d`.`object_id` WHERE   `i`.`cid` = 249 ORDER BY   `i`.`object_id` LIMIT 1
000 
+-------------------------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                  | estRows  | estCost     | costFormula                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | task      | access object                      | operator info                                                                                                                                    |
+-------------------------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_11                       | 1000.00  | 10556981.64 | (((cpu(10*3*tidb_cpu_factor(49.9))) + (((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (cpu(19000*10*tidb_cpu_factor(49.9))) + ((() + (((((scan(1*logrowsize(540)*tikv_scan_factor(40.7))) + (net(1*rowsize(540)*tidb_kv_net_factor(3.96))))/15.00)*19000.00)/6.00) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(19000*0*tidb_cpu_factor(49.9))) + (hashmem(19000*540*tidb_mem_factor(0.2))) + (hashbuild(19000*tidb_cpu_factor(49.9)))))/5.00))) + ((cpu(1000*filters(0.04)*tidb_cpu_factor(49.9)))/5.00) | root      |                                    | test.d.mid, test.d.object_id, test.d.ov, test.d.version                                                                                          |
| └─Limit_18                          | 1000.00  | 10556582.44 | ((cpu(10*3*tidb_cpu_factor(49.9))) + (((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (cpu(19000*10*tidb_cpu_factor(49.9))) + ((() + (((((scan(1*logrowsize(540)*tikv_scan_factor(40.7))) + (net(1*rowsize(540)*tidb_kv_net_factor(3.96))))/15.00)*19000.00)/6.00) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(19000*0*tidb_cpu_factor(49.9))) + (hashmem(19000*540*tidb_mem_factor(0.2))) + (hashbuild(19000*tidb_cpu_factor(49.9)))))/5.00))                                                            | root      |                                    | offset:18000, count:1000                                                                                                                         |
|   └─IndexJoin_54                    | 19000.00 | 10556582.44 | (cpu(10*3*tidb_cpu_factor(49.9))) + (((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (cpu(19000*10*tidb_cpu_factor(49.9))) + ((() + (((((scan(1*logrowsize(540)*tikv_scan_factor(40.7))) + (net(1*rowsize(540)*tidb_kv_net_factor(3.96))))/15.00)*19000.00)/6.00) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(19000*0*tidb_cpu_factor(49.9))) + (hashmem(19000*540*tidb_mem_factor(0.2))) + (hashbuild(19000*tidb_cpu_factor(49.9)))))/5.00)                                                              | root      |                                    | left outer join, inner:TableReader_51, outer key:test.i.object_id, inner key:test.d.object_id, equal cond:eq(test.i.object_id, test.d.object_id) |
|     ├─Limit_63(Build)               | 19000.00 | 368179.43   | ((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | root      |                                    | offset:0, count:19000                                                                                                                            |
|     │ └─IndexReader_69              | 19000.00 | 368179.43   | (((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | root      |                                    | index:Limit_68                                                                                                                                   |
|     │   └─Limit_68                  | 19000.00 | 4318851.50  | (scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | cop[tikv] |                                    | offset:0, count:19000                                                                                                                            |
|     │     └─IndexRangeScan_44       | 19000.00 | 4318851.50  | scan(19000*logrowsize(48)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | cop[tikv] | table:i, index:i_3(cid, object_id) | range:[249,249], keep order:true                                                                                                                 |
|     └─TableReader_51(Probe)         | 19000.00 | 167.19      | ((scan(1*logrowsize(540)*tikv_scan_factor(40.7))) + (net(1*rowsize(540)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | root      |                                    | data:TableRangeScan_50                                                                                                                           |
|       └─TableRangeScan_50           | 19000.00 | 369.43      | scan(1*logrowsize(540)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | cop[tikv] | table:d                            | range: decided by [test.i.object_id], keep order:false                                                                                           |
+-------------------------------------+----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)

Cost for hash join choice

mysql>  explain format='cost_trace' SELECT /*+ HASH_JOIN(d) */  `d`.* FROM   `i`   LEFT JOIN `d` ON `i`.`object_id` = `d`.`object_id` WHERE   `i`.`cid` = 249 ORDER BY   `i`.`object_id` LIMIT 1000 OFFSET 18000;
+-------------------------------------+----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+-----------------------------------------------------------------+
| id                                  | estRows  | estCost     | costFormula                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | task      | access object                      | operator info                                                   |
+-------------------------------------+----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+-----------------------------------------------------------------+
| Projection_11                       | 1000.00  | 19508812.18 | (((cpu(10*3*tidb_cpu_factor(49.9))) + (((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00) + ((hashkey(19000*1*tidb_cpu_factor(49.9))) + (hashmem(19000*16*tidb_mem_factor(0.2))) + (hashbuild(19000*tidb_cpu_factor(49.9)))) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (((cpu(30000*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(30000*1*tidb_cpu_factor(49.9))) + (hashprobe(30000*tidb_cpu_factor(49.9)))))/5.00)) + ((exprCPU(19000*0*tidb_cpu_factor(49.9))) + (orderCPU(19000*log(19000)*tidb_cpu_factor(49.9)))) + (topMem(19000*548*tidb_mem_factor(0.2)))) + ((cpu(1000*filters(0.04)*tidb_cpu_factor(49.9)))/5.00) | root      |                                    | test.d.mid, test.d.object_id, test.d.ov, test.d.version         |
| └─TopN_14                           | 1000.00  | 19508412.98 | ((cpu(10*3*tidb_cpu_factor(49.9))) + (((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00) + ((hashkey(19000*1*tidb_cpu_factor(49.9))) + (hashmem(19000*16*tidb_mem_factor(0.2))) + (hashbuild(19000*tidb_cpu_factor(49.9)))) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (((cpu(30000*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(30000*1*tidb_cpu_factor(49.9))) + (hashprobe(30000*tidb_cpu_factor(49.9)))))/5.00)) + ((exprCPU(19000*0*tidb_cpu_factor(49.9))) + (orderCPU(19000*log(19000)*tidb_cpu_factor(49.9)))) + (topMem(19000*548*tidb_mem_factor(0.2)))                                                            | root      |                                    | test.i.object_id, offset:18000, count:1000                      |
|   └─HashJoin_30                     | 19000.00 | 3949992.82  | (cpu(10*3*tidb_cpu_factor(49.9))) + (((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)) + (((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00) + ((hashkey(19000*1*tidb_cpu_factor(49.9))) + (hashmem(19000*16*tidb_mem_factor(0.2))) + (hashbuild(19000*tidb_cpu_factor(49.9)))) + (cpu(19000*filters(0)*tidb_cpu_factor(49.9))) + (((cpu(30000*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(30000*1*tidb_cpu_factor(49.9))) + (hashprobe(30000*tidb_cpu_factor(49.9)))))/5.00)                                                                                                                                                                                                           | root      |                                    | left outer join, equal:[eq(test.i.object_id, test.d.object_id)] |
|     ├─Limit_35(Build)               | 19000.00 | 368179.43   | ((((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | root      |                                    | offset:0, count:19000                                           |
|     │ └─IndexReader_48              | 19000.00 | 368179.43   | (((scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))) + (net(19000*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | root      |                                    | index:Limit_47                                                  |
|     │   └─Limit_47                  | 19000.00 | 4318851.50  | (scan(19000*logrowsize(48)*tikv_scan_factor(40.7)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | cop[tikv] |                                    | offset:0, count:19000                                           |
|     │     └─IndexRangeScan_46       | 19000.00 | 4318851.50  | scan(19000*logrowsize(48)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | cop[tikv] | table:i, index:i_3(cid, object_id) | range:[249,249], keep order:true                                |
|     └─TableReader_52(Probe)         | 30000.00 | 1024516.39  | ((scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))) + (net(30000*rowsize(67.00999999999999)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | root      |                                    | data:TableFullScan_51                                           |
|       └─TableFullScan_51            | 30000.00 | 7406957.80  | scan(30000*logrowsize(67.00999999999999)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | cop[tikv] | table:d                            | keep order:false                                                |
+-------------------------------------+----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------------------+-----------------------------------------------------------------+
9 rows in set (0.00 sec)

@mzhang77
Copy link
Author

For INL_JOIN, table range scan of d should have estRows of 1000, instead of 19000
Screenshot 2024-09-27 at 4 06 34 PM

@winoros
Copy link
Member

winoros commented Sep 27, 2024

For INL_JOIN, table range scan of d should have estRows of 1000, instead of 19000 Screenshot 2024-09-27 at 4 06 34 PM

@mzhang77
Oh, I got your point.

From the explain result, when we push a top-n down to the outer join's outer child. We'll remove the offset and make a full TopN
image
Maybe this part can be improved. We'll check it later.

@winoros
Copy link
Member

winoros commented Sep 27, 2024

@mzhang77
Some explanation about why we remove the offset when pushing it down:
The original TopN is applied after the join. One row from the outer join's outer child might match more than one row from the inner side. So we can only push down a TopN(0, 19000) to the outer child. We don't know which row from the outer side will produce the 1001-th join result.
Due to this reason, we also don't know how many rows the inner side needs output to make the join output the 19000-th join result.

So currently, both sides will use 19000 rows to calculate the cost.

Oh, but I noticed that your join key from the inner side is the primary key, so this can ensure each match of the outer side is exactly one. So the total execution logic can be improved maybe.

If the join key doesn't have the unique property, i'm afraid that the execution logic and the estimation logic will not change.

@mzhang77
Copy link
Author

@winoros This is a real case from a paid customer. Actually in their application all join key has unique property.

@winoros
Copy link
Member

winoros commented Sep 28, 2024

@winoros This is a real case from a paid customer. Actually in their application all join key has unique property.

i know. i just want to explain the current situation and the possible improvment tidb can do.

@winoros winoros added type/enhancement The issue or PR belongs to an enhancement. and removed type/bug The issue is confirmed as a bug. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 may-affects-7.1 may-affects-7.5 may-affects-8.1 labels Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/major sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants