[YSQL] Primary Key Selectivity is inaccurate for when the filter condition contains equality on hash key and inequality on range key

Jira Link: [DB-2677](https://yugabyte.atlassian.net/browse/DB-2677)
### Description

Consider the tables `temp1` and `temp2` with the following schemas. 

```
yugabyte=# \d temp1
               Table "public.temp1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 hk     | integer |           | not null |
 rk     | integer |           | not null |
Indexes:
    "temp1_pkey" PRIMARY KEY, lsm (hk HASH, rk ASC)

yugabyte=# \d temp2
               Table "public.temp2"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 hk     | integer |           | not null |
 rk     | integer |           | not null |
Indexes:
    "temp2_pkey" PRIMARY KEY, lsm (hk HASH, rk ASC)
```

The tables have been populated with the following insert commands. 
```
INSERT INTO temp1  SELECT i%100, i FROM (SELECT generate_series(1, 10000) i) t; 
INSERT INTO temp2  SELECT i/100, i FROM (SELECT generate_series(1, 10000) i) t;
```

Hence the contents of the table temp1 looks like this
```
 hk |  rk
----+-------
  1 |     1
  2 |     2
  3 |     3
  4 |     4
  5 |     5
  ...
 98 |    98
 99 |    99
  0 |   100
  1 |   101
  2 |   102
  3 |   103
  ...
  ...
 95 |  9995
 96 |  9996
 97 |  9997
 98 |  9998
 99 |  9999
  0 | 10000
```
while the contents of table temp2 looks like this
```
 hk  |  rk
-----+-------
   0 |     1
   0 |     2
   0 |     3
   0 |     4
  ....
   1 |   100
   1 |   101
   1 |   102
   1 |   103
  ...
  ...
   2 |   200
   2 |   201
   2 |   202
   2 |   203
   ...
   ...
   ...
  99 |  9996
  99 |  9997
  99 |  9998
  99 |  9999
 100 | 10000
```
While executing queries with a PK index lookup, selectivities are pretty similar in both situations. And hence the number of rows estimated to be returned is the same for both situations. This is because we multiply the selectivities on the hash component and range component and use them to estimate the number of rows. This needs to be changed. 
```
yugabyte=# EXPLAIN ANALYZE SELECT * FROM temp1 where hk = 1 AND rk < 200;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using temp1_pkey on temp1  (cost=0.00..4.23 rows=2 width=8) (actual time=0.415..0.417 rows=2 loops=1)
   Index Cond: ((hk = 1) AND (rk < 200))
 Planning Time: 1.937 ms
 Execution Time: 0.438 ms
(4 rows)

yugabyte=# EXPLAIN ANALYZE SELECT * FROM temp2 where hk = 1 AND rk < 200;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Index Scan using temp2_pkey on temp2  (cost=0.00..4.23 rows=2 width=8) (actual time=1.167..1.202 rows=100 loops=1)
   Index Cond: ((hk = 1) AND (rk < 200))
 Planning Time: 2.422 ms
 Execution Time: 1.230 ms
(4 rows)
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[YSQL] Primary Key Selectivity is inaccurate for when the filter condition contains equality on hash key and inequality on range key #12941

Description

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[YSQL] Primary Key Selectivity is inaccurate for when the filter condition contains equality on hash key and inequality on range key #12941

Description

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions