proposal: maintaining histograms in plan. #7605

Merged
merged 14 commits into from
Oct 25, 2018

Member

@winoros winoros commented Sep 4, 2018

What problem does this PR solve?

Proposal about maintaining histogram information in plan's operators.

Check List

Tests

  • No code

Side effects

  • Possible performance regression
  • Increased code complexity

@shenli shenli changed the title proposal: add proposal for maintaining histograms in plan. proposal: maintaining histograms in plan. Sep 5, 2018
Member

zz-jason commented Sep 5, 2018

@CaitinChen PTAL, thanks!

## Background

Currently, TiDB only uses statistics when deciding which physical scan method a table should use. And TiDB only stores simple statistics in the plan structure. But when deciding the join order and considering some other optimization rules, we need more detailed statistics.
So we need to maintain the statistics in the plan structure to get sufficient statistics information to do optimizations.
Contributor

So we need to maintain the statistics in the plan structure to get sufficient statistics information to do optimizations.


For `Sort`, we can just copy children's `statsInfo` without doing any change.

For `Limit`, we can just copy children's `statsInfo` or ignore the histogram information. As you know, its execution logic is based on randomization. Hard to maintain the statistics information after it. But we may use the information before it to do some estimation in some scenarios.
Contributor

For Limit, we can just copy children's statsInfo or ignore the histogram information. As you know, its execution logic is based on randomization. It is hard to maintain the statistics information after Limit. But we may use the information before Limit to do some estimation in some scenarios.


For `Join`, there’re joins as follows:

- Inner join: use histograms to do the row count estimation with the join key condition. Since it won’t have one side filter, we only need to consider the composite filters after considering the join key. We can simply multiply `selectionFactor` if there are other composite filters in our first version of implementation. Since `Selectivity` cannot calculate selectivity of expression that containing multiple column.
Contributor

-> ... selectivity of an expression that contains multiple columns.
or
-> ... selectivity of an expression containing multiple columns.


- One side outer join: It depends on the join keys’ NDV. And we can just use histograms to estimate it if there’re non-join-key filters.

- Semi join: It’s something similar to inner join. But no data expanding occurs. When we maintain the range information. We can get a nearly accurate answer of its row count.
Contributor

... When we maintain the range information, we can get a nearly accurate answer of its row count
or
... But no data expanding occurs when we maintain the range information. We can get a nearly accurate answer of its row count
??


For `Selection`, just use it to calculate the selectivity.

For `DataSource`, if it’s a non-partitioned table, we just maintain the map. If it’s a partitioned table, we now only store the statistics of each partition So we need to merge them. We’ll need a cache or something else to ensure that we won’t merge them each time we need it, which will consume tooooo much time and memory space.
Contributor

For DataSource, if it’s a non-partitioned table, we just maintain the map. If it’s a partitioned table, we now only store the statistics of each partition. So we need to merge them. We need a cache or something else to ensure that we won’t merge them each time we need it, which will consume tooooo much time and memory space.
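The caching idea for partitioned tables could look roughly like the following. This is only a sketch under stated assumptions: `partitionStats`, `mergedStats`, `mergedStatsCache`, and the version-based invalidation rule are all hypothetical names and choices, not TiDB's actual types, and the "merge" only sums row counts where a real implementation would merge histograms.

```go
package main

import (
	"fmt"
	"sync"
)

// partitionStats is a hypothetical stand-in for the statistics kept per partition.
type partitionStats struct {
	Version  uint64 // bumped whenever this partition's statistics are refreshed
	RowCount int64
}

// mergedStats is a hypothetical table-level summary built from all partitions.
type mergedStats struct {
	Version  uint64 // highest partition version seen at merge time
	RowCount int64
}

// mergedStatsCache remembers the last merge result per table ID.
type mergedStatsCache struct {
	mu      sync.Mutex
	entries map[int64]mergedStats
	merges  int // number of real merges performed, kept only for illustration
}

func newMergedStatsCache() *mergedStatsCache {
	return &mergedStatsCache{entries: make(map[int64]mergedStats)}
}

// Get returns the cached merge when no partition is newer than the cached
// version; otherwise it re-merges and stores the result.
func (c *mergedStatsCache) Get(tableID int64, parts []partitionStats) mergedStats {
	var latest uint64
	for _, p := range parts {
		if p.Version > latest {
			latest = p.Version
		}
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[tableID]; ok && e.Version == latest {
		return e
	}
	m := mergedStats{Version: latest}
	for _, p := range parts {
		m.RowCount += p.RowCount
	}
	c.entries[tableID] = m
	c.merges++
	return m
}

func main() {
	cache := newMergedStatsCache()
	parts := []partitionStats{{Version: 1, RowCount: 100}, {Version: 2, RowCount: 250}}
	first := cache.Get(42, parts)
	fmt.Println(first.RowCount, cache.merges)
	second := cache.Get(42, parts) // served from the cache, no new merge
	fmt.Println(second.RowCount, cache.merges)
}
```

Keying invalidation on the highest partition version is a simplification: it would miss a dropped partition, so a real cache would probably also need to react to DDL events.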


### What is the impact of not doing this?

Many cases reported by our customer already prove that we need more accurate statistics to choose a better join order and a proper join algorithm. Only maintaining a number about row count and a slice about ndv is not enough for making that decision.
Contributor

ndv -> NDV

Note: The terms in an article should be consistent.


## Implementation

First maintain the histogram in `DataSource`. In this step, there will be some changes in the `statistics` package to make it work. It may take a little long time to do this. [PR#7385](https://github.com/pingcap/tidb/pull/7385)
Contributor

First, maintain the histogram in DataSource. In this step, there will be some changes in the statistics package to make it work. It may take a little long time to do this. PR#7385


For `Projection`, change the reflection of the map we maintained.

For `Aggregation`, use the histogram to estimate the ndv of group-by items. If one index cannot cover the group-by item, we’ll multiply the ndv of each group-by column. If the output of `Aggregation` includes group-by columns, we’ll maintain the histogram of them for future use.
Contributor

ndv -> NDV
"ndv" or "NDV"? Please make your terms consistent.


I’ve looked into Spark. They did nearly the same thing with what I said. They only maintain the max and min values, no `ranges` information. And they don’t have index, so they only maintain the column’s max/min value which make problem far more much easier.

As for Orca and Calcite, I haven’t discovered where they maintain this information. But there’s something about statistics in Orca’s paper. According to the paper, I think they construct new histogram during planning and cache it for not building to often.
Contributor

... I think they construct a new histogram during planning and cache it to avoid building too many times.
or
... I think they construct a new histogram during planning and cache it to avoid repeated building.


And the `expectedCount` we used in physical plan is something same with `Limit`. So the row count modification during physical plan won’t be affected.

After we switch to the cascade like planner. The rule that needs cost to make decision is still a small set of all. And the existence of `Group` can also help us. If we lazily construct the `statsInfo`, this may not be the bottleneck.
Contributor

After we switch to the cascade-like planner. The rule that needs cost to make a decision is still a small set of all.

Member Author

winoros commented Sep 6, 2018

@CaitinChen I've addressed these comments. PTAL thanks!

Contributor

@CaitinChen CaitinChen left a comment

Rest LGTM

## Background

Currently, TiDB only uses statistics when deciding which physical scan method a table should use. And TiDB only stores simple statistics in the plan structure. But when deciding the join order and considering some other optimization rules, we need more detailed statistics.
So we need to maintain the statistics in the plan structure to get sufficient statistics information to do optimizations.
Contributor

You can click "View" in the upper right corner. Then you can see that these two paragraphs are combined together.
Next time, when you want to write another paragraph, please break a line~ Then this article can be displayed normally.


For `Aggregation`, we only need to cut off the things which are not in ranges when doing estimation. There is no need to update the ranges information.

For `TopN`, we now have the alibity to maintain histograms of the order-by items.
Contributor

alibity -> ability


I’ve looked into Spark. They did nearly the same thing with what I said. They only maintain the max and min values, rather than the `ranges` information. And they don’t have the index, so they only maintain the column’s max/min value which make problem much easier to solve.

As for Orca and Calcite, I haven’t discovered where they maintain this information. But there’s something about statistics in Orca’s paper. According to the paper, I think they construct new histogram during planning and cache it to avoid building too many times.
Contributor

... I think they construct a new histogram during planning...
or
... I think they construct new histograms during planning...


And the `expectedCount` we used in physical plan is something same with `Limit`. So the row count modification during physical plan won’t be affected.

After we switch to the cascade-like planner. The rule that needs cost to make decision is still a small set of all. And the existence of `Group` can also help us. If we lazily construct the `statsInfo`, this may not be the bottleneck.
Contributor

After we switch to the cascade-like planner, the rule that needs cost to make a decision...

@@ -0,0 +1,115 @@
# Proposal: Maintain statistics in `Plan`

- Author: Yiding CUI
Contributor

You can add the link of your GitHub profile page after your name~

Member

shenli commented Sep 6, 2018

@CaitinChen PTAL

Contributor

@CaitinChen CaitinChen left a comment

LGTM


The new `statsInfo` of `plan` should be something like the following structure:

```
Member

```go


We maintain the histogram in `Projection`, `Selection`, `Join`, `Aggregation`, `Sort`, `Limit` and `DataSource` operators.

For `Sort`, we can just copy children's `statsInfo` without doing any change.
Member

How about using a separate section to describe how to maintain statistics for each operator, like:

### `Sort`

### `Limit`

### `Project`

### ...


For `Join`, there’re joins as follows:

- Inner join: use histograms to do the row count estimation with the join key condition. Since it won’t have one side filter, we only need to consider the composite filters after considering the join key. We can simply multiply `selectionFactor` if there are other composite filters in our first version of implementation. Since `Selectivity` cannot calculate selectivity of an expression containing multiple columns.
Member

s/`Selectivity`/`Selectivity()`/

Member

How are the statistics of Join derived from the statistics of its children?


For `Projection`, change the reflection of the map we maintained.

For `Aggregation`, use the histogram to estimate the NDV of group-by items. If one index cannot cover the group-by item, we’ll multiply the NDV of each group-by column. If the output of `Aggregation` includes group-by columns, we’ll maintain the histogram of them for future use.
Member

How do we maintain the statistics info for the Aggregate operator?

Member Author

It may not need to do anything. Just using the child's statistics is okay.

Member Author

winoros commented Sep 11, 2018

I'll update this soon.

Member Author

winoros commented Sep 14, 2018

updated.
It seems that the formulas work well.


We maintain the histogram in `Projection`, `Selection`, `Join`, `Aggregation`, `Sort`, `Limit` and `DataSource` operators.

#### `Sort`
Member

I think we should describe the following things for each operator:

  1. How to maintain and set the content of the statsInfo struct of the operator based on the statsInfo of its child operator?
    • how to calculate the ndv slice?
    • how to calculate the histColl slice?
    • how to calculate the rangesOfXXX slice?
    • how to calculate the max/min Values map?
  2. How to estimate the output row count for the operator based on the statsInfo of its child operator?


Where <img alt="$joinKeySelectivity = \frac{1}{NDV(t1.key)}*\frac{1}{NDV(t2.key)}*ndvAfterJoin$" src="svgs/291c9eb6e8db885402c716ffc3e17a65.png?invert_in_darkmode" align="middle" width="466.6166208pt" height="27.7756545pt"/>.

The `ndvAfterJoin` can be <img alt="$min(NDV(t1.key), NDV(t2.key))$" src="svgs/30df1c648fa9fe43985776847c8dbe60.png?invert_in_darkmode" align="middle" width="248.4423216pt" height="24.657534pt"/> or a more detailed one if we can caculate it.
Contributor

caculate -> calculate
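The join-key selectivity formula quoted in this hunk can be turned into a small estimator. The sketch below assumes `ndvAfterJoin = min(NDV(t1.key), NDV(t2.key))`, the simpler of the two options the proposal mentions; the function names and the zero-NDV guard are illustrative choices, not TiDB code.

```go
package main

import (
	"fmt"
	"math"
)

// joinKeySelectivity follows the proposal's formula
//   1/NDV(t1.key) * 1/NDV(t2.key) * ndvAfterJoin
// with ndvAfterJoin taken as min(NDV(t1.key), NDV(t2.key)).
func joinKeySelectivity(ndv1, ndv2 float64) float64 {
	if ndv1 <= 0 || ndv2 <= 0 {
		return 0 // assumption: a side with no distinct keys produces no matches
	}
	ndvAfterJoin := math.Min(ndv1, ndv2)
	return ndvAfterJoin / (ndv1 * ndv2)
}

// estimateInnerJoinRows scales the cross-product size by the join-key selectivity.
func estimateInnerJoinRows(rows1, rows2, ndv1, ndv2 float64) float64 {
	return joinKeySelectivity(ndv1, ndv2) * rows1 * rows2
}

func main() {
	// 1000 rows with 100 distinct keys joined to 2000 rows with 50 distinct keys.
	fmt.Printf("%.0f\n", estimateInnerJoinRows(1000, 2000, 100, 50))
}
```

With the `min` choice the selectivity reduces to `1 / max(NDV(t1.key), NDV(t2.key))`, so the example works out to roughly 20,000 rows.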

##### One side outer join
It's almost the same as inner join's behavior. But we need to consider two more things:

- The unmatched row will be filled as `NULL`. This should be calculated in the new histogram. The null count can be caculated when we estimate the matched count bucket by bucket.
Contributor

caculated -> calculated

It's almost the same as inner join's behavior. But we need to consider two more things:

- The unmatched row will be filled as `NULL`. This should be calculated in the new histogram. The null count can be caculated when we estimate the matched count bucket by bucket.
- There will be one side filters of the outer table. If the filter is about join key and can be converted to range information, it's can be easily involved when we do the caculation bucket by bucket. Otherwise it's a little hard to deal with it. Don't consider this case currently.
Contributor

caculation -> calculation
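One way to picture the NULL-count bookkeeping for a one-side outer join is sketched below. It is not the proposal's algorithm: the `bucket` type is hypothetical, both histograms are assumed to be already aligned to the same bucket boundaries, and the share of unmatched outer keys is derived from a containment assumption (the side with fewer distinct keys in a bucket is fully contained in the other).

```go
package main

import (
	"fmt"
	"math"
)

// bucket is a hypothetical, simplified histogram bucket over the join key.
type bucket struct {
	Count float64 // rows falling into the bucket
	NDV   float64 // distinct join-key values in the bucket
}

// outerJoinEstimate splits the output into matched rows and NULL-filled rows.
type outerJoinEstimate struct {
	Matched    float64
	NullFilled float64
}

// estimateLeftOuterJoin walks the aligned buckets of the outer and inner side.
func estimateLeftOuterJoin(outer, inner []bucket) outerJoinEstimate {
	var est outerJoinEstimate
	for i, o := range outer {
		if o.Count <= 0 || o.NDV <= 0 {
			continue
		}
		var in bucket // zero value when the inner side has no such bucket
		if i < len(inner) {
			in = inner[i]
		}
		if in.Count <= 0 || in.NDV <= 0 {
			est.NullFilled += o.Count // nothing to match: every outer row is NULL-filled
			continue
		}
		matchedNDV := math.Min(o.NDV, in.NDV)
		matchedShare := matchedNDV / o.NDV
		// Matched outer rows times the average number of inner rows per key.
		est.Matched += o.Count * matchedShare * (in.Count / in.NDV)
		est.NullFilled += o.Count * (1 - matchedShare)
	}
	return est
}

func main() {
	outer := []bucket{{Count: 100, NDV: 10}, {Count: 40, NDV: 4}}
	inner := []bucket{{Count: 50, NDV: 5}}
	est := estimateLeftOuterJoin(outer, inner)
	fmt.Println(est.Matched, est.NullFilled, est.Matched+est.NullFilled)
}
```

The matched part agrees with the inner-join formula `min(NDV) / (NDV1 * NDV2) * RowCount1 * RowCount2` applied per bucket; only the NULL-filled term is new.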

Same with semi join.

#### `Aggregate`
Just read the NDV information from the `statsInfo` to dicide the row count after aggregate. If there's index can fully match the group-by items. We just use its NDV. Otherwise we multiply the ndv of each column(or index that can match part of the group-by item).
Contributor

dicide -> decide.

Contributor

. We just use -> , we just use?
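The row-count rule for `Aggregate` described in this hunk can be sketched as follows. The function name and parameters are hypothetical, and capping the result at the child's row count is an added assumption that the proposal text does not state.

```go
package main

import (
	"fmt"
	"math"
)

// estimateAggRows estimates the number of groups produced by an aggregation.
// coveringIndexNDV is the NDV of an index that fully matches the group-by
// items, or 0 when no such index exists. columnNDVs holds the NDV of each
// group-by column (or partially matching index) for the fallback path.
func estimateAggRows(childRows, coveringIndexNDV float64, columnNDVs []float64) float64 {
	if coveringIndexNDV > 0 {
		// An index covers the group-by items: use its NDV directly.
		return math.Min(coveringIndexNDV, childRows)
	}
	// Otherwise multiply the per-column NDVs, which treats columns as independent.
	product := 1.0
	for _, ndv := range columnNDVs {
		product *= math.Max(ndv, 1)
		if product >= childRows {
			return childRows // assumption: never more groups than input rows
		}
	}
	return product
}

func main() {
	fmt.Println(estimateAggRows(10000, 0, []float64{10, 20}))   // product of column NDVs
	fmt.Println(estimateAggRows(100, 0, []float64{10, 20}))     // capped by the input size
	fmt.Println(estimateAggRows(10000, 150, []float64{10, 20})) // covering index wins
}
```

Multiplying NDVs overestimates the group count when the group-by columns are correlated, which is why an index that covers all group-by items is preferred when one exists.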

We can just copy children's `statsInfo` without doing any change. Since the data distribution is not changed.

#### `Limit`
Currently we won't maintain hitogram information for it. But it can be considered in the future.
Contributor

hitogram -> histogram

<img alt="Step 2" src="./histogram-3.png" width="150pt"/>
</div>

The calculation inside the bucket can be calculated as this formula <img alt="$selecivity=joinKeySelectivity*RowCount(t1)*RowCount(t2)$" src="svgs/35fa60f709be6b9ab8aa9036bd5e7f7f.png?invert_in_darkmode" align="middle" width="476.19356895pt" height="24.657534pt"/>
Contributor

selecivity -> selectivity.
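The per-bucket formula in this hunk, `joinKeySelectivity * RowCount(t1) * RowCount(t2)`, yields a row count for one pair of buckets, and summing over all pairs gives the join estimate. The sketch below assumes the two histograms have already been split so that bucket `i` on each side covers the same key range; the `bucket` type and function name are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// bucket is a hypothetical, simplified histogram bucket over the join key.
type bucket struct {
	Count float64 // rows falling into the bucket
	NDV   float64 // distinct join-key values in the bucket
}

// estimateJoinRowsByBucket sums the per-bucket inner join estimates.
func estimateJoinRowsByBucket(left, right []bucket) float64 {
	n := len(left)
	if len(right) < n {
		n = len(right)
	}
	total := 0.0
	for i := 0; i < n; i++ {
		l, r := left[i], right[i]
		if l.NDV <= 0 || r.NDV <= 0 {
			continue // an empty bucket on either side contributes no matches
		}
		// joinKeySelectivity = 1/NDV(l) * 1/NDV(r) * min(NDV(l), NDV(r))
		sel := math.Min(l.NDV, r.NDV) / (l.NDV * r.NDV)
		total += sel * l.Count * r.Count
	}
	return total
}

func main() {
	left := []bucket{{Count: 100, NDV: 10}, {Count: 200, NDV: 20}}
	right := []bucket{{Count: 50, NDV: 5}, {Count: 0, NDV: 0}}
	fmt.Printf("%.0f\n", estimateJoinRowsByBucket(left, right))
}
```

Working bucket by bucket lets key ranges that exist on only one side drop out of the estimate, which a single table-level NDV cannot express.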

#### `Aggregate`
Just read the NDV information from the `statsInfo` to dicide the row count after aggregate. If there's index can fully match the group-by items. We just use its NDV. Otherwise we multiply the ndv of each column(or index that can match part of the group-by item).

If some of the group-by items are also in the select field. We will create new histograms modify the `totalCnt` of each bucket(set it the same with `NDV`).
Contributor

. We will -> , we will?

Contributor

modify the -> by modifying the?

If some of the group-by items are also in the select field. We will create new histograms modify the `totalCnt` of each bucket(set it the same with `NDV`).

#### `Sort`
We can just copy children's `statsInfo` without doing any change. Since the data distribution is not changed.
Contributor

. Since -> , since?


This struct will be maintained when we call `deriveStats`.

Currently we don't change the histogram itself during planning. Because it will consume a lot of time and memory space. I’ll try to maintain ranges slice or the max/min value to improve the accuracy of row count estimation instead.
Contributor

Actually, we change the histogram in this proposal?

docs/design/2018-09-04-histograms-in-plan.md
Contributor

alivxxx commented Oct 11, 2018

Actually, we only use the *.png in design/svgs?

Member Author

winoros commented Oct 11, 2018

@lamxTyler Yes, it can be removed

@zz-jason
Member

LGTM

@zz-jason zz-jason added the status/LGT1 Indicates that a PR has LGTM 1. label Oct 18, 2018
@zz-jason
Member

@lamxTyler @eurekaka PTAL

Contributor

alivxxx commented Oct 23, 2018

@winoros There are still some comments not addressed.

Member Author

winoros commented Oct 24, 2018

@lamxTyler So now it can be reviewed.
The conflict will be resolved when addressing new comments.

Contributor

alivxxx commented Oct 25, 2018

It seems the files histogram-3.jpeg and DS_Store are not used.

Contributor

@eurekaka eurekaka left a comment

LGTM

@eurekaka eurekaka added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Oct 25, 2018
Contributor

@alivxxx alivxxx left a comment

LGTM

@alivxxx alivxxx merged commit 1f57184 into pingcap:master Oct 25, 2018
@winoros winoros deleted the histogram-proposal branch October 25, 2018 12:09
iamzhoug37 pushed a commit to iamzhoug37/tidb that referenced this pull request Oct 25, 2018
Labels
component/docs proposal sig/planner SIG: Planner status/LGT2 Indicates that a PR has LGTM 2.
6 participants