-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
plan: add cache for statistics table #2398
Conversation
if txn == nil { | ||
return statistics.PseudoTable(table) | ||
} | ||
m := meta.NewMeta(txn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think move cache refill logic to statistics
package is better.
And If the the stats for a table is always nil, we don't want to get it from kv every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried, but it will introduce circle dependency, because meta/meta.go
imports statistics
package. And if stats table is nil, I will save a pseduo one in cache, thus we will not get it from kv every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache can be implemented in another package.
like statscache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plan/statscache
or plan/statistics/statscache
?
var expireDuration int64 = 60 * 60 * 1000 | ||
|
||
func tableCacheExpired(tbl *Table) bool { | ||
duration := oracle.GetPhysical(time.Now()) - oracle.ExtractPhysical(uint64(tbl.TS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbl.TS
is the stats build time, table stats may not updated frequently, we should use the last get time to determine if the stats is expired.
statistics.SetStatisticsTableCache(table.ID, tbl) | ||
return statistics.PseudoTable(table) | ||
} | ||
tbl.TS = int64(txn.StartTS()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not use the build time field to record lookup time.
if ok && tblInfo == si.tbl.Info && !tableCacheExpired(si) { | ||
return si.tbl | ||
} | ||
txn := ctx.Txn() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During plan building, the ctx.Txn
is null most of the time.
If it is nil, call ctx.ActivePendingTxn
to get the transaction from transaction channel.
} | ||
|
||
// SetStatisticsTableCache sets the statistics table cache. | ||
func SetStatisticsTableCache(id int64, statsTbl *statistics.Table, ts uint64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We always compare ts
to time.Now
, so why not just use time.Time
, then this parameter can be removed.
LGTM |
statsTblCache.m.RUnlock() | ||
// Here we check the TableInfo because there may be some ddl changes in the duration period. | ||
// Also, we rely on the fact that TableInfo will not be same if and only if there are ddl changes. | ||
if ok && tblInfo == si.tbl.Info && !tableCacheExpired(si) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test when the table cache is expired
Add a test, compare the plan result before or after we use |
tk.MustExec("analyze table t1") | ||
result = tk.MustQuery("explain select * from t1 where t1.a = 1") | ||
rowStr = fmt.Sprintf("%s", result.Rows()) | ||
c.Check(strings.Split(rowStr, "{")[0], Equals, "[[TableScan_4 ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean you can add the cases like following:
index(a) is (1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2) and filter is a =1
index(b) is (1,2,3,4,5,6......15) and filter is b = 1.
Then optimizer will choose index b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can add these tests in plan test, init the store can execute sql stmt in test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a good test case, but I don't think you will get the desired result, yet. Because currently, the logic for index columns is, somewhat, designed for pseudo cases, which in this case will always return table.Count / 3
. After the two PR is merged, this test case will be testable. I will add it in the next PR. Seems next
PR has lots of things to do...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I ignored it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PTAL @shenli @coocood @hanfei1991 @tiancaiamao @winoros