analyze use MaxUint64 ts to read data #35233

Closed
xuyifangreeneyes opened this issue Jun 8, 2022 · 2 comments · Fixed by #35232
Labels
affects-6.1 sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@xuyifangreeneyes
Contributor

Enhancement

#24575 makes analyze read data from a snapshot. Combined with the incremental update of modify_count and count at the end of analyze, this gives a more accurate modify_count, especially when many updates happen during a long-running analyze. However, a long-running snapshot analyze can fail with the error "GC life time is shorter than transaction duration" (#29862) or block GC (#35062). Since analyze doesn't require strong data consistency, we propose switching back to using a MaxUint64 ts to read data in analyze.

@xuyifangreeneyes xuyifangreeneyes added the type/enhancement The issue or PR belongs to an enhancement. label Jun 8, 2022
@xhebox
Contributor

xhebox commented Jun 22, 2022

But the original issue behind #24575 raised concerns about overestimating modify_count. Maybe add another TiDBAnalyzeVersion?

@xuyifangreeneyes
Contributor Author

xuyifangreeneyes commented Jun 22, 2022

#24575

Yes. We are switching back to reading the latest data rather than a fixed snapshot because analyze doesn't require strong data consistency, and analyzing on a snapshot can cause problems (auto analyze blocking GC, or a long-running auto analyze failing) that are more severe than an inaccurate modify_count. When analyze reads data with a MaxUint64 ts, there are two ways to update modify_count once analyze finishes. The first is to update it incrementally (see #24720), which overestimates modify_count. The second is to simply set it to 0, which is the original implementation and underestimates modify_count. Underestimation can keep auto analyze from triggering when it should, and it is riskier than overestimation for cardinality/cost estimation. We therefore choose to use a MaxUint64 ts in analyze and update modify_count incrementally. The doc has more details.
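The over/underestimation argument can be illustrated with a small sketch. This is a simplified model, not TiDB's actual statistics code: the `analyzeRun` type, its fields, and the method names are hypothetical, and it assumes that some of the rows modified during analyze were already observed by the latest-data read.

```go
package main

import "fmt"

// analyzeRun describes one analyze in a simplified, hypothetical model.
type analyzeRun struct {
	modifiedBefore int64 // modify_count when analyze starts
	modifiedDuring int64 // rows modified while analyze is running
	seenByAnalyze  int64 // part of modifiedDuring already observed by the latest-data read
}

// trueStale is the number of modifications the new statistics do not reflect.
func (r analyzeRun) trueStale() int64 {
	return r.modifiedDuring - r.seenByAnalyze
}

// resetToZero models setting modify_count to 0 when analyze finishes.
func (r analyzeRun) resetToZero() int64 {
	return 0
}

// incremental models subtracting only what was counted before analyze
// started, so every modification made during analyze is kept.
func (r analyzeRun) incremental() int64 {
	current := r.modifiedBefore + r.modifiedDuring
	return current - r.modifiedBefore
}

func main() {
	r := analyzeRun{modifiedBefore: 1000, modifiedDuring: 500, seenByAnalyze: 200}
	fmt.Println("true stale modifications:", r.trueStale())
	fmt.Println("reset to zero:", r.resetToZero())
	fmt.Println("incremental:", r.incremental())
}
```

With these made-up numbers, resetting reports 0 against a true value of 300 (underestimation), while the incremental update reports 500 (overestimation). Under the same model, a snapshot read would have `seenByAnalyze` equal to 0, which is why the incremental update is exact in that case.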
