Allow samplerate to be adjusted dynamically, or make the num samples option usable #46485
Labels: component/statistics, sig/planner, type/enhancement
Enhancement
After #37193 and #35232, the analyze statement uses max uint64 to read data, which avoids blocking GC. However, this change has introduced a new problem. The samplerate is calculated once, from the data volume at the start of analyze. If the application keeps writing data while analyze runs, the table may grow significantly in the meantime. The initial samplerate is then too high for the data actually scanned, so TiDB collects far more samples than intended and comes under memory/CPU pressure.
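To make the effect concrete, here is a small arithmetic sketch. It is not TiDB code: the function names are hypothetical, and the target of 110,000 sampled rows is an assumption used only for illustration. It shows how a rate fixed at the start of analyze yields more samples in proportion to how much the table grows during the scan.

```python
def initial_sample_rate(rows_at_start: int, target_samples: int = 110_000) -> float:
    """Rate chosen once, from the row count known when analyze starts.

    target_samples is an assumed value for illustration only.
    """
    if rows_at_start <= 0:
        return 1.0
    return min(1.0, target_samples / rows_at_start)


def expected_samples(rows_scanned: int, rate: float) -> float:
    """Expected number of sampled rows when every scanned row is kept with probability `rate`."""
    return rows_scanned * rate


# The table has 1 million rows when analyze starts, so the rate is about 0.11.
rate = initial_sample_rate(1_000_000)

# If the scan ends up reading 50 million rows, the fixed rate yields
# about 50x the intended number of samples.
oversampled = expected_samples(50_000_000, rate)
```

With these assumed numbers, the sample set is roughly fifty times larger than the target, which is the memory/CPU pressure described above.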
Ideally, a feedback mechanism would adjust the default samplerate dynamically as the data volume grows. Alternatively, the "num samples" option could be made usable; the documentation currently does not recommend it.
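One possible shape for such a feedback mechanism is sketched below. This is an illustrative assumption, not a proposal for TiDB's actual implementation: the class `AdaptiveSampler` and its parameters are hypothetical. The sampler keeps each row with the current rate. When the number of kept rows exceeds a cap, it halves the rate and drops each kept row with probability 1/2, so every row seen so far ends up retained with the same (final) probability.

```python
import random


class AdaptiveSampler:
    """Bernoulli sampler that lowers its rate when the sample set grows past a cap.

    Hypothetical sketch: names and parameters are not taken from TiDB.
    """

    def __init__(self, initial_rate: float, max_samples: int, seed=None):
        if max_samples < 1:
            raise ValueError("max_samples must be at least 1")
        self.rate = initial_rate
        self.max_samples = max_samples
        self.samples = []
        self._rng = random.Random(seed)

    def offer(self, row) -> None:
        """Consider one scanned row for inclusion in the sample."""
        if self._rng.random() < self.rate:
            self.samples.append(row)
            # Feedback step: too many samples means the rate is too high
            # for the amount of data actually scanned.
            while len(self.samples) > self.max_samples:
                self._halve()

    def _halve(self) -> None:
        """Halve the rate and thin the existing samples to match it."""
        self.rate /= 2
        self.samples = [r for r in self.samples if self._rng.random() < 0.5]
```

A caller would feed every scanned row to `offer` and, at the end, use `sampler.rate` (the final rate, not the initial one) when scaling sample counts up to table-level estimates. The cap plays a role similar to a fixed number of samples, which is why the two options in this issue are closely related.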