Enhancement
This enhancement has been proposed in #959. This issue explains why we want to do it and what it can resolve.
One user reported a PD CPU spike when running TiSpark v3.2.1 against a large data set.
- The metrics show that a large number of GetRegion RPCs are sent to PD.
- The metrics show that PD CPU usage is high only during the first 10 minutes of the Spark job.
Based on these reports, I suspect that TiSpark issues too many GetRegion calls in the driver when it splits the region tasks. We can use ScanRegions in the driver to reduce the number of RPC calls.
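To illustrate why batching should help, here is a minimal sketch that counts RPCs for both strategies. It is not TiSpark or PD client code: `FakePD`, `Region`, `locate_with_get_region`, `locate_with_scan_regions`, the integer keys, and the `limit` of 128 are all hypothetical and chosen only for illustration. The sketch assumes that a ScanRegions-style call returns up to `limit` consecutive regions in a single round trip.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int  # inclusive start key (integers stand in for real byte keys)
    end: int    # exclusive end key


class FakePD:
    """Hypothetical in-memory stand-in for PD that counts RPCs."""

    def __init__(self, boundaries):
        # boundaries [0, 10, 20] describes regions [0, 10) and [10, 20).
        self.regions = [Region(a, b) for a, b in zip(boundaries, boundaries[1:])]
        self.starts = [r.start for r in self.regions]
        self.rpc_count = 0

    def get_region(self, key):
        # One RPC returns the single region that contains `key`.
        self.rpc_count += 1
        return self.regions[bisect_right(self.starts, key) - 1]

    def scan_regions(self, start_key, end_key, limit):
        # One RPC returns up to `limit` regions overlapping [start_key, end_key).
        self.rpc_count += 1
        first = bisect_right(self.starts, start_key) - 1
        batch = []
        for region in self.regions[first:]:
            if region.start >= end_key or len(batch) == limit:
                break
            batch.append(region)
        return batch


def locate_with_get_region(pd, start, end):
    # One lookup per region: the RPC count grows linearly with the region count.
    regions, key = [], start
    while key < end:
        region = pd.get_region(key)
        regions.append(region)
        key = region.end
    return regions


def locate_with_scan_regions(pd, start, end, limit=128):
    # One lookup per batch: roughly region_count / limit RPCs.
    regions, key = [], start
    while key < end:
        batch = pd.scan_regions(key, end, limit)
        if not batch:
            break
        regions.extend(batch)
        key = batch[-1].end
    return regions


if __name__ == "__main__":
    boundaries = list(range(0, 10001, 10))  # 1000 regions of width 10
    pd_single, pd_batch = FakePD(boundaries), FakePD(boundaries)
    locate_with_get_region(pd_single, 0, 10000)
    locate_with_scan_regions(pd_batch, 0, 10000)
    print("GetRegion RPCs:", pd_single.rpc_count)
    print("ScanRegions RPCs:", pd_batch.rpc_count)
```

In this toy setup with 1000 regions, the per-region loop issues 1000 lookups while the batched loop issues 8. Whether real workloads see a comparable reduction in PD CPU usage still needs to be measured.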
TODO: Test the effect of this enhancement.