
Using PD's ScanRegions API to resolve pd cpu spike #2708

@shiyuhang0


Enhancement

This enhancement has been proposed in #959. This issue explains why we want to do it and what it can resolve.

One user reported a PD CPU spike when running TiSpark v3.2.1 against a large data set.

  1. Metrics show that PD receives a large number of GetRegion RPCs.
  2. Metrics show that PD CPU usage is high only during the first 10 minutes of the Spark job.

Based on these reports, I think TiSpark may issue too many GetRegion calls in the driver when it splits region tasks. We can use ScanRegions in the driver to reduce the number of RPC calls.
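To illustrate why batching should help, here is a rough sketch that counts RPCs for both approaches. It does not use the real PD client or TiSpark code: `MockPd`, `RegionLookupSketch`, the numeric keys, and the `limit` of 128 are all hypothetical, and it assumes the driver walks a key range one region at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for PD. This is NOT the real PD client API;
// keys are plain longs and every region has the same width, for illustration only.
class MockPd {
    private final long[] boundaries; // region i covers [boundaries[i], boundaries[i + 1])
    private final long regionWidth;
    int rpcCount = 0;                // number of simulated RPCs served

    MockPd(int regionCount, long regionWidth) {
        this.regionWidth = regionWidth;
        this.boundaries = new long[regionCount + 1];
        for (int i = 0; i <= regionCount; i++) {
            boundaries[i] = i * regionWidth;
        }
    }

    // One simulated RPC: returns {startKey, endKey} of the region containing key.
    long[] getRegion(long key) {
        rpcCount++;
        int i = (int) (key / regionWidth);
        return new long[] {boundaries[i], boundaries[i + 1]};
    }

    // One simulated RPC: returns up to `limit` consecutive regions, beginning with
    // the region containing startKey and stopping before endKey.
    List<long[]> scanRegions(long startKey, long endKey, int limit) {
        rpcCount++;
        List<long[]> out = new ArrayList<>();
        int i = (int) (startKey / regionWidth);
        while (i < boundaries.length - 1 && boundaries[i] < endKey && out.size() < limit) {
            out.add(new long[] {boundaries[i], boundaries[i + 1]});
            i++;
        }
        return out;
    }
}

class RegionLookupSketch {
    // Assumed current behaviour: one lookup per region in the range.
    static List<long[]> splitWithGetRegion(MockPd pd, long start, long end) {
        List<long[]> regions = new ArrayList<>();
        long cursor = start;
        while (cursor < end) {
            long[] region = pd.getRegion(cursor);
            regions.add(region);
            cursor = region[1]; // continue from the end key of this region
        }
        return regions;
    }

    // Proposed behaviour: fetch regions in batches of up to `limit` per call.
    static List<long[]> splitWithScanRegions(MockPd pd, long start, long end, int limit) {
        List<long[]> regions = new ArrayList<>();
        long cursor = start;
        while (cursor < end) {
            List<long[]> batch = pd.scanRegions(cursor, end, limit);
            if (batch.isEmpty()) {
                break;
            }
            regions.addAll(batch);
            cursor = batch.get(batch.size() - 1)[1]; // end key of the last region returned
        }
        return regions;
    }

    public static void main(String[] args) {
        MockPd perKey = new MockPd(1000, 10);
        splitWithGetRegion(perKey, 0, 10_000);
        MockPd batched = new MockPd(1000, 10);
        splitWithScanRegions(batched, 0, 10_000, 128);
        System.out.println("GetRegion RPCs:   " + perKey.rpcCount);
        System.out.println("ScanRegions RPCs: " + batched.rpcCount);
    }
}
```

In this toy setup, covering 1000 regions takes 1000 simulated GetRegion calls but only 8 ScanRegions calls with a batch limit of 128. Real savings depend on how TiSpark actually issues lookups and on the limit PD accepts, which is what the TODO below is meant to measure.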

TODO: Test the effect of this enhancement.
