
RFC: improve efficiency of tablet filtering in go/vt/discovery and topo #16761

Open
@timvaillancourt


RFC Description

Today discovery.Healthcheck (in go/vt/discovery - mainly used by vtgate) supports filtering the tablets it watches by the following (a rough sketch of how these combine follows the list):

  1. --keyspaces_to_watch (very common)
  2. --tablet-filter on hostname
  3. --tablet-filter-tags on the tablet's Tags (map[string]string)
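
To make the three filter dimensions concrete, here is a minimal, self-contained Go sketch of a combined predicate. The Tablet struct and isIncluded function are illustrative assumptions, not the real go/vt/discovery types, and the exact-match semantics (e.g. how --tablet-filter matches hostnames, or how multiple filters combine) are simplified:

```go
package main

import "fmt"

// Tablet is a pared-down stand-in for the real topodata.Tablet record; only
// the fields the three filter flags look at are included.
type Tablet struct {
	Keyspace string
	Hostname string
	Tags     map[string]string
}

// isIncluded sketches how the three filters conceptually combine: a tablet is
// watched only if it passes every configured filter (an unset filter matches all).
func isIncluded(t Tablet, keyspacesToWatch []string, hostnameFilter string, tagFilter map[string]string) bool {
	if len(keyspacesToWatch) > 0 {
		match := false
		for _, ks := range keyspacesToWatch {
			if t.Keyspace == ks {
				match = true
				break
			}
		}
		if !match {
			return false
		}
	}
	if hostnameFilter != "" && t.Hostname != hostnameFilter {
		return false
	}
	for k, v := range tagFilter {
		if t.Tags[k] != v {
			return false
		}
	}
	return true
}

func main() {
	t := Tablet{Keyspace: "commerce", Hostname: "db-101", Tags: map[string]string{"pool": "oltp"}}
	fmt.Println(isIncluded(t, []string{"commerce"}, "", map[string]string{"pool": "oltp"})) // true
	fmt.Println(isIncluded(t, []string{"customer"}, "", nil))                               // false
}
```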

Behind the scenes, this filtering happens in the tablet watcher's loadTablets(), but not very efficiently: first, all tablets in the cell are fetched from the topo unconditionally, and only then is the optional filtering applied. At times this filtering discards a significant share of the topo KVs that were just fetched. More on "why" later
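
In pseudocode, that flow looks roughly like the sketch below. The topoReader interface, the Tablet struct, and the helper names are simplified stand-ins for the real topo.Server / topodata types in the Vitess tree, kept minimal so the cost of the pattern is visible:

```go
package sketch

import "context"

// Tablet is a minimal stand-in for the real topodata.Tablet record.
type Tablet struct {
	Alias    string
	Keyspace string
	Shard    string
}

// topoReader is a hypothetical, pared-down view of the topo calls the tablet
// watcher depends on: one listing of aliases, then one Get per tablet.
type topoReader interface {
	ListTabletAliases(ctx context.Context, cell string) ([]string, error)
	GetTablet(ctx context.Context, alias string) (*Tablet, error)
}

// TabletFilter mirrors the shape of the filtering hook used by the watcher.
type TabletFilter interface {
	IsIncluded(t *Tablet) bool
}

// loadTabletsSketch shows the "fetch everything, then filter" pattern: every
// tablet record in the cell is read from the topo before any filter runs.
func loadTabletsSketch(ctx context.Context, tr topoReader, cell string, filter TabletFilter) ([]*Tablet, error) {
	aliases, err := tr.ListTabletAliases(ctx, cell) // 1 list operation
	if err != nil {
		return nil, err
	}

	var kept []*Tablet
	for _, alias := range aliases {
		t, err := tr.GetTablet(ctx, alias) // N topo Gets, regardless of filters
		if err != nil {
			return nil, err
		}
		// Only now do we learn the keyspace/shard and possibly throw the record away.
		if filter == nil || filter.IsIncluded(t) {
			kept = append(kept, t)
		}
	}
	return kept, nil
}
```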

On clusters with 1000s of tablets this becomes a scalability problem for the topology store, which has to handle topo Get calls for every tablet fetched. In an extreme case, the txthrottler (which uses discovery.Healthcheck to stream tablet stats) opens 1 x topology watcher per cell just to find the tablets in its local shard. Let's say we have a 3 x cell deployment with 1000 tablets per cell and txthrottler is running in a 3-tablet shard: this means txthrottler will be frequently reading 3000 topo KVs just to find its 2 other shard members, and this problem grows with the number of tablets. In our production deployment the problem is significantly larger

Now why does the tablet watcher fetch EVERY tablet in the cell? Today it kind of has to 🤷. Using a Consul topo as an example, tablets are stored by alias under paths named /tablets/<tablet alias>/ and there is no efficient way to grab just 1 x keyspace or shard - you have to read everything
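
At the KV level, this flat layout means a cell-wide read boils down to one directory listing plus one Get per alias, and because the keyspace/shard only live inside each record's value, there is nothing in the key itself to prune on. A minimal sketch, assuming a hypothetical kvConn interface in place of the real topo connection and a /tablets/<alias>/Tablet file layout:

```go
package sketch

import (
	"context"
	"path"
)

// kvConn is a hypothetical, minimal view of a topo connection (Consul, etcd, ...):
// directory listing plus point Gets, which is all the flat layout offers us.
type kvConn interface {
	ListDir(ctx context.Context, dirPath string) ([]string, error)
	Get(ctx context.Context, filePath string) ([]byte, error)
}

// readCellTablets reads every tablet record under /tablets/. Even if the caller
// only cares about one keyspace or shard, that information is encoded inside the
// value, not the key, so every record must be fetched and decoded first.
func readCellTablets(ctx context.Context, conn kvConn) (map[string][]byte, error) {
	aliases, err := conn.ListDir(ctx, "tablets") // e.g. ["zone1-0000000100", "zone1-0000000101", ...]
	if err != nil {
		return nil, err
	}

	records := make(map[string][]byte, len(aliases))
	for _, alias := range aliases {
		data, err := conn.Get(ctx, path.Join("tablets", alias, "Tablet")) // 1 Get per tablet in the cell
		if err != nil {
			return nil, err
		}
		records[alias] = data
	}
	return records, nil
}
```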

There is a way to mitigate this inefficiency (but not resolve it): --tablet_refresh_known_tablets=false. This causes vtgate to keep the tablet records it reads forever, which has its own drawbacks and doesn't resolve the initial inefficient read of every tablet in the cell

This issue is an RFC/feature request for a more efficient way to fetch topo records for a single shard and/or keyspace. Unfortunately, improving the situation likely means a change to the layout of the topo

Some early ideas:

  • "Pointer"/alias KVs - add KVs like /keyspaces/<keyspace>/<shard>/<tablet> that simply "point" to the actual /tablet/<alias> record, kind of like an index
    • This doesn't seem to be a built-in feature of most topo stores, so it would need to be done at the KV level.
  • Store tablet records in per-keyspace/shard paths instead. But this would come at the cost of more ListDir operations
  • <your idea here>
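
For the first idea, a rough sketch of what the "pointer" KVs could look like, ignoring the global-vs-cell topo split and the cleanup/invalidation story (e.g. a tablet being deleted or changing shard). The index path shape and the kvConn interface are illustrative assumptions, not a proposal for the final layout:

```go
package sketch

import (
	"context"
	"path"
)

// kvConn is the same hypothetical minimal topo connection as above, plus writes.
type kvConn interface {
	ListDir(ctx context.Context, dirPath string) ([]string, error)
	Get(ctx context.Context, filePath string) ([]byte, error)
	Put(ctx context.Context, filePath string, value []byte) error
}

// indexPath builds a hypothetical "pointer" key for a tablet, scoped by
// keyspace/shard, whose only job is to point back at /tablets/<alias>/Tablet.
func indexPath(keyspace, shard, alias string) string {
	return path.Join("keyspaces", keyspace, shard, "tablets", alias)
}

// writeTabletIndex would be called wherever the tablet record itself is written,
// so the index stays in step with /tablets/<alias>/Tablet.
func writeTabletIndex(ctx context.Context, conn kvConn, keyspace, shard, alias string) error {
	// The value can be empty or just the alias; the key itself carries the information.
	return conn.Put(ctx, indexPath(keyspace, shard, alias), []byte(alias))
}

// readShardTablets shows the payoff: one ListDir scoped to the shard, then one
// Get per matching tablet, instead of one Get per tablet in the entire cell.
func readShardTablets(ctx context.Context, conn kvConn, keyspace, shard string) (map[string][]byte, error) {
	aliases, err := conn.ListDir(ctx, path.Join("keyspaces", keyspace, shard, "tablets"))
	if err != nil {
		return nil, err
	}
	records := make(map[string][]byte, len(aliases))
	for _, alias := range aliases {
		data, err := conn.Get(ctx, path.Join("tablets", alias, "Tablet"))
		if err != nil {
			return nil, err
		}
		records[alias] = data
	}
	return records, nil
}
```

The appeal is that a shard-scoped watcher would then pay 1 x ListDir + N x Get for only the tablets it cares about, at the cost of keeping the index consistent with the tablet records.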

🙇

Use Case(s)

Large Vitess deployments that use filtering (most likely --keyspaces_to_watch) where the rate of topo Get calls is a risk/concern
