add doc for schema cache #18646

Merged
merged 22 commits into from
Oct 15, 2024
Changes from 17 commits
1 change: 1 addition & 0 deletions TOC.md
@@ -1026,6 +1026,7 @@
- [`schema_unused_indexes`](/sys-schema/sys-schema-unused-indexes.md)
- [Metadata Lock](/metadata-lock.md)
- [TiDB Accelerated Table Creation](/accelerated-table-creation.md)
- [Schema Cache](/schema-cache.md)
- UI
- TiDB Dashboard
- [Introduction](/dashboard/dashboard-intro.md)
39 changes: 39 additions & 0 deletions schema-cache.md
@@ -0,0 +1,39 @@
---
title: Schema Cache
aliases: ['/docs-cn/dev/information-schema-cache']
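summary: TiDB uses an LRU-based caching mechanism for schema information, which can significantly reduce the memory usage of schema information and improve performance in scenarios with a large number of databases and tables.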
---

# Schema Cache

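In some multi-tenant scenarios, there might be hundreds of thousands or even millions of databases and tables. Loading the schema information of all these databases and tables into memory would consume a large amount of memory and degrade the performance of related accesses. To address this problem, TiDB introduces an LRU-like schema cache mechanism that keeps only the schema information of recently used databases and tables in memory.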
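
The idea of keeping only recently used schema information within a memory budget can be illustrated with a small sketch. This is a toy model for intuition only, not TiDB's implementation: the class `SchemaLRU`, its byte budget, and the table names and sizes are all hypothetical.

```python
from collections import OrderedDict


class SchemaLRU:
    """Toy LRU cache keyed by table name; capacity is a byte budget (hypothetical)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # table name -> schema size in bytes

    def access(self, table, size_bytes):
        """Return True on a cache hit, False when the schema had to be loaded."""
        if table in self.entries:
            self.entries.move_to_end(table)  # mark as most recently used
            return True
        # Miss: load the schema, then evict least recently used entries
        # until the byte budget is respected again.
        self.entries[table] = size_bytes
        self.used_bytes += size_bytes
        while self.used_bytes > self.capacity_bytes and len(self.entries) > 1:
            _, evicted_size = self.entries.popitem(last=False)
            self.used_bytes -= evicted_size
        return False


cache = SchemaLRU(capacity_bytes=300)
cache.access("t1", 100)
cache.access("t2", 100)
cache.access("t3", 100)
cache.access("t1", 100)  # hit: t1 becomes the most recently used entry
cache.access("t4", 100)  # miss: t2, the least recently used entry, is evicted
print(list(cache.entries))  # ['t3', 't1', 't4']
```

Tables that keep being accessed stay cached, while a table that has not been touched for the longest time is the first to be dropped when the budget is exceeded.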

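> **Warning:**
>
> This feature is currently experimental and is not recommended for production environments. It might be changed or removed without prior notice. If you find a bug, report it by opening an [issue](https://github.com/pingcap/tidb/issues) on GitHub.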

## Configuration

You can enable the schema cache feature by configuring the system variable [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-从-v800-版本开始引入).
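
As a rough sketch of what setting the variable might look like, the helper below builds the corresponding SQL statement. It assumes that `tidb_schema_cache_size` is set at global scope and expressed in bytes; the function name `schema_cache_size_statement` and the 512 MiB value are illustrative only, so check the variable's reference for the permitted range before using a concrete value.

```python
def schema_cache_size_statement(size_mib):
    """Build a SQL statement that sets tidb_schema_cache_size to size_mib MiB.

    Assumption: the variable has global scope and its unit is bytes.
    """
    if size_mib < 0:
        raise ValueError("size_mib must be non-negative")
    size_bytes = size_mib * 1024 * 1024  # convert MiB to bytes
    return f"SET GLOBAL tidb_schema_cache_size = {size_bytes};"


print(schema_cache_size_statement(512))
# SET GLOBAL tidb_schema_cache_size = 536870912;
```

The printed statement can then be run from any SQL client connected to TiDB.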

## Best practices

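- In scenarios with a large number of databases and tables (for example, more than 100,000 databases and tables in total), or when the number of databases and tables is large enough to affect system performance, it is recommended to enable the schema cache feature.
- To check the hit rate of the schema cache, observe the **Infoschema v2 Cache Operation** subpanel under **Schema load** in TiDB monitoring. If the hit rate is low, increase [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-从-v800-版本开始引入).
- To check the size of the schema cache currently in use, observe the **Infoschema v2 Cache Size** subpanel under **Schema load** in TiDB monitoring.
- It is recommended to disable [`performance.force-init-stats`](/tidb-configuration-file.md#force-init-stats-从-v657-和-v710-版本开始引入) to reduce the TiDB startup time.
- If you need to create a large number of tables (for example, more than 100,000), it is recommended to set the [`split-table`](/tidb-configuration-file.md#split-table) parameter to `false` to reduce the number of Regions and thereby lower the memory usage of TiKV.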

## Known limitations

In scenarios with a large number of databases and tables, there are the following known issues:

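- When tables are accessed without a regular pattern, for example, one batch of tables is accessed at time t1 and a different batch at time t2, and `tidb_schema_cache_size` is set to a small value, the schema information is frequently evicted and re-cached, which causes performance jitter. This feature is better suited to scenarios where the frequently accessed databases and tables are relatively fixed.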
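- Statistics might not be collected in a timely manner.
- Access to some metadata information becomes slower.
- Switching the schema cache on or off requires waiting for a period of time.
- Operations that enumerate all metadata information become slower, such as: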

- `SHOW FULL TABLES`
- `FLASHBACK`
- `ALTER TABLE ... SET TIFLASH MODE ...`
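
The first limitation above, frequent eviction when the set of accessed tables keeps changing, can be reproduced with a small simulation. This is a simplified model, not TiDB's implementation: the `hit_rate` function is hypothetical, the cache is sized in number of schemas rather than bytes, and the access patterns are synthetic.

```python
from collections import OrderedDict


def hit_rate(accesses, capacity):
    """Replay table accesses against an LRU cache holding `capacity` schemas."""
    cache = OrderedDict()
    hits = 0
    for table in accesses:
        if table in cache:
            cache.move_to_end(table)  # mark as most recently used
            hits += 1
        else:
            cache[table] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used schema
    return hits / len(accesses)


batch_a = [f"a{i}" for i in range(100)]
batch_b = [f"b{i}" for i in range(100)]

# Stable working set: the same 100 tables are accessed in every round.
stable = batch_a * 10
# Shifting working set: rounds alternate between two disjoint batches.
shifting = (batch_a + batch_b) * 5

print(hit_rate(stable, capacity=100))    # 0.9
print(hit_rate(shifting, capacity=100))  # 0.0
print(hit_rate(shifting, capacity=200))  # 0.8
```

With a cache that fits only one batch, alternating batches evict each other and every access is a miss. Enlarging the cache to cover both batches restores a high hit rate, which mirrors the advice to increase `tidb_schema_cache_size` when the observed hit rate is low.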