Commit 67df575

evenyag and nicecui authored and committed
docs: add docs for the merge_mode option (#1061)
Co-authored-by: Yiran <cuiyiran3@gmail.com>
1 parent 48c429a commit 67df575

4 files changed: +131 -2 lines changed

docs/nightly/en/reference/sql/compatibility.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ GreptimeDB supports a subset of ANSI SQL and has some unique extensions. Some ma
 2. Insert data: Consistent with ANSI SQL syntax, but requires the `TIME INDEX` column value (or default value) to be provided.
 3. Update data: Does not support `UPDATE` syntax, but if the primary key and `TIME INDEX` corresponding column values are the same during `INSERT`, subsequent inserted rows will overwrite previously written rows, effectively achieving an update.
     * Since 0.8, GreptimeDB supports [append mode](/reference/sql/create#create-an-append-only-table) that creates an append-only table with `append_mode="true"` option which keeps duplicate rows.
+    * GreptimeDB supports [merge mode](/reference/sql/create#create-a-table-with-merge-mode): a table created with the `merge_mode="last_non_null"` option allows partial updates of fields.
 4. Query data: Query syntax is compatible with ANSI SQL, with some functional differences and omissions.
     * Does not support views.
     * TQL syntax extension: Supports executing PromQL in SQL via TQL subcommands. Please refer to the [TQL](./tql.md) section for details.
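
The difference between the default overwrite-on-insert behavior and append mode described in this hunk can be illustrated with a small model. This is only a sketch in plain Python, not GreptimeDB code: the `visible_rows` function and the three-column row layout are hypothetical, and the order in which an append-only table returns duplicate rows is an assumption made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical row layout for illustration: (host, ts, cpu).
# `host` plays the role of the primary key and `ts` the TIME INDEX.
Row = Tuple[str, int, float]


def visible_rows(writes: List[Row], append_mode: bool = False) -> List[Row]:
    """Return the rows a query would see after applying `writes` in order."""
    if append_mode:
        # Append mode: duplicates are kept. The stable sort keeps write order
        # among rows with the same (host, ts); that ordering is an assumption.
        return sorted(writes, key=lambda row: (row[0], row[1]))
    # Default: a later row with the same (host, ts) overwrites the earlier one,
    # which is how an INSERT acts as an update.
    latest: Dict[Tuple[str, int], Row] = {}
    for row in writes:
        latest[(row[0], row[1])] = row
    return [latest[key] for key in sorted(latest)]


writes = [("host1", 0, 1.0), ("host1", 0, 2.0), ("host2", 0, 3.0)]
print(visible_rows(writes))                    # [('host1', 0, 2.0), ('host2', 0, 3.0)]
print(visible_rows(writes, append_mode=True))  # all three rows are kept
```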

docs/nightly/en/reference/sql/create.md

Lines changed: 63 additions & 1 deletion
@@ -98,7 +98,8 @@ Users can add table options by using `WITH`. The valid options contain the follo
 | `compaction.twcs.max_inactive_window_files` | Max num of files that can be kept in inactive time window. | String value, such as '1'. Only available when `compaction.type` is `twcs`. |
 | `compaction.twcs.time_window` | Compaction time window | String value, such as '1d' for 1 day. The table usually partitions rows into different time windows by their timestamps. Only available when `compaction.type` is `twcs`. |
 | `memtable.type` | Type of the memtable. | String value, supports `time_series`, `partition_tree`. |
-| `append_mode` | Whether the table is append-only | String value. Default is 'false', which removes duplicate rows by primary keys and timestamps. Setting it to 'true' to enable append mode and create an append-only table which keeps duplicate rows. |
+| `append_mode` | Whether the table is append-only | String value. Default is 'false', which removes duplicate rows by primary keys and timestamps according to the `merge_mode`. Set it to 'true' to enable append mode and create an append-only table that keeps duplicate rows. |
+| `merge_mode` | The strategy to merge duplicate rows | String value. Only available when `append_mode` is 'false'. Default is `last_row`, which keeps the last row for the same primary key and timestamp. Set it to `last_non_null` to keep the last non-null value of each field for the same primary key and timestamp. |
 | `comment` | Table level comment | String value. |
 
 #### Create a table with TTL
@@ -149,6 +150,67 @@ CREATE TABLE IF NOT EXISTS temperatures(
 ) engine=mito with('append_mode'='true');
 ```
 
+#### Create a table with merge mode
+Create a table with `last_row` merge mode, which is the default merge mode.
+```sql
+create table if not exists metrics(
+    host string,
+    ts timestamp,
+    cpu double,
+    memory double,
+    TIME INDEX (ts),
+    PRIMARY KEY(host)
+)
+engine=mito
+with('merge_mode'='last_row');
+```
+
+Under `last_row` mode, the table merges rows with the same primary key and timestamp by keeping only the latest row.
+```sql
+INSERT INTO metrics VALUES ('host1', 0, 0, NULL), ('host2', 1, NULL, 1);
+INSERT INTO metrics VALUES ('host1', 0, NULL, 10), ('host2', 1, 11, NULL);
+
+SELECT * from metrics ORDER BY host, ts;
+
++-------+-------------------------+------+--------+
+| host  | ts                      | cpu  | memory |
++-------+-------------------------+------+--------+
+| host1 | 1970-01-01T00:00:00     |      | 10.0   |
+| host2 | 1970-01-01T00:00:00.001 | 11.0 |        |
++-------+-------------------------+------+--------+
+```
+
+
+Create a table with `last_non_null` merge mode.
+```sql
+create table if not exists metrics(
+    host string,
+    ts timestamp,
+    cpu double,
+    memory double,
+    TIME INDEX (ts),
+    PRIMARY KEY(host)
+)
+engine=mito
+with('merge_mode'='last_non_null');
+```
+
+Under `last_non_null` mode, the table merges rows with the same primary key and timestamp by keeping the latest non-null value of each field.
+```sql
+INSERT INTO metrics VALUES ('host1', 0, 0, NULL), ('host2', 1, NULL, 1);
+INSERT INTO metrics VALUES ('host1', 0, NULL, 10), ('host2', 1, 11, NULL);
+
+SELECT * from metrics ORDER BY host, ts;
+
++-------+-------------------------+------+--------+
+| host  | ts                      | cpu  | memory |
++-------+-------------------------+------+--------+
+| host1 | 1970-01-01T00:00:00     | 0.0  | 10.0   |
+| host2 | 1970-01-01T00:00:00.001 | 11.0 | 1.0    |
++-------+-------------------------+------+--------+
+```
+
+
 ### Column options
 
 GreptimeDB supports the following column options:
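
The two merge modes added in this hunk can be modelled with a short sketch that reproduces both query results from the documentation examples. It is plain Python written for illustration, not GreptimeDB's implementation: `merge_rows` and the `Row` layout are hypothetical names, and `None` stands in for SQL `NULL`.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory row mirroring the `metrics` table: (host, ts, cpu, memory).
Row = Tuple[str, int, Optional[float], Optional[float]]


def merge_rows(writes: List[Row], merge_mode: str = "last_row") -> List[Row]:
    """Merge rows that share (host, ts); `writes` must be in write order."""
    if merge_mode not in ("last_row", "last_non_null"):
        raise ValueError(f"unsupported merge_mode: {merge_mode!r}")
    merged: Dict[Tuple[str, int], List[Optional[float]]] = {}
    for host, ts, cpu, memory in writes:
        key = (host, ts)
        new_fields = [cpu, memory]
        if merge_mode == "last_row" or key not in merged:
            # last_row: the newest row replaces the whole previous row.
            merged[key] = new_fields
        else:
            # last_non_null: a None (NULL) field keeps the previously stored value.
            merged[key] = [
                new if new is not None else old
                for old, new in zip(merged[key], new_fields)
            ]
    return [(host, ts, merged[(host, ts)][0], merged[(host, ts)][1])
            for host, ts in sorted(merged)]


# Same writes as the two INSERT statements in the documentation example.
writes = [
    ("host1", 0, 0.0, None), ("host2", 1, None, 1.0),
    ("host1", 0, None, 10.0), ("host2", 1, 11.0, None),
]
print(merge_rows(writes, "last_row"))       # host1 keeps only memory, host2 only cpu
print(merge_rows(writes, "last_non_null"))  # every field keeps its latest non-null value
```

Under this model, a partial update of one field is a write that carries `None` for every other field, which only has the intended effect in `last_non_null` mode.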

docs/nightly/zh/reference/sql/compatibility.md

Lines changed: 1 addition & 0 deletions
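@@ -9,6 +9,7 @@ GreptimeDB's SQL is a subset of ANSI SQL and has some unique exten
 2. Insert new data: Consistent with ANSI SQL syntax, but the `TIME INDEX` column value (or a default value) must be provided.
 3. Update: `UPDATE` syntax is not supported, but if an `INSERT` carries the same primary key and `TIME INDEX` column values as an existing row, the later row overwrites the previously written row, which effectively achieves an update.
     * Since 0.8, GreptimeDB supports [append mode](/reference/sql/create#创建-Append-Only-表); a table created with the `append_mode = "true"` option keeps duplicate rows.
+    * GreptimeDB supports [merge mode](/reference/sql/create#创建带有-merge-模式的表): a table created with the `merge_mode="last_non_null"` option allows partial updates of fields.
 4. Query: The query syntax is compatible with ANSI SQL, with some functional differences and omissions
     * Views are not supported
     * TQL syntax extension: TQL subcommands support executing PromQL in SQL. See the [TQL](./tql.md) section for details.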

docs/nightly/zh/reference/sql/create.md

Lines changed: 66 additions & 1 deletion
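@@ -100,7 +100,8 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name
 | `compaction.twcs.max_inactive_window_files` | Maximum number of files in an inactive time window | String value, such as '1'. Only available when `compaction.type` is `twcs` |
 | `compaction.twcs.time_window` | Compaction time window | String value, such as '1d' for 1 day. The table partitions data into different time windows by timestamp. Only available when `compaction.type` is `twcs` |
 | `memtable.type` | Type of the memtable | String value, supports `time_series` and `partition_tree` |
-| `append_mode` | Whether the table is append-only | String value. Default is 'false', which deduplicates data by primary key and timestamp. Set it to 'true' to enable append mode and create an append-only table that keeps all duplicate rows |
+| `append_mode` | Whether the table is append-only | String value. Default is 'false', which removes duplicate rows by primary key and timestamp according to `merge_mode`. Set it to 'true' to enable append mode and create an append-only table that keeps all duplicate rows |
+| `merge_mode` | The strategy to merge duplicate rows | String value. Only available when `append_mode` is 'false'. Default is `last_row`, which keeps the last row for the same primary key and timestamp. Set it to `last_non_null` to keep the last non-null value of each field for the same primary key and timestamp. |
 | `comment` | Table-level comment | String value. |
 
 #### Create a table with a specified TTL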
@@ -151,6 +152,70 @@ CREATE TABLE IF NOT EXISTS temperatures(
 ) engine=mito with('append_mode'='true');
 ```
 
+#### Create a table with merge mode
+
+Create a table with `last_row` merge mode, which is the default merge mode.
+
+```sql
+create table if not exists metrics(
+    host string,
+    ts timestamp,
+    cpu double,
+    memory double,
+    TIME INDEX (ts),
+    PRIMARY KEY(host)
+)
+engine=mito
+with('merge_mode'='last_row');
+```
+
+Under `last_row` mode, the table merges rows with the same primary key and timestamp by keeping only the latest row.
+
+```sql
+INSERT INTO metrics VALUES ('host1', 0, 0, NULL), ('host2', 1, NULL, 1);
+INSERT INTO metrics VALUES ('host1', 0, NULL, 10), ('host2', 1, 11, NULL);
+
+SELECT * from metrics ORDER BY host, ts;
+
++-------+-------------------------+------+--------+
+| host  | ts                      | cpu  | memory |
++-------+-------------------------+------+--------+
+| host1 | 1970-01-01T00:00:00     |      | 10.0   |
+| host2 | 1970-01-01T00:00:00.001 | 11.0 |        |
++-------+-------------------------+------+--------+
+```
+
+
+Create a table with `last_non_null` merge mode.
+
+```sql
+create table if not exists metrics(
+    host string,
+    ts timestamp,
+    cpu double,
+    memory double,
+    TIME INDEX (ts),
+    PRIMARY KEY(host)
+)
+engine=mito
+with('merge_mode'='last_non_null');
+```
+
+Under `last_non_null` mode, the table merges rows with the same primary key and timestamp by keeping the latest non-null value of each field.
+
+```sql
+INSERT INTO metrics VALUES ('host1', 0, 0, NULL), ('host2', 1, NULL, 1);
+INSERT INTO metrics VALUES ('host1', 0, NULL, 10), ('host2', 1, 11, NULL);
+
+SELECT * from metrics ORDER BY host, ts;
+
++-------+-------------------------+------+--------+
+| host  | ts                      | cpu  | memory |
++-------+-------------------------+------+--------+
+| host1 | 1970-01-01T00:00:00     | 0.0  | 10.0   |
+| host2 | 1970-01-01T00:00:00.001 | 11.0 | 1.0    |
++-------+-------------------------+------+--------+
+```
 
 ### Column options
 