docs(reference): update data-types (GreptimeTeam#629)
Co-authored-by: Yiran <cuiyiran3@gmail.com>
QuenKar and nicecui authored Oct 12, 2023
1 parent ca94130 commit b734b3f
Showing 2 changed files with 169 additions and 79 deletions.
131 changes: 89 additions & 42 deletions docs/en/v0.4/reference/data-types.md
@@ -2,57 +2,104 @@

GreptimeDB supports the following data types:

| Type name | Description | Synonyms | Size |
| Type name | Description | Aliases | Size |
|:-:|:-:| :-:| :-:|
|`tinyint`| 8-bit signed small integers between -128~127|| 1 Byte |
|`smallint`| 16-bit signed big integers between -32768~32767 | |2 Bytes |
|`int`| 32-bit signed integers between -2147483648~2147483647| `integer`| 4 Bytes |
|`bigint`| 64-bit signed big integers between -9223372036854775808~9223372036854775807| | 8 Bytes |
|`varchar`|UTF-8 encoded strings|`text`<br />/`string`<br />/ `char`| The length of the strings |
|`float`|32-bit IEEE754 floating point values || 4 Bytes |
|`double`|Double precision IEEE 754 floating point values|| 8 Bytes |
|`boolean`|Boolean values|| 1 Byte |
|`varbinary`|Variable length binary values| | The length of the data + 2 bytes|
|`date`|32-bit date values|| 4 Bytes |
|`datetime`|64-bit datetime values|| 8 Bytes |
|`timestamp[(0/3/6/9)]`|64-bit timestamp values with optional precision. <br /> For example, `timestamp(0)` represents timestamp type with seconds precision, `timestamp(3)` represents milliseconds precision, `timestamp(6)` for microseconds and `timestamp(9)` for nanoseconds. If no precision is given, the timestamp is in **milliseconds** precision by default.|| 8 Bytes |
|`tinyint`| -128 ~ 127|`Int8`| 1 Byte |
|`smallint`| -32768 ~ 32767 | `Int16`|2 Bytes |
|`int`| -2147483648 ~ 2147483647| `Int32`| 4 Bytes |
|`bigint`| -9223372036854775808 ~ 9223372036854775807| `Int64` | 8 Bytes |
|`varchar`|UTF-8 encoded strings|`Text`<br />/`String`<br />/ `Char`| The length of the strings |
|`float`|32-bit IEEE754 floating point values |`Float32`| 4 Bytes |
|`double`|Double precision IEEE 754 floating point values|`Float64`| 8 Bytes |
|`boolean`|bool values|`Boolean`| 1 Byte |
|`varbinary`|Variable length binary values| `Binary`| The length of the data + 2 bytes|
|`date`|32-bit date values represent the days since UNIX Epoch |`Date`| 4 Bytes |
|`datetime`|64-bit datetime values represent the milliseconds since UNIX Epoch|`DateTime`| 8 Bytes |
|`timestamp[(0/3/6/9)]`|64-bit timestamp values with optional precision. <br /> For example, `timestamp(0)` represents timestamp type with seconds precision, `timestamp(3)` represents milliseconds precision, `timestamp(6)` for microseconds and `timestamp(9)` for nanoseconds. If no precision is given, the timestamp is in **milliseconds** precision by default.|`TimestampSecond`<br />/`TimestampMillisecond`<br />/`TimestampMicrosecond`<br />/`TimestampNanosecond` | 8 Bytes |

## Unsigned version of integer types

`int` / `tinyint` / `smallint` / `bigint` also have unsigned versions, and their corresponding value ranges are:

- `int unsigned`: 0~4294967295
- `tinyint unsigned`: 0~255
- `smallint unsigned`: 0~65535
- `bigint unsigned`: 0~18446744073709551615
- `int unsigned / UInt32` : 0 ~ 4294967295
- `tinyint unsigned / UInt8` : 0 ~ 255
- `smallint unsigned / UInt16` : 0 ~ 65535
- `bigint unsigned / UInt64` : 0 ~ 18446744073709551615
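
As a rough illustration of how the unsigned types above might appear in a table definition, the sketch below declares one column per unsigned type. The table and column names are hypothetical, and it assumes the `<type> unsigned` spelling from the list is accepted verbatim in a column definition; the ranges in the comments are the ones listed above.

```sql
-- Hypothetical table; all names are for illustration only.
CREATE TABLE unsigned_demo (
  host STRING,
  u8 TINYINT UNSIGNED,   -- 0 ~ 255
  u16 SMALLINT UNSIGNED, -- 0 ~ 65535
  u32 INT UNSIGNED,      -- 0 ~ 4294967295
  u64 BIGINT UNSIGNED,   -- 0 ~ 18446744073709551615
  ts TIMESTAMP TIME INDEX,
  PRIMARY KEY(host));
```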



## Variable-sized type limitations

The maximum capacity of variable-sized types, such as `string` and `varbinary`, is determined by their encoding and by how the storage engine handles them.

For example, `string` values are encoded as UTF-8. If every character is 3 bytes long, a field can store up to 715827882 characters. A `varbinary` value can store at most 2147483647 bytes.
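
The 715827882 figure follows from dividing the byte limit by the character width. A worked version, assuming the 2147483647-byte cap quoted for `varbinary` also applies to `string` values (the text implies this but does not state it):

$$
\left\lfloor \frac{2^{31} - 1}{3} \right\rfloor = \left\lfloor \frac{2147483647}{3} \right\rfloor = 715827882
$$

Under the same assumption, 1-byte (ASCII) characters would allow 2147483647 characters and 4-byte characters would allow 536870911.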

### Choose the data type for timestamp column

GreptimeDB allows users to choose `bigint` or `timestamp` for the timestamp index column.
Both `bigint` and `timestamp` are interpreted as timestamps in millisecond precision.

```SQL
-- using TIMESTAMP as timestamp column data type
CREATE TABLE monitor (
host STRING,
ts TIMESTAMP,
cpu DOUBLE DEFAULT 0,
memory DOUBLE,
TIME INDEX (ts),
PRIMARY KEY(host)) ENGINE=mito WITH(regions=1);

-- using BIGINT as timestamp column data type is also allowed
CREATE TABLE monitor (
host STRING,
ts BIGINT,
cpu DOUBLE DEFAULT 0,
memory DOUBLE,
TIME INDEX (ts),
PRIMARY KEY(host)) ENGINE=mito WITH(regions=1);
```

## Timestamp type alias

The `timestamp` type comes with various aliases; the recommended names are `TimestampSecond`, `TimestampMillisecond`, `TimestampMicrosecond`, and `TimestampNanosecond`.

The following table lists the corresponding aliases.

|Type name|Alias|
|:-|:-:|
|TimestampSecond| Timestamp_s, Timestamp_sec, Timestamp(0)|
|TimestampMillisecond| Timestamp, Timestamp_ms, Timestamp(3)|
|TimestampMicrosecond| Timestamp_us, Timestamp(6)|
|TimestampNanosecond| Timestamp_ns, Timestamp(9)|

## Examples

### Create Table

```sql
CREATE TABLE data_types (
s String,
vbi Binary,
b Boolean,
tint Int8,
sint Int16,
i Int32,
bint Int64,
utint UInt8,
usint UInt16,
ui UInt32,
ubint UInt64,
f Float32,
d Float64,
dt Date,
dtt DateTime,
ts0 TimestampSecond,
ts3 TimestampMillisecond,
ts6 TimestampMicrosecond,
ts9 TimestampNanosecond DEFAULT CURRENT_TIMESTAMP TIME INDEX,
PRIMARY KEY(s));
```
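
The short timestamp aliases from the alias table should be interchangeable with the long names used above. A minimal sketch, assuming the aliases are accepted in column definitions exactly as listed; the table and column names are hypothetical:

```sql
-- Hypothetical table; each column uses a short alias instead of the long type name.
CREATE TABLE timestamp_alias_demo (
  host STRING,
  t_s Timestamp_s,            -- listed as an alias of TimestampSecond
  t_us Timestamp_us,          -- listed as an alias of TimestampMicrosecond
  t_ns Timestamp_ns,          -- listed as an alias of TimestampNanosecond
  ts Timestamp_ms TIME INDEX, -- listed as an alias of TimestampMillisecond
  PRIMARY KEY(host));
```

Running `describe table timestamp_alias_demo;` afterwards would be one way to check how each alias was resolved.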

### Describe Table

```sh
> describe table data_types;
+--------+----------------------+------+------+---------------------+---------------+
| Column | Type                 | Key  | Null | Default             | Semantic Type |
+--------+----------------------+------+------+---------------------+---------------+
| s      | String               | PRI  | YES  |                     | TAG           |
| vbi    | Binary               |      | YES  |                     | FIELD         |
| b      | Boolean              |      | YES  |                     | FIELD         |
| tint   | Int8                 |      | YES  |                     | FIELD         |
| sint   | Int16                |      | YES  |                     | FIELD         |
| i      | Int32                |      | YES  |                     | FIELD         |
| bint   | Int64                |      | YES  |                     | FIELD         |
| utint  | UInt8                |      | YES  |                     | FIELD         |
| usint  | UInt16               |      | YES  |                     | FIELD         |
| ui     | UInt32               |      | YES  |                     | FIELD         |
| ubint  | UInt64               |      | YES  |                     | FIELD         |
| f      | Float32              |      | YES  |                     | FIELD         |
| d      | Float64              |      | YES  |                     | FIELD         |
| dt     | Date                 |      | YES  |                     | FIELD         |
| dtt    | DateTime             |      | YES  |                     | FIELD         |
| ts0    | TimestampSecond      |      | YES  |                     | FIELD         |
| ts3    | TimestampMillisecond |      | YES  |                     | FIELD         |
| ts6    | TimestampMicrosecond |      | YES  |                     | FIELD         |
| ts9    | TimestampNanosecond  | PRI  | NO   | current_timestamp() | TIMESTAMP     |
+--------+----------------------+------+------+---------------------+---------------+
```
117 changes: 80 additions & 37 deletions docs/zh/v0.4/reference/data-types.md
@@ -2,57 +2,100 @@

GreptimeDB supports the following data types:

| Type name | Description | Synonyms | Size |
| Type name | Description | Aliases | Size |
|:-:|:-:| :-:| :-:|
|`tinyint`| 8-bit signed small integers between -128~127|| 1 Byte |
|`smallint`| 16-bit signed big integers between -32768~32767 | |2 Bytes |
|`int`| 32-bit signed integers between -2147483648~2147483647| `integer`| 4 Bytes |
|`bigint`| 64-bit signed big integers between -9223372036854775808~9223372036854775807| | 8 Bytes |
|`varchar`|UTF-8 encoded strings|`text`<br />/`string`<br />/ `char`| The length of the strings |
|`float`|32-bit IEEE754 floating point values || 4 Bytes |
|`double`|Double precision IEEE 754 floating point values|| 8 Bytes |
|`boolean`|Boolean values|| 1 Byte |
|`varbinary`|Variable length binary values| | The length of the data + 2 bytes|
|`date`|32-bit date values|| 4 Bytes |
|`datetime`|64-bit datetime values|| 8 Bytes |
|`timestamp[(0/3/6/9)]`|64-bit timestamp values with optional precision. <br /> For example, `timestamp(0)` represents timestamp type with seconds precision, `timestamp(3)` represents milliseconds precision, `timestamp(6)` for microseconds and `timestamp(9)` for nanoseconds. If no precision is given, the timestamp is in **milliseconds** precision by default.|| 8 Bytes |
|`tinyint`| -128 ~ 127|`Int8`| 1 Byte |
|`smallint`| -32768 ~ 32767 | `Int16`|2 Bytes |
|`int`| -2147483648 ~ 2147483647| `Int32`| 4 Bytes |
|`bigint`| -9223372036854775808 ~ 9223372036854775807| `Int64` | 8 Bytes |
|`varchar`|UTF-8 encoded strings|`Text`<br />/`String`<br />/ `Char`| The length of the strings |
|`float`|32-bit IEEE754 floating point values |`Float32`| 4 Bytes |
|`double`|Double precision IEEE 754 floating point values|`Float64`| 8 Bytes |
|`boolean`|bool values|`Boolean`| 1 Byte |
|`varbinary`|Variable length binary values| `Binary`| The length of the data + 2 bytes|
|`date`|32-bit date values represent the days since UNIX Epoch |`Date`| 4 Bytes |
|`datetime`|64-bit datetime values represent the milliseconds since UNIX Epoch|`DateTime`| 8 Bytes |
|`timestamp[(0/3/6/9)]`|64-bit timestamp values with optional precision. <br /> For example, `timestamp(0)` represents timestamp type with seconds precision, `timestamp(3)` represents milliseconds precision, `timestamp(6)` for microseconds and `timestamp(9)` for nanoseconds. If no precision is given, the timestamp is in **milliseconds** precision by default.|`TimestampSecond`<br />/`TimestampMillisecond`<br />/`TimestampMicrosecond`<br />/`TimestampNanosecond` | 8 Bytes |

## Unsigned version of integer types

`int` / `tinyint` / `smallint` / `bigint` also have unsigned versions, and their corresponding value ranges are:

- `int unsigned`: 0~4294967295
- `tinyint unsigned`: 0~255
- `smallint unsigned`: 0~65535
- `bigint unsigned`: 0~18446744073709551615
- `int unsigned / UInt32` : 0 ~ 4294967295
- `tinyint unsigned / UInt8` : 0 ~ 255
- `smallint unsigned / UInt16` : 0 ~ 65535
- `bigint unsigned / UInt64` : 0 ~ 18446744073709551615

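## Variable-sized type limitations

The maximum capacity of variable-sized types, such as `string` and `varbinary`, is determined by their encoding and by how the storage engine handles them.

For example, `string` values are encoded as UTF-8. If every character is 3 bytes long, a field can store up to 715827882 characters. A `varbinary` value can store at most 2147483647 bytes.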

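### Choose the data type for timestamp column
## Timestamp type alias

GreptimeDB allows users to choose `bigint` or `timestamp` for the timestamp index column.
Both `bigint` and `timestamp` are interpreted as timestamps in millisecond precision.
The `timestamp` type comes with various aliases; the recommended names are `TimestampSecond`, `TimestampMillisecond`, `TimestampMicrosecond`, and `TimestampNanosecond`. The following table lists the corresponding aliases.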

|Type name|Alias|
|:-|:-:|
|TimestampSecond| Timestamp_s, Timestamp_sec, Timestamp(0)|
|TimestampMillisecond| Timestamp, Timestamp_ms, Timestamp(3)|
|TimestampMicrosecond| Timestamp_us, Timestamp(6)|
|TimestampNanosecond| Timestamp_ns, Timestamp(9)|

## Examples

### Create Table

```sql
# using TIMESTAMP as timestamp column data type
CREATE TABLE monitor (
host STRING,
ts TIMESTAMP,
cpu DOUBLE DEFAULT 0,
memory DOUBLE,
TIME INDEX (ts),
PRIMARY KEY(host)) ENGINE=mito WITH(regions=1);

# using BIGINT as timestamp column data type is also allowed
CREATE TABLE monitor (
host STRING,
ts BIGINT,
cpu DOUBLE DEFAULT 0,
memory DOUBLE,
TIME INDEX (ts),
PRIMARY KEY(host)) ENGINE=mito WITH(regions=1);
CREATE TABLE data_types (
s String,
vbi Binary,
b Boolean,
tint Int8,
sint Int16,
i Int32,
bint Int64,
utint UInt8,
usint UInt16,
ui UInt32,
ubint UInt64,
f Float32,
d Float64,
dt Date,
dtt DateTime,
ts0 TimestampSecond,
ts3 TimestampMillisecond,
ts6 TimestampMicrosecond,
ts9 TimestampNanosecond DEFAULT CURRENT_TIMESTAMP TIME INDEX,
PRIMARY KEY(s));
```

### Describe Table

```sh
> describe table data_types;
+--------+----------------------+------+------+---------------------+---------------+
| Column | Type                 | Key  | Null | Default             | Semantic Type |
+--------+----------------------+------+------+---------------------+---------------+
| s      | String               | PRI  | YES  |                     | TAG           |
| vbi    | Binary               |      | YES  |                     | FIELD         |
| b      | Boolean              |      | YES  |                     | FIELD         |
| tint   | Int8                 |      | YES  |                     | FIELD         |
| sint   | Int16                |      | YES  |                     | FIELD         |
| i      | Int32                |      | YES  |                     | FIELD         |
| bint   | Int64                |      | YES  |                     | FIELD         |
| utint  | UInt8                |      | YES  |                     | FIELD         |
| usint  | UInt16               |      | YES  |                     | FIELD         |
| ui     | UInt32               |      | YES  |                     | FIELD         |
| ubint  | UInt64               |      | YES  |                     | FIELD         |
| f      | Float32              |      | YES  |                     | FIELD         |
| d      | Float64              |      | YES  |                     | FIELD         |
| dt     | Date                 |      | YES  |                     | FIELD         |
| dtt    | DateTime             |      | YES  |                     | FIELD         |
| ts0    | TimestampSecond      |      | YES  |                     | FIELD         |
| ts3    | TimestampMillisecond |      | YES  |                     | FIELD         |
| ts6    | TimestampMicrosecond |      | YES  |                     | FIELD         |
| ts9    | TimestampNanosecond  | PRI  | NO   | current_timestamp() | TIMESTAMP     |
+--------+----------------------+------+------+---------------------+---------------+
```
