[typo](storage)Fixed wrong description about Storage_root_path parameter (apache#20641)
FreeOnePlus authored Jun 28, 2023
1 parent a6b51ec commit 274203a
Showing 3 changed files with 56 additions and 63 deletions.
10 changes: 4 additions & 6 deletions conf/be.conf
@@ -54,13 +54,11 @@ enable_auth = false
# priority_networks = 10.10.10.0/24;192.168.0.0/16

# data root path, separate by ';'
# you can specify the storage medium of each root path, HDD or SSD
# you can add capacity limit at the end of each root path, separate by ','
# You can specify the storage type for each root path, HDD (cold data) or SSD (hot data)
# eg:
# storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris
# /home/disk1/doris.HDD, capacity limit is 50GB, HDD;
# /home/disk2/doris.SSD, capacity limit is 1GB, SSD;
# /home/disk2/doris, capacity limit is disk capacity, HDD(default)
# storage_root_path = /home/disk1/doris;/home/disk2/doris;/home/disk2/doris
# storage_root_path = /home/disk1/doris,medium:SSD;/home/disk2/doris,medium:SSD;/home/disk2/doris,medium:HDD
# /home/disk2/doris,medium:HDD(default)
#
# you also can specify the properties by setting '<property>:<value>', separate by ','
# property 'medium' has a higher priority than the extension of path
81 changes: 39 additions & 42 deletions docs/en/docs/install/standard-deployment.md
@@ -99,7 +99,7 @@ Both ext4 and xfs file systems are supported.
>
> 1. FE nodes are divided into Followers and Observers based on their roles. (Leader is an elected role in the Follower group, hereinafter referred to as Follower, too.)
> 2. The number of FE nodes should be at least 1 (1 Follower). If you deploy 1 Follower and 1 Observer, you can achieve high read availability; if you deploy 3 Followers, you can achieve high read-write availability (HA).
> 3. The number of Followers **must be** odd, and there is no limit on the number of Observers.
> 3. Although multiple BEs can be deployed on one machine, deploying **only one instance** per machine is recommended, and **only one FE** can be deployed per machine. If 3 replicas of data are required, at least 3 machines are needed, each running one BE instance (instead of 1 machine running 3 BE instances). **The clocks of the servers hosting multiple FEs must be consistent (a clock deviation of up to 5 seconds is allowed)**.
> 4. According to past experience, for business that requires high cluster availability (e.g. online service providers), we recommend that you deploy 3 Followers and 1-3 Observers; for offline business, we recommend that you deploy 1 Follower and 1-3 Observers.
* **Usually we recommend 10 to 100 machines to give full play to Doris' performance (deploy FE on 3 of them (HA) and BE on the rest).**
@@ -187,80 +187,77 @@ See the `lower_case_table_names` section in [Variables](../advanced/variables.md

* For details about deployment of multiple FEs, see the [FE scaling](https://doris.apache.org/docs/dev/admin-manual/cluster-management/elastic-expansion/) section.

#### Deploy BE
#### BE Deployment

* Copy the BE deployment file into all nodes that are to deploy BE on
* Copy the BE deployment file to all nodes to deploy BE

Find the BE folder under the output generated by source code compilation and copy it to the specified deployment paths of the BE nodes.
Copy the be folder under the output generated by source code compilation to the specified deployment path of the BE node.

* Modify all BE configurations

Modify be/conf/be.conf, which mainly involves configuring `storage_root_path`: data storage directory. By default, under be/storage, the directory needs to be **created manually**. Use `;` to separate multiple paths (do not add `;` after the last directory).
> Note: The `output/be/lib/debug_info/` directory contains debugging information files, which are relatively large. These files are not needed at runtime and do not have to be deployed.
* Modify all BE configurations

You may specify the directory storage medium in the path: HDD or SSD. You may also add capacity limit to the end of every path and use `,` for separation. Unless you use a mix of SSD and HDD disks, you do not need to follow the configuration methods in Example 1 or Example 2 below, but only need to specify the storage directory; you do not need to modify the default storage medium configuration of FE, either.
Modify be/conf/be.conf. The main setting is `storage_root_path`, the data storage directory. By default it is under be/storage. If you need to specify a different directory, you must **create the directory in advance**. Multiple paths are separated by an ASCII semicolon `;` (**do not add `;`** after the last directory).
The hot and cold data storage directories within a node can be distinguished by path: HDD (cold data directory) or SSD (hot data directory). If you do not need the hot/cold mechanism within the BE node, configure only the path without specifying the medium type. You do not need to modify the default storage medium configuration of FE either.

Example 1:
**Notice:**
1. If you specify the storage type of the storage path, at least one path must have a storage type of HDD (cold data directory)!
2. If the storage type of the storage path is not specified, all are HDD (cold data directory) by default.
3. The HDD and SSD here have nothing to do with the physical storage medium, but only to distinguish the storage type of the storage path, that is, you can mark a certain directory on the disk of the HDD medium as SSD (hot data directory).
4. Here HDD and SSD **MUST** be capitalized!

Note: For SSD disks, add `.SSD` to the end of the directory; for HDD disks, add `.HDD`.
Example 1 is as follows:

`storage_root_path=/home/disk1/doris.HDD;/home/disk2/doris.SSD;/home/disk2/doris`
`storage_root_path=/home/disk1/doris;/home/disk2/doris;/home/disk2/doris`

**Description**
Example 2 is as follows:

* 1./home/disk1/doris.HDD: The storage medium is HDD;
* 2./home/disk2/doris.SSD: The storage medium is SSD;
* 3./home/disk2/doris: The storage medium is HDD (default).
**Use the storage_root_path parameter to specify medium**

Example 2:
`storage_root_path=/home/disk1/doris,medium:HDD;/home/disk2/doris,medium:SSD`

Note: You do not need to add the `.SSD` or `.HDD` suffix, but to specify the medium in the `storage_root_path` parameter
**Description**

`storage_root_path=/home/disk1/doris,medium:HDD;/home/disk2/doris,medium:SSD`

**Description**

- /home/disk1/doris,medium:HDD: Indicates that the directory stores cold data;
- /home/disk2/doris,medium:SSD: Indicates that the directory stores hot data;

* 1./home/disk1/doris,medium:HDD : The storage medium is HDD;
* 2./home/disk2/doris,medium:SSD : The storage medium is SSD.
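
The `storage_root_path` rules described above can be sketched as a small parser. This is a minimal illustration in Python, not Doris's actual implementation: `RootPath` and `parse_storage_root_path` are hypothetical names, only the `medium` property is interpreted, and rejecting a trailing `;` is a choice made for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RootPath:
    path: str
    medium: str  # "HDD" (cold data) or "SSD" (hot data)


def parse_storage_root_path(value: str) -> List[RootPath]:
    """Parse a storage_root_path value (hypothetical helper, not Doris code)."""
    roots = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            # Assumption of this sketch: an empty entry (e.g. a trailing ';') is an error.
            raise ValueError("empty entry: check for a trailing ';'")
        path, *props = entry.split(",")
        medium = "HDD"  # the default when no medium is specified
        for prop in props:
            key, sep, val = prop.partition(":")
            if not sep or key.strip() != "medium":
                continue  # other properties are ignored in this sketch
            val = val.strip()
            if val not in ("HDD", "SSD"):
                # HDD and SSD must be written in capital letters.
                raise ValueError(f"invalid medium {val!r}: use HDD or SSD")
            medium = val
        roots.append(RootPath(path.strip(), medium))
    # At least one path must be HDD; unmarked paths already count as HDD.
    if not any(root.medium == "HDD" for root in roots):
        raise ValueError("at least one path must have medium HDD")
    return roots


if __name__ == "__main__":
    example = "/home/disk1/doris,medium:HDD;/home/disk2/doris,medium:SSD"
    for root in parse_storage_root_path(example):
        print(root.path, root.medium)
```

A check like this could be run before starting BE to catch a lowercase `ssd` or an all-SSD configuration early.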
* BE webserver_port port configuration

* BE webserver_port configuration
If BE is deployed in a Hadoop cluster, adjust `webserver_port = 8040` in be.conf to avoid port conflicts.

If the BE component is installed in a Hadoop cluster, you need to change the `webserver_port=8040` setting to avoid a port conflict.
* Configure the JAVA_HOME environment variable

* Set JAVA_HOME environment variable
<version since="1.2.0"></version>
Since Java UDF functions are supported from version 1.2, BE depends on the Java environment. Configure the `JAVA_HOME` environment variable before starting BE. Alternatively, add `export JAVA_HOME=your_java_home_path` to the first line of the `start_be.sh` startup script.

<version since="1.2.0"></version>
Java UDF is supported since version 1.2, so BEs are dependent on the Java environment. It is necessary to set the `JAVA_HOME` environment variable before starting. You can also do this by adding `export JAVA_HOME=your_java_home_path` to the first line of the `start_be.sh` startup script.
* Install Java UDF functions

* Install Java UDF
<version since="1.2.0">Install Java UDF functions</version>
Because Java UDF functions are supported from version 1.2, you need to download the JAR package of Java UDF functions from the official website and put them in the lib directory of BE, otherwise it may fail to start.

<version since="1.2.0"></version>
Because Java UDF is supported since version 1.2, you need to download the JAR package of Java UDF from the official website and put them under the lib directory of BE, otherwise it may fail to start.

* Add all BE nodes to FE
* Add all BE nodes in FE

BE nodes need to be added in FE before they can join the cluster. You can use mysql-client ([Download MySQL 5.7](https://dev.mysql.com/downloads/mysql/5.7.html)) to connect to FE:
BE nodes need to be added in FE before they can join the cluster. You can use mysql-client ([Download MySQL 5.7](https://dev.mysql.com/downloads/mysql/5.7.html)) to connect to FE:

`./mysql-client -h fe_host -P query_port -uroot`
`./mysql-client -h fe_host -P query_port -uroot`

`fe_host` is the node IP where FE is located; `query_port` is in fe/conf/fe.conf; the root account is used by default and no password is required in login.
Here, `fe_host` is the IP of the node where FE is located; `query_port` is defined in fe/conf/fe.conf; the root account is used by default, and no password is required to log in.

After login, execute the following command to add all the BE host and heartbeat service port:
Once logged in, execute the following command to add each BE:

`ALTER SYSTEM ADD BACKEND "be_host:heartbeat_service_port";`
`ALTER SYSTEM ADD BACKEND "be_host:heartbeat_service_port";`

`be_host` is the node IP where BE is located; `heartbeat_service_port` is in be/conf/be.conf.
Here, `be_host` is the IP of the node where BE is located; `heartbeat_service_port` is defined in be/conf/be.conf.
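
When several BE nodes have to be added, the statements can be generated rather than typed by hand. The following is a minimal sketch in Python; `add_backend_statements` is a hypothetical helper, and the host IPs and the port value `9050` are placeholders to be replaced with the real BE addresses and the `heartbeat_service_port` from be/conf/be.conf.

```python
from typing import Iterable, List


def add_backend_statements(be_hosts: Iterable[str], heartbeat_service_port: int) -> List[str]:
    """Build one ALTER SYSTEM ADD BACKEND statement per BE host (hypothetical helper)."""
    return [
        f'ALTER SYSTEM ADD BACKEND "{host}:{heartbeat_service_port}";'
        for host in be_hosts
    ]


if __name__ == "__main__":
    # Placeholder hosts and port; read the real port from be/conf/be.conf.
    for statement in add_backend_statements(["10.0.0.1", "10.0.0.2"], 9050):
        print(statement)
```

The printed statements can then be pasted into the mysql-client session connected to FE.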

* Start BE

`bin/start_be.sh --daemon`
`bin/start_be.sh --daemon`

The BE process will start and run in the background. Logs are stored in the be/log/ directory by default. If startup fails, you can view error messages in be/log/be.log or be/log/be.out.
The BE process will start and enter the background execution. Logs are stored in the be/log/ directory by default. If the startup fails, you can view the error message by viewing be/log/be.log or be/log/be.out.

* View BE status

Connect to FE using mysql-client and execute `SHOW PROC '/backends';` to view BE operation status. If everything goes well, the `isAlive` column should be `true`.
Use mysql-client to connect to FE, and execute `SHOW PROC '/backends';` to check the running status of BE. If all is well, the `isAlive` column should be `true`.

#### (Optional) FS_Broker Deployment

28 changes: 13 additions & 15 deletions docs/zh-CN/docs/install/standard-deployment.md
@@ -88,7 +88,7 @@ Both ext4 and xfs file systems are supported.
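> Note 1:
> 1. FE disk space is mainly used to store metadata, including logs and images. It usually ranges from a few hundred MB to several GB.
> 2. BE disk space is mainly used to store user data. Total disk space is calculated as the total user data volume * 3 (3 replicas), plus an additional 40% reserved for background compaction and some intermediate data.
> 3. Multiple BE instances can be deployed on one machine, but **only one FE can be deployed**. If 3 replicas of data are required, at least 3 machines are needed, each running one BE instance (rather than 1 machine running 3 BE instances). **The clocks of the servers hosting multiple FEs must be consistent (a clock deviation of up to 5 seconds is allowed)**
> 3. Although multiple BEs can be deployed on one machine, **deploying only one instance is recommended**, and **only one FE can be deployed**. If 3 replicas of data are required, at least 3 machines are needed, each running one BE instance (rather than 1 machine running 3 BE instances). **The clocks of the servers hosting multiple FEs must be consistent (a clock deviation of up to 5 seconds is allowed)**
> 4. A test environment can also use just one BE. In a real production environment, the number of BE instances directly determines overall query latency.
> 5. Disable swap on all deployment nodes.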
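
The BE disk sizing rule in item 2 can be written out as a small calculation. This is a sketch in Python under one stated assumption: the extra 40% is applied on top of the replicated data size. The function name `be_disk_space_gb` and the 1000 GB input are illustrative only.

```python
def be_disk_space_gb(user_data_gb: float, replicas: int = 3, reserve_ratio: float = 0.4) -> float:
    """Estimate total BE disk space: replicated data plus a reserve (hypothetical helper)."""
    replicated = user_data_gb * replicas
    # Assumption: the 40% reserve for compaction and intermediate data
    # is added on top of the replicated size.
    return replicated + replicated * reserve_ratio


if __name__ == "__main__":
    # Illustrative input: 1000 GB of user data, 3 replicas, 40% reserve.
    print(be_disk_space_gb(1000))
```

With 1000 GB of user data this gives roughly 4200 GB across all BE nodes: 3000 GB for the three replicas plus 1200 GB of reserve.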
@@ -192,31 +192,29 @@ By default, Doris table names are case-sensitive; if case-insensitive table names are needed, you need to
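* Modify all BE configurations

Modify be/conf/be.conf. The main setting is `storage_root_path`, the data storage directory. By default it is under be/storage, and the directory needs to be **created manually**. Multiple paths are separated by an ASCII semicolon `;` (**do not add `;` after the last directory**).
The storage medium of a directory, HDD or SSD, can be distinguished by path. A capacity limit can be added at the end of each path, separated by an ASCII comma `,`. If you are not using a mix of SSD and HDD disks, you do not need to follow the configuration methods in Example 1 and Example 2 below; just specify the storage directory. You do not need to modify the default storage medium configuration of FE either
Modify be/conf/be.conf. The main setting is `storage_root_path`, the data storage directory. By default it is under be/storage; if you need to specify a directory, you must **create the directory in advance**. Multiple paths are separated by an ASCII semicolon `;` (**do not add `;` after the last directory**).
The hot and cold data storage directories within a node can be distinguished by path: HDD (cold data directory) or SSD (hot data directory). If you do not need the hot/cold mechanism within the BE node, configure only the path without specifying the medium type. You do not need to modify the default storage medium configuration of FE either

Example 1:

**Note: for SSD disks, append `.SSD` to the directory; for HDD disks, append `.HDD`**

`storage_root_path=/home/disk1/doris.HDD;/home/disk2/doris.SSD;/home/disk2/doris`
**Note:**
1. If you specify the storage type of the storage paths, at least one path must have the storage type HDD (cold data directory)!
2. If the storage type of the storage paths is not specified, all paths default to HDD (cold data directory).
3. HDD and SSD here have nothing to do with the physical storage medium; they only distinguish the storage type of the storage path. That is, you can mark a directory on an HDD disk as SSD (hot data directory).
4. HDD and SSD here **MUST** be capitalized!

**Description**
Example 1:

- /home/disk1/doris.HDD: the storage medium is HDD;
- /home/disk2/doris.SSD: the storage medium is SSD;
- /home/disk2/doris: the storage medium is HDD (default)
`storage_root_path=/home/disk1/doris;/home/disk2/doris;/home/disk2/doris`

Example 2:

**Note: whether the directory is on an HDD or an SSD disk, no suffix is needed; just specify the medium in the storage_root_path parameter**
**Specify the medium in the storage_root_path parameter**

`storage_root_path=/home/disk1/doris,medium:HDD;/home/disk2/doris,medium:SSD`

**Description**

- /home/disk1/doris,medium:HDD: the storage medium is HDD;
- /home/disk2/doris,medium:SSD: the storage medium is SSD;
- /home/disk1/doris,medium:HDD: the directory stores cold data;
- /home/disk2/doris,medium:SSD: the directory stores hot data;

* BE webserver_port configuration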
