[improvement](doc) add missing documents (apache#14460)
morningman authored Nov 22, 2022
1 parent ab83465 commit 6eeebd4
Showing 19 changed files with 733 additions and 85 deletions.
64 changes: 64 additions & 0 deletions docs/en/community/how-to-contribute/contribute-doc.md
@@ -312,6 +312,70 @@ Directory structure description:
All images are in the `static/images` directory
## How to write SQL manual
SQL manual docs refer to the documentation under `docs/sql-manual`. These documents are used in two places:
1. The official website documentation.
2. The output of the HELP command.
To support HELP command output, these documents must be written in strict accordance with the following format; otherwise they will fail the admission check.
An example of the `SHOW ALTER` command is as follows:
```
---
{
"title": "SHOW-ALTER",
"language": "en"
}
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
## SHOW-ALTER
### Name
SHOW ALTER
### Description
(Describe the syntax)
### Example
(Give some examples)
### Keywords
SHOW, ALTER
### Best Practice
(Optional)
```
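Once such a document is merged, its content can be retrieved through the HELP command in a MySQL client. A minimal sketch (the exact quoting convention is an assumption and may differ across versions):

```sql
-- Assumed invocation: look up the manual entry registered for SHOW ALTER
HELP 'SHOW ALTER';
```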
Note that, in both Chinese and English documents, the above headings are in English, and pay attention to the heading levels.
## Multiple Versions
Website documentation supports version tagging via HTML tags. You can use the `<version>` tag to mark the version in which a section of content was introduced, or the version in which it was removed.
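For example, following the usage elsewhere in this commit (the sentence inside is placeholder content):

```
<version since="1.2.0">

This section describes a feature introduced in version 1.2.0.

</version>
```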
49 changes: 49 additions & 0 deletions docs/en/docs/admin-manual/config/config-dir.md
@@ -0,0 +1,49 @@
---
{
"title": "Config Dir",
"language": "en"
}
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Config Dir

The configuration file directory for FE and BE is `conf/`. In addition to holding the default fe.conf, be.conf and other files, this directory also serves as the common storage directory for configuration files.

Users can store some configuration files in it, and the system will automatically read them.

<version since="1.2.0">

## hdfs-site.xml and hive-site.xml

Some Doris features need to access data on HDFS or access the Hive metastore.

You can manually fill in the various HDFS/Hive parameters in the corresponding statement.

But there are many such parameters, and filling them all in manually is cumbersome.

Instead, users can place the HDFS or Hive configuration files, hdfs-site.xml/hive-site.xml, directly in the `conf/` directory, and Doris will automatically read them.

Configurations that the user fills in a command will override the corresponding items in the configuration files.

In this way, users only need to fill in a small amount of configuration to access HDFS/Hive.
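For illustration, a minimal `conf/hdfs-site.xml` sketch for an HA HDFS cluster, using the HA-related property names that appear later in this commit (the nameservice name and hosts are placeholders):

```
<configuration>
    <!-- Logical name for this HA nameservice (placeholder) -->
    <property>
        <name>dfs.nameservices</name>
        <value>my_ha</value>
    </property>
    <!-- Unique identifiers for each NameNode in the nameservice -->
    <property>
        <name>dfs.ha.namenodes.my_ha</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.my_ha.nn1</name>
        <value>nn1_host:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.my_ha.nn2</name>
        <value>nn2_host:8020</value>
    </property>
    <!-- Class HDFS clients use to contact the Active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.my_ha</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
</configuration>
```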

</version>
13 changes: 13 additions & 0 deletions docs/en/docs/admin-manual/maint-monitor/metadata-operation.md
@@ -269,6 +269,18 @@ curl -u $root_user:$password http://$master_hostname:8030/dump
```
3. Replace the image file in the `meta_dir/image` directory on the OBSERVER FE node with the image_mem file, restart the OBSERVER FE node, and verify the integrity and correctness of the image_mem file. You can check on the FE web page whether the DB and Table metadata are normal, check `fe.log` for exceptions, and check whether the journal is being replayed normally.

Since 1.2.0, it is recommended to use the following method to verify the `image_mem` file:

```
sh start_fe.sh --image path_to_image_mem
```
> Notice: `path_to_image_mem` is the path of the `image_mem` file.
>
> If verification succeeds, it will print: `Load image success. Image file /absolute/path/to/image.xxxxxx is valid`.
>
> If verification fails, it will print: `Load image failed. Image file /absolute/path/to/image.xxxxxx is invalid`.
4. Replace the image file in the `meta_dir/image` directory on each FOLLOWER FE node in turn with the image_mem file, restart the FOLLOWER FE node, and confirm that the metadata and query services are normal.
5. Replace the image file in the `meta_dir/image` directory on the Master FE node with the image_mem file, restart the Master FE node, and then confirm that the FE Master switch is normal and that the Master FE node can generate a new image file through checkpoint.
@@ -393,3 +405,4 @@ The deployment recommendation of FE is described in the Installation and [Deploy
```
This means that some transactions that have been persisted need to be rolled back, but the number of entries exceeds the upper limit. Here our default upper limit is 100, which can be changed by setting `txn_rollback_limit`. This operation is only used to attempt to start FE normally, but lost metadata cannot be recovered.
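If you need to raise this limit before retrying the start, a minimal fe.conf sketch (the value 200 is only an illustration):

```
# fe.conf: allow up to 200 persisted transactions to be rolled back at startup
txn_rollback_limit = 200
```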
42 changes: 41 additions & 1 deletion docs/en/docs/advanced/broker.md
@@ -26,7 +26,13 @@ under the License.

# Broker

Broker is an optional process in the Doris cluster. It is mainly used to support Doris in reading and writing files or directories on remote storage. Currently supported:

- Apache HDFS
- Aliyun OSS
- Tencent Cloud CHDFS
- Huawei Cloud OBS (since 1.2.0)
- Amazon S3

Broker provides services through an RPC service port. It is a stateless JVM process that is responsible for encapsulating POSIX-like file operations, such as open, pread, and pwrite, for reading and writing on remote storage.
In addition, the Broker does not record any other information, so connection information, file information, permission information, and so on for the remote storage need to be passed to the Broker process as parameters in the RPC call in order for the Broker to read and write files correctly.
Expand Down Expand Up @@ -194,3 +200,37 @@ Authentication information is usually provided as a Key-Value in the Property Ma
)
```
The configuration for accessing the HDFS cluster can be written to the hdfs-site.xml file. When users use the Broker process to read data from the HDFS cluster, they only need to fill in the cluster file path and authentication information.
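For example, with simple authentication, the property map can be reduced to a sketch like the following (key names follow the HDFS authentication properties used by Doris brokers; values are placeholders):

```
(
    "username" = "hdfs_user",
    "password" = ""
)
```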
#### Tencent Cloud CHDFS
Same as Apache HDFS
#### Aliyun OSS
```
(
"fs.oss.accessKeyId" = "",
"fs.oss.accessKeySecret" = "",
"fs.oss.endpoint" = ""
)
```
#### Huawei Cloud OBS
```
(
"fs.obs.access.key" = "xx",
"fs.obs.secret.key" = "xx",
"fs.obs.endpoint" = "xx"
)
```
#### Amazon S3
```
(
"fs.s3a.access.key" = "xx",
"fs.s3a.secret.key" = "xx",
"fs.s3a.endpoint" = "xx"
)
```
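As a usage sketch, these property maps are passed in the `WITH BROKER` clause of statements such as Broker Load (the label, path, table, and broker names below are placeholders):

```sql
LOAD LABEL example_db.label_s3
(
    DATA INFILE("s3a://my_bucket/path/to/file.csv")
    INTO TABLE my_table
)
WITH BROKER "broker_name"
(
    "fs.s3a.access.key" = "xx",
    "fs.s3a.secret.key" = "xx",
    "fs.s3a.endpoint" = "xx"
);
```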
@@ -37,8 +37,10 @@ This statement is used to undo an import job for the specified label. Or batch undo import jobs of the same database.
```sql
CANCEL LOAD
[FROM db_name]
WHERE [LABEL = "load_label" | LABEL like "label_pattern" | STATE = "PENDING/ETL/LOADING"];
```

Notice: Cancel by State is supported since 1.2.0.

### Example

@@ -58,6 +60,18 @@ WHERE [LABEL = "load_label" | LABEL like "label_pattern"];
WHERE LABEL like "example_";
```
<version since="1.2.0">
3. Cancel all import jobs whose state is "LOADING"
```sql
CANCEL LOAD
FROM example_db
WHERE STATE = "loading";
```

</version>

### Keywords

CANCEL, LOAD
Expand Down
@@ -48,54 +48,60 @@ illustrate:

file_path points to the path where the file is stored and the file prefix. Such as `hdfs://path/to/my_file_`.

The final filename will consist of `my_file_`, the file number and the file format suffix. The file serial number starts from 0, and the number is the number of files to be divided. Such as:
```
my_file_abcdefg_0.csv
my_file_abcdefg_1.csv
my_file_abcdefg_2.csv
```
2. format_as
```
FORMAT AS CSV
```
Specifies the export format. Supported formats include CSV, PARQUET, CSV_WITH_NAMES, CSV_WITH_NAMES_AND_TYPES and ORC. Default is CSV.
3. properties
Specify related properties. Currently, exporting via the Broker process or via the S3 protocol is supported.
```
grammar:
[PROPERTIES ("key"="value", ...)]

The following properties are supported:
column_separator: column separator. <version since="1.2.0">Support multi-bytes, such as: "\\x01", "abc"</version>
line_delimiter: line delimiter. <version since="1.2.0">Support multi-bytes, such as: "\\x01", "abc"</version>
max_file_size: the size limit of a single file. If the result exceeds this value, it will be cut into multiple files.

Broker related properties need to be prefixed with `broker.`:
broker.name: broker name
broker.hadoop.security.authentication: specify the authentication method as kerberos
broker.kerberos_principal: specifies the principal of kerberos
broker.kerberos_keytab: specifies the path to the keytab file of kerberos. The file must be an absolute path on the server where the broker process is located, and must be accessible by the broker process.

HDFS related properties:
fs.defaultFS: namenode address and port
hadoop.username: hdfs username
dfs.nameservices: if hadoop enables HA, please set the fs nameservice. See hdfs-site.xml
dfs.ha.namenodes.[nameservice ID]: unique identifiers for each NameNode in the nameservice. See hdfs-site.xml
dfs.namenode.rpc-address.[nameservice ID].[name node ID]: the fully-qualified RPC address for each NameNode to listen on. See hdfs-site.xml
dfs.client.failover.proxy.provider.[nameservice ID]: the Java class that HDFS clients use to contact the Active NameNode, usually org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

For a kerberos-authentication enabled Hadoop cluster, additional properties need to be set:
dfs.namenode.kerberos.principal: HDFS namenode service principal
hadoop.security.authentication: kerberos
hadoop.kerberos.principal: the Kerberos principal that Doris will use when connecting to HDFS.
hadoop.kerberos.keytab: HDFS client keytab location.

For the S3 protocol, you can directly execute the S3 protocol configuration:
AWS_ENDPOINT
AWS_ACCESS_KEY
AWS_SECRET_KEY
AWS_REGION
```
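As a usage sketch combining the above (the broker name, path, and table are placeholders):

```sql
SELECT * FROM example_tbl
INTO OUTFILE "hdfs://path/to/my_file_"
FORMAT AS CSV
PROPERTIES
(
    "broker.name" = "my_broker",
    "column_separator" = ",",
    "line_delimiter" = "\n",
    "max_file_size" = "100MB"
);
```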
### Example
@@ -0,0 +1,62 @@
---
{
"title": "ADMIN-CANCEL-REBALANCE-DISK",
"language": "en"
}
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

## ADMIN-CANCEL-REBALANCE-DISK

<version since="1.2.0">

### Name

ADMIN CANCEL REBALANCE DISK

### Description

This statement is used to cancel rebalancing the disks of the specified backends with high priority.

Grammar:

ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)];

Explain:

1. This statement only indicates that the system no longer rebalances the disks of the specified backends with high priority. The system will still rebalance disks via the default scheduling.

### Example

1. Cancel high priority disk rebalance of all backends in the cluster

ADMIN CANCEL REBALANCE DISK;

2. Cancel high priority disk rebalance of the specified backends

ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234");

### Keywords

ADMIN,CANCEL,REBALANCE DISK

### Best Practice

</version>
