
Commit 59816df

elek authored and ajayydv committed
HDDS-1462. Fix content and format of Ozone documentation. Contributed by Elek, Marton. (#767)
1 parent 5bca062 commit 59816df

17 files changed: +162 -88 lines changed

hadoop-hdds/docs/content/BucketCommands.md

Lines changed: 2 additions & 2 deletions
@@ -2,8 +2,8 @@
 title: Bucket Commands
 menu:
    main:
-      parent: Client
-      weight: 3
+      parent: OzoneShell
+      weight: 2
 ---
 <!---
   Licensed to the Apache Software Foundation (ASF) under one or more

hadoop-hdds/docs/content/BuildingSources.md

Lines changed: 20 additions & 8 deletions
@@ -35,20 +35,32 @@ the ozone build command. This instruction assumes that you have all the
 dependencies to build Hadoop on your build machine. If you need instructions
 on how to build Hadoop, please look at the Apache Hadoop Website.

-{{< highlight bash >}}
-mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Phdds -Pdist -Dtar -DskipShade
-{{< /highlight >}}
+```bash
+mvn -f pom.ozone.xml clean package -DskipTests=true
+```

-
-This will build an ozone-\<version\>.tar.gz in your target directory.
+This will build an ozone-\<version\>.tar.gz in your `hadoop-ozone/dist/target` directory.

 You can copy this tarball and use this instead of binary artifacts that are
 provided along with the official release.

 ## How to test the build
 You can run the acceptance tests in the hadoop-ozone directory to make sure
 that your build is functional. To launch the acceptance tests, please follow
-the instructions in the **README.md** in the
-```$hadoop_src/hadoop-ozone/acceptance-test``` directory. Acceptance tests
+the instructions in the **README.md** in the `smoketest` directory.
+
+```bash
+cd smoketest
+./test.sh
+```
+
+You can also execute only a minimal subset of the tests:
+
+```bash
+cd smoketest
+./test.sh --env ozone basic
+```
+
+Acceptance tests
 will start a small ozone cluster and verify that ozone shell and ozone file
-system is fully functional.
+system is fully functional.
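To illustrate the updated build steps above, here is a quick sketch (not taken from the commit itself) of locating and unpacking the built tarball; the exact version suffix depends on the branch being built:

```bash
# Build the Ozone distribution, then unpack the resulting tarball
mvn -f pom.ozone.xml clean package -DskipTests=true
cd hadoop-ozone/dist/target
tar xzf ozone-*.tar.gz   # version suffix depends on the branch you built
ls ozone-*/bin
```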

hadoop-hdds/docs/content/CommandShell.md

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ menu:
    main:
       parent: Client
       weight: 1
+      identifier: OzoneShell
 ---
 <!---
   Licensed to the Apache Software Foundation (ASF) under one or more

hadoop-hdds/docs/content/KeyCommands.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: Key Commands
 menu:
    main:
-      parent: Client
+      parent: OzoneShell
       weight: 3
 ---
 <!---

hadoop-hdds/docs/content/OzoneFS.md

Lines changed: 28 additions & 5 deletions
@@ -87,18 +87,41 @@ hdfs dfs -ls o3fs://bucket.volume.om-host.example.com:5678/key
 {{< /highlight >}}


-## Legacy mode
+## Supporting older Hadoop version (Legacy jar, BasicOzoneFilesystem)

-There are two ozonefs files which includes all the dependencies:
+There are two ozonefs files, both of them include all the dependencies:

  * share/ozone/lib/hadoop-ozone-filesystem-lib-current-VERSION.jar
  * share/ozone/lib/hadoop-ozone-filesystem-lib-legacy-VERSION.jar

-The first one contains all the required dependency to use ozonefs with a
-compatible hadoop version (hadoop 3.2 / 3.1).
+The first one contains all the required dependency to use ozonefs with a
+compatible hadoop version (hadoop 3.2).

-The second one contains all the dependency in an internal, separated directory,
+The second one contains all the dependency in an internal, separated directory,
 and a special class loader is used to load all the classes from the location.

+With this method the hadoop-ozone-filesystem-lib-legacy.jar can be used from
+any older hadoop version (eg. hadoop 3.1, hadoop 2.7 or spark+hadoop 2.7)
+
+Similar to the dependency jar, there are two OzoneFileSystem implementation.
+
+For hadoop 3.0 and newer, you can use `org.apache.hadoop.fs.ozone.OzoneFileSystem`
+which is a full implementation of the Hadoop compatible File System API.
+
+For Hadoop 2.x you should use the Basic version: `org.apache.hadoop.fs.ozone.BasicOzoneFileSystem`.
+
+This is the same implementation but doesn't include the features/dependencies which are added with
+Hadoop 3.0. (eg. FS statistics, encryption zones).
+
+### Summary
+
+The following table summarize which jar files and implementation should be used:
+
+Hadoop version | Required jar            | OzoneFileSystem implementation
+---------------|-------------------------|----------------------------------------------------
+3.2            | filesystem-lib-current  | org.apache.hadoop.fs.ozone.OzoneFileSystem
+3.1            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.OzoneFileSystem
+2.9            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem
+2.7            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem
 With this method the hadoop-ozone-filesystem-lib-legacy.jar can be used from
 any older hadoop version (eg. hadoop 2.7 or spark+hadoop 2.7)
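As an illustration of the compatibility table above, here is a sketch (not part of the commit) of wiring the legacy jar into a Hadoop 2.x client. The install path `/opt/ozone` and the explicit `fs.o3fs.impl` override are assumptions; the host and port mirror the o3fs example earlier in this page:

```bash
# Put the self-contained legacy jar on the Hadoop 2.x client classpath
# (install path is an assumption; adjust to your distribution location)
export HADOOP_CLASSPATH=/opt/ozone/share/ozone/lib/hadoop-ozone-filesystem-lib-legacy-*.jar

# Point the o3fs:// scheme at the Basic implementation and list a bucket
hdfs dfs -Dfs.o3fs.impl=org.apache.hadoop.fs.ozone.BasicOzoneFileSystem \
  -ls o3fs://bucket.volume.om-host.example.com:5678/
```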

hadoop-hdds/docs/content/OzoneSecurityArchitecture.md

Lines changed: 31 additions & 12 deletions
@@ -32,60 +32,79 @@ Starting with badlands release (ozone-0.4.0-alpha) ozone cluster can be secured
 4. Transparent Data Encryption (TDE)

 ## Authentication ##
+
 ### Kerberos ###
 Similar to hadoop, Ozone allows kerberos-based authentication. So one way to setup identities for all the daemons and clients is to create kerberos keytabs and configure it like any other service in hadoop.

 ### Tokens ###
 Tokens are widely used in Hadoop to achieve lightweight authentication without compromising on security. Main motivation for using tokens inside Ozone is to prevent the unauthorized access while keeping the protocol lightweight and without sharing secret over the wire. Ozone utilizes three types of token:

 #### Delegation token ####
+
 Once client establishes their identity via kerberos they can request a delegation token from OzoneManager. This token can be used by a client to prove its identity until the token expires. Like Hadoop delegation tokens, an Ozone delegation token has 3 important fields:

-Renewer: User responsible for renewing the token.
-Issue date: Time at which token was issued.
-Max date: Time after which token can’t be renewed.
+1. **Renewer**: User responsible for renewing the token.
+2. **Issue date**: Time at which token was issued.
+3. **Max date**: Time after which token can’t be renewed.

 Token operations like get, renew and cancel can only be performed over an Kerberos authenticated connection. Clients can use delegation token to establish connection with OzoneManager and perform any file system/object store related operations like, listing the objects in a bucket or creating a volume etc.

 #### Block Tokens ####
-Block tokens are similar to delegation tokens in sense that they are signed by OzoneManager. Block tokens are created by OM (OzoneManager) when a client request involves interaction with DataNodes such as read/write Ozone keys. Unlike delegation tokens there is no client API to request block tokens. Instead, they are handed transparently to client along with key/block locations. Block tokens are validated by Datanodes when receiving read/write requests from clients. Block token can't be renewed explicitly by client. Client with expired block token will need to refetch the key/block locations to get new block tokens.
+
+Block tokens are similar to delegation tokens in sense that they are signed by OzoneManager. Block tokens are created by OM (OzoneManager) when a client request involves interaction with DataNodes such as read/write Ozone keys.
+
+Unlike delegation tokens there is no client API to request block tokens. Instead, they are handed transparently to client along with key/block locations. Block tokens are validated by Datanodes when receiving read/write requests from clients. Block token can't be renewed explicitly by client. Client with expired block token will need to refetch the key/block locations to get new block tokens.
+
 #### S3Token ####
+
 Like block tokens S3Tokens are handled transparently for clients. It is signed by S3secret created by client. S3Gateway creates this token for every s3 client request. To create an S3Token user must have a S3 secret.

 ### Certificates ###
 Apart from kerberos and tokens Ozone utilizes certificate based authentication for Ozone service components. To enable this, SCM (StorageContainerManager) bootstraps itself as an Certificate Authority when security is enabled. This allows all daemons inside Ozone to have an SCM signed certificate. Below is brief descriptions of steps involved:
-Datanodes and OzoneManagers submits a CSR (certificate signing request) to SCM.
-SCM verifies identity of DN (Datanode) or OM via Kerberos and generates a certificate.
-This certificate is used by OM and DN to prove their identities.
-Datanodes use OzoneManager certificate to validate block tokens. This is possible because both of them trust SCM signed certificates. (i.e OzoneManager and Datanodes)
+
+1. Datanodes and OzoneManagers submits a CSR (certificate signing request) to SCM.
+2. SCM verifies identity of DN (Datanode) or OM via Kerberos and generates a certificate.
+3. This certificate is used by OM and DN to prove their identities.
+4. Datanodes use OzoneManager certificate to validate block tokens. This is possible because both of them trust SCM signed certificates. (i.e OzoneManager and Datanodes)

 ## Authorization ##
-Ozone provides a pluggable API to control authorization of all client related operations. Default implementation allows every request. Clearly it is not meant for production environments. To configure a more fine grained policy one may configure Ranger plugin for Ozone. Since it is a pluggable module clients can also implement their own custom authorization policy and configure it using [ozone.acl.authorizer.class].
+Ozone provides a pluggable API to control authorization of all client related operations. Default implementation allows every request. Clearly it is not meant for production environments. To configure a more fine grained policy one may configure Ranger plugin for Ozone. Since it is a pluggable module clients can also implement their own custom authorization policy and configure it using `ozone.acl.authorizer.class`.

 ## Audit ##
+
 Ozone provides ability to audit all read & write operations to OM, SCM and Datanodes. Ozone audit leverages the Marker feature which enables user to selectively audit only READ or WRITE operations by a simple config change without restarting the service(s).
+
 To enable/disable audit of READ operations, set filter.read.onMatch to NEUTRAL or DENY respectively. Similarly, the audit of WRITE operations can be controlled using filter.write.onMatch.

 Generating audit logs is only half the job, so Ozone also provides AuditParser - a sqllite based command line utility to parse/query audit logs with predefined templates(ex. Top 5 commands) and options for custom query. Once the log file has been loaded to AuditParser, one can simply run a template as shown below:
 ozone auditparser <path to db file> template top5cmds

 Similarly, users can also execute custom query using:
+
+```bash
 ozone auditparser <path to db file> query "select * from audit where level=='FATAL'"
+```

 ## Transparent Data Encryption ##
+
 Ozone TDE setup process and usage are very similar to HDFS TDE. The major difference is that Ozone TDE is enabled at Ozone bucket level when a bucket is created.

 To create an encrypted bucket, client need to

 * Create a bucket encryption key with hadoop key CLI (same as you do for HDFS encryption zone key)
-```
+
+```bash
 hadoop key create key1
 ```
+
 * Create an encrypted bucket with -k option
-```
+
+```bash
 ozone sh bucket create -k key1 /vol1/ez1
 ```
+
 After that the usage will be transparent to the client and end users, i.e., all data written to encrypted bucket are encrypted at datanodes.

-To know more about how to setup a secure Ozone cluster refer to [How to setup secure Ozone cluster]("SetupSecureOzone.md")
+To know more about how to setup a secure Ozone cluster refer to [How to setup secure Ozone cluster]({{< ref "SetupSecureOzone.md" >}})
+
 Ozone [security architecture document](https://issues.apache.org/jira/secure/attachment/12911638/HadoopStorageLayerSecurity.pdf) can be referred for a deeper dive into Ozone Security architecture.
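To tie the TDE steps above together, here is a sketch (not part of the commit) of the end-to-end flow, assuming a configured Hadoop KMS and an existing volume `/vol1`:

```bash
# 1. Create a bucket encryption key in the (already configured) Hadoop KMS
hadoop key create key1

# 2. Create an encrypted bucket that uses this key
ozone sh bucket create -k key1 /vol1/ez1

# 3. Read and write keys as usual; data is encrypted transparently on the datanodes
ozone sh key put /vol1/ez1/secret.txt ./secret.txt
ozone sh key get /vol1/ez1/secret.txt /tmp/secret.copy
```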

hadoop-hdds/docs/content/Prometheus.md

Lines changed: 3 additions & 3 deletions
@@ -21,7 +21,7 @@ menu:
 limitations under the License.
 -->

-[Prometheus](https://prometheus.io/) is an open-source monitoring server developed under under the [Cloud Native Foundation](Cloud Native Foundation).
+[Prometheus](https://prometheus.io/) is an open-source monitoring server developed under the [Cloud Native Computing Foundation](https://www.cncf.io/).

 Ozone supports Prometheus out of the box. The servers start a prometheus
 compatible metrics endpoint where all the available hadoop metrics are published in prometheus exporter format.
@@ -75,14 +75,14 @@ prometheus

 http://localhost:9090/targets

-![Prometheus target page example](../../prometheus.png)
+![Prometheus target page example](prometheus.png)


 (6) Check any metrics on the prometheus web ui. For example:

 http://localhost:9090/graph?g0.range_input=1h&g0.expr=om_metrics_num_key_allocate&g0.tab=1

-![Prometheus target page example](../../prometheus-key-allocate.png)
+![Prometheus target page example](prometheus-key-allocate.png)

 ## Note

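As a quick check of the metrics endpoint described above, a sketch (not part of the commit) that assumes the default OzoneManager/SCM HTTP ports and the `/prom` servlet path:

```bash
# Fetch a few metrics straight from the Prometheus-compatible endpoints
curl -s http://localhost:9874/prom | head   # OzoneManager (assumed default port)
curl -s http://localhost:9876/prom | head   # StorageContainerManager (assumed default port)
```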

hadoop-hdds/docs/content/RunningViaDocker.md

Lines changed: 4 additions & 4 deletions
@@ -44,16 +44,16 @@ including the data nodes and ozone services.
 ozone instance on your machine.

 {{< highlight bash >}}
-cd ozone-0.2.1-SNAPSHOT/compose/ozone/
+cd compose/ozone/

 docker-compose up -d
 {{< /highlight >}}

-
 To verify that ozone is working as expected, let us log into a data node and
 run _freon_, the load generator for Ozone. The ```exec datanode bash``` command
-will open a bash shell on the datanode. The ozone freon command is executed
-within the datanode container. You can quit freon via CTRL-C any time. The
+will open a bash shell on the datanode.
+
+The `ozone freon` command is executed within the datanode container. You can quit freon via CTRL-C any time. The
 ```rk``` profile instructs freon to generate random keys.

 {{< highlight bash >}}
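Beyond the diff shown above, a sketch of growing and tearing down the same compose cluster; the `datanode` service name is assumed from the bundled docker-compose file:

```bash
cd compose/ozone/

# Add more datanodes to the running test cluster (service name assumed)
docker-compose scale datanode=3

# Stop and remove the whole test cluster when finished
docker-compose down
```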

hadoop-hdds/docs/content/S3.md

Lines changed: 24 additions & 3 deletions
@@ -83,16 +83,37 @@ Endpoint | Status | Notes
 ------------------------------------|-----------------|---------------
 PUT Object | implemented |
 GET Object | implemented | Range headers are not supported
-Multipart Uplad | not implemented |
+Multipart Upload | implemented | Except the listing of the current MultiPartUploads.
 DELETE Object | implemented |
 HEAD Object | implemented |


 ## Security

-Security is not yet implemented, you can *use* any AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
+If security is not enabled, you can *use* **any** AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

-Note: Ozone has a notion for 'volumes' which is missing from the S3 Rest endpoint. Under the hood S3 bucket names are mapped to Ozone 'volume/bucket' locations (depending on the given authentication information).
+If security is enabled, you can get the key and the secret with the `ozone s3 getsecret` command (*kerberos based authentication is required).
+
+```
+/etc/security/keytabs/testuser.keytab testuser/scm@EXAMPLE.COM
+ozone s3 getsecret
+awsAccessKey=testuser/scm@EXAMPLE.COM
+awsSecret=c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999
+
+```
+
+Now, you can use the key and the secret to access the S3 endpoint:
+
+```
+export AWS_ACCESS_KEY_ID=testuser/scm@EXAMPLE.COM
+export AWS_SECRET_ACCESS_KEY=c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999
+aws s3api --endpoint http://localhost:9878 create-bucket --bucket bucket1
+```
+
+
+## S3 bucket name mapping to Ozone buckets
+
+**Note**: Ozone has a notion for 'volumes' which is missing from the S3 Rest endpoint. Under the hood S3 bucket names are mapped to Ozone 'volume/bucket' locations (depending on the given authentication information).

 To show the storage location of a S3 bucket, use the `ozone s3 path <bucketname>` command.

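Continuing the example above, a sketch (not part of the commit) of using the same endpoint and credentials for object operations with the aws CLI:

```bash
export AWS_ACCESS_KEY_ID=testuser/scm@EXAMPLE.COM
export AWS_SECRET_ACCESS_KEY=c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999

# Upload a local file and list the bucket through the Ozone S3 gateway
aws s3api --endpoint http://localhost:9878 put-object \
  --bucket bucket1 --key key1 --body ./README.md
aws s3api --endpoint http://localhost:9878 list-objects --bucket bucket1
```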

hadoop-hdds/docs/content/S3Commands.md

Lines changed: 0 additions & 41 deletions
This file was deleted.
