
Commit dfce782

Felix Hennig (fhennig) committed
docs: split up usage page and improve landing page (#344)
*Please add a description here. This will become the commit message of the merge request later.* Co-authored-by: Felix Hennig <fhennig@users.noreply.github.com>
1 parent 1f9c49f commit dfce782

File tree: 14 files changed, +211 −237 lines changed


docs/modules/hbase/images/hbase_overview.drawio.svg

Lines changed: 4 additions & 0 deletions

docs/modules/hbase/pages/getting_started/first_steps.adoc

Lines changed: 1 addition & 1 deletion
@@ -200,4 +200,4 @@ This is because Phoenix requires these `SYSTEM.` tables for its own internal map

 == What's next

-Look at the xref:usage.adoc[Usage page] to find out more about configuring your HBase cluster.
+Look at the xref:usage-guide/index.adoc[] to find out more about configuring your HBase cluster.

docs/modules/hbase/pages/index.adoc

Lines changed: 32 additions & 11 deletions
@@ -1,20 +1,41 @@
 = Stackable Operator for Apache HBase
+:description: The Stackable Operator for Apache HBase is a Kubernetes operator that can manage Apache HBase clusters. Learn about its features, resources, dependencies, and demos, and see the list of supported HBase versions.
+:keywords: Stackable Operator, Apache HBase, Kubernetes, operator, engineer, CRD, StatefulSet, ConfigMap, Service, ZooKeeper, HDFS

-This is an operator for Kubernetes that can manage https://hbase.apache.org/[Apache HBase]
-clusters.
+This is an Operator for Kubernetes that manages https://hbase.apache.org/[Apache HBase] clusters.
+Apache HBase is an open-source, distributed, non-relational database that runs on top of the Hadoop Distributed File System (HDFS).

-WARNING: This operator is part of the Stackable Data Platform and only works with images from the
-https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhbase[Stackable] repository.
+== Getting started
+
+Follow the xref:getting_started/index.adoc[] guide to learn how to xref:getting_started/installation.adoc[install] the Stackable Operator for Apache HBase as well as its dependencies. The guide also shows you how to xref:getting_started/first_steps.adoc[interact] with HBase running on Kubernetes by creating tables and inserting some data using the REST API or Apache Phoenix.
+
+The xref:usage-guide/index.adoc[] contains more information on xref:usage-guide/phoenix.adoc[] as well as other topics such as xref:usage-guide/resource-requests.adoc[CPU and memory configuration], xref:usage-guide/monitoring.adoc[] and xref:usage-guide/logging.adoc[].
+
+== Operator model
+
+The Operator manages the _HbaseCluster_ custom resource. You configure your HBase instance using this resource, and the Operator creates Kubernetes resources such as StatefulSets, ConfigMaps and Services accordingly.
+
+HBase uses three xref:concepts:roles-and-role-groups.adoc[roles]: `masters`, `regionServers` and `restServers`.
+
+image::hbase_overview.drawio.svg[A diagram depicting the Kubernetes resources created by the operator]
+
+For every RoleGroup a **StatefulSet** is created. Each StatefulSet can contain multiple replicas (Pods).
+For every Role and RoleGroup the Operator creates a **Service**, as well as one Service for the whole cluster that references the `regionServers`.
+
+A **ConfigMap** is created for each RoleGroup, containing three files: the `hbase-env.sh` and `hbase-site.xml` files generated from the HbaseCluster configuration (see xref:usage-guide/index.adoc[] for more information), plus a `log4j.properties` file used for xref:usage-guide/logging.adoc[].
+The Operator also creates a **xref:usage-guide/discovery.adoc[discovery ConfigMap]** for the whole HbaseCluster, which contains information on how to connect to the HBase cluster.
+
+== Dependencies
+
+A distributed Apache HBase installation depends on a running Apache ZooKeeper and HDFS cluster. See the documentation for the xref:hdfs:index.adoc[Stackable Operator for Apache HDFS] to learn how to set up these clusters.
+
+== Demo
+
+The xref:stackablectl::demos/hbase-hdfs-load-cycling-data.adoc[] demo shows how you can use HBase together with HDFS.

 == Supported Versions

 The Stackable Operator for Apache HBase currently supports the following versions of Apache HBase:

 include::partial$supported-versions.adoc[]
-
-== Getting the Docker image
-
-[source]
-----
-docker pull docker.stackable.tech/stackable/hbase:<version>
-----
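The new operator-model section above describes the HbaseCluster resource only in prose. A minimal HbaseCluster might look like the sketch below; the role names and the two ConfigMap references mirror the discovery example further down, while the API version, resource names and replica counts are assumptions:

[source,yaml]
----
apiVersion: hbase.stackable.tech/v1alpha1  # assumed API group/version
kind: HbaseCluster
metadata:
  name: simple-hbase  # also becomes the name of the discovery ConfigMap
spec:
  hdfsConfigMapName: simple-hdfs        # discovery ConfigMap of the HDFS cluster
  zookeeperConfigMapName: simple-znode  # discovery ConfigMap of the ZooKeeper znode
  masters:
    roleGroups:
      default:
        replicas: 1
  regionServers:
    roleGroups:
      default:
        replicas: 2
  restServers:
    roleGroups:
      default:
        replicas: 1
----

From this single resource the Operator derives the StatefulSets, Services and ConfigMaps described above, one set per role group.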
docs/modules/hbase/pages/cluster_operations.adoc renamed to docs/modules/hbase/pages/usage-guide/cluster-operations.adoc

Lines changed: 3 additions & 3 deletions

@@ -1,4 +1,4 @@
+= Cluster operation
+:page-aliases: cluster_operations.adoc

-= Cluster Operation
-
-HBase installations can be configured with different cluster operations like pausing reconciliation or stopping the cluster. See xref:concepts:cluster_operations.adoc[cluster operations] for more details.
+HBase installations can be configured with different cluster operations like pausing reconciliation or stopping the cluster. See xref:concepts:cluster_operations.adoc[cluster operations] for more details.

docs/modules/hbase/pages/discovery.adoc renamed to docs/modules/hbase/pages/usage-guide/discovery.adoc

Lines changed: 8 additions & 7 deletions
@@ -2,10 +2,11 @@
 :namespace: \{namespace\}
 :hdfs-cluster-name: \{hdfs-cluster-name\}
 :zookeeper-znode-name: \{zookeeper-znode-name\}
+:page-aliases: discovery.adoc

 = Discovery

-The Stackable Operator for Apache HBase publishes a discovery https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#configmap-v1-core[`ConfigMap`], which exposes a client configuration bundle that allows access to the Apache HBase cluster.
+The Stackable Operator for Apache HBase publishes a discovery https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#configmap-v1-core[ConfigMap], which exposes a client configuration bundle that allows access to the Apache HBase cluster.

 == Example

@@ -23,16 +24,16 @@ spec:
   hdfsConfigMapName: {hdfs-cluster-name} #<3>
   zookeeperConfigMapName: {zookeeper-znode-name} #<4>
 ----
-<1> The name of the HBase cluster, which is also the name of the created discovery `ConfigMap`.
-<2> The namespace of the discovery `ConfigMap`.
-<3> The `ConfigMap` name to discover the HDFS cluster.
-<4> The `ConfigMap` name to discover the ZooKeeper cluster.
+<1> The name of the HBase cluster, which is also the name of the created discovery ConfigMap.
+<2> The namespace of the discovery ConfigMap.
+<3> The ConfigMap name to discover the HDFS cluster.
+<4> The ConfigMap name to discover the ZooKeeper cluster.

-The resulting discovery `ConfigMap` is located at `{namespace}/{cluster-name}`.
+The resulting discovery ConfigMap is located at `{namespace}/{cluster-name}`.

 == Contents

-The `ConfigMap` data values are formatted as Hadoop XML files which allows simple mounting of that `ConfigMap` into pods that require access to HBase.
+The ConfigMap data values are formatted as Hadoop XML files, which allows simple mounting of that ConfigMap into pods that require access to HBase.

 `hbase-site.xml`::
   Contains the `hbase.zookeeper.quorum` property.
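Since the data values are plain Hadoop XML, the discovery ConfigMap can be mounted directly into a client Pod and used as the HBase configuration directory. A minimal sketch, assuming an HbaseCluster named `simple-hbase`; the Pod name, mount path and image tag are placeholders:

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: hbase-client  # hypothetical client Pod
spec:
  containers:
    - name: client
      image: docker.stackable.tech/stackable/hbase:<version>  # any image with HBase client libraries
      env:
        - name: HBASE_CONF_DIR  # HBase clients pick up hbase-site.xml from here
          value: /stackable/conf/hbase
      volumeMounts:
        - name: hbase-config
          mountPath: /stackable/conf/hbase
  volumes:
    - name: hbase-config
      configMap:
        name: simple-hbase  # the discovery ConfigMap, named after the cluster
----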
docs/modules/hbase/pages/usage-guide/index.adoc

Lines changed: 9 additions & 0 deletions

@@ -0,0 +1,9 @@
+= Usage guide
+
+Learn about xref:usage-guide/cluster-operations.adoc[starting, stopping and pausing] your cluster.
+
+Learn about xref:usage-guide/pod-placement.adoc[configuring where Pods are scheduled] and xref:usage-guide/resource-requests.adoc[how many CPU and memory resources] your Pods consume.
+
+You can observe what's happening with your HBase cluster using xref:usage-guide/logging.adoc[logging] and xref:usage-guide/monitoring.adoc[monitoring].
+
+Connect to HBase using xref:usage-guide/phoenix.adoc[Apache Phoenix], or use the xref:usage-guide/discovery.adoc[discovery ConfigMap] to connect other products.
docs/modules/hbase/pages/usage-guide/logging.adoc

Lines changed: 26 additions & 0 deletions

@@ -0,0 +1,26 @@
+= Log aggregation
+
+The logs can be forwarded to a Vector log aggregator by providing a discovery
+ConfigMap for the aggregator and by enabling the log agent:
+
+[source,yaml]
+----
+spec:
+  clusterConfig:
+    vectorAggregatorConfigMapName: vector-aggregator-discovery
+  masters:
+    config:
+      logging:
+        enableVectorAgent: true
+  regionServers:
+    config:
+      logging:
+        enableVectorAgent: true
+  restServers:
+    config:
+      logging:
+        enableVectorAgent: true
+----
+
+Further information on how to configure logging can be found in
+xref:home:concepts:logging.adoc[].
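The `vectorAggregatorConfigMapName` above points to a discovery ConfigMap for the Vector aggregator. Such a ConfigMap might look like the sketch below; the `ADDRESS` key and the endpoint value are assumptions to verify against xref:home:concepts:logging.adoc[]:

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-aggregator-discovery
data:
  ADDRESS: vector-aggregator:6000  # assumed host:port of the aggregator's Vector source
----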
docs/modules/hbase/pages/usage-guide/monitoring.adoc

Lines changed: 4 additions & 0 deletions

@@ -0,0 +1,4 @@
+= Monitoring
+
+The managed HBase instances are automatically configured to export Prometheus metrics. See
+xref:home:operators:monitoring.adoc[] for more details.
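As a sketch of how those metrics could be collected, the following Prometheus scrape job keeps every Service that carries the conventional `prometheus.io/scrape: "true"` annotation; whether the operator sets exactly this annotation is an assumption to verify against xref:home:operators:monitoring.adoc[]:

[source,yaml]
----
scrape_configs:
  - job_name: stackable-hbase
    kubernetes_sd_configs:
      - role: service  # discover Services via the Kubernetes API
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep  # scrape only annotated Services
        regex: "true"
----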
docs/modules/hbase/pages/usage-guide/configuration-environment-overrides.adoc

Lines changed: 56 additions & 0 deletions

@@ -0,0 +1,56 @@
+= Configuration overrides
+
+The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).
+
+IMPORTANT: Overriding certain properties which are set by the operator can interfere with the operator and lead to problems.
+
+== Configuration properties
+
+For a role or role group, at the same level as `config`, you can specify `configOverrides` for the following files:
+
+- `hbase-site.xml`
+- `hbase-env.sh`
+
+NOTE: `hdfs-site.xml` is not listed here; the file is always taken from the referenced HDFS cluster. If you want to modify it, have a look at xref:hdfs:usage-guide/configuration-environment-overrides.adoc[HDFS configuration overrides].
+
+For example, if you want to set `hbase.rest.threads.min` to 4 and `HBASE_HEAPSIZE` to 2 GB, adapt the `restServers` section of the cluster resource like so:
+
+[source,yaml]
+----
+restServers:
+  roleGroups:
+    default:
+      config: {}
+      configOverrides:
+        hbase-site.xml:
+          hbase.rest.threads.min: "4"
+        hbase-env.sh:
+          HBASE_HEAPSIZE: "2G"
+      replicas: 1
+----
+
+Just as for the `config`, it is possible to specify this at role level as well:
+
+[source,yaml]
+----
+restServers:
+  configOverrides:
+    hbase-site.xml:
+      hbase.rest.threads.min: "4"
+    hbase-env.sh:
+      HBASE_HEAPSIZE: "2G"
+  roleGroups:
+    default:
+      config: {}
+      replicas: 1
+----
+
+All override property values must be strings. The properties are formatted and escaped correctly for the XML file, or inserted as-is into the `hbase-env.sh` file.
+
+For a full list of configuration options, we refer to the HBase https://hbase.apache.org/book.html#config.files[configuration documentation].
+
+// Environment configuration is not implemented. The environment is managed
+// with the hbase-env.sh configuration file
+
+// CLI overrides are also not implemented
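To illustrate the formatting rule above: a string override such as `hbase.rest.threads.min: "4"` would be rendered into the generated `hbase-site.xml` in the standard Hadoop property format, roughly like this (a sketch; the surrounding file content is operator-generated):

[source,xml]
----
<property>
  <name>hbase.rest.threads.min</name>
  <value>4</value>
</property>
----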
docs/modules/hbase/pages/usage-guide/phoenix.adoc

Lines changed: 33 additions & 0 deletions

@@ -0,0 +1,33 @@
+= Using Apache Phoenix
+
+The Apache Phoenix project provides the ability to interact with HBase over JDBC using familiar SQL syntax. The Phoenix dependencies are bundled with the Stackable HBase image and do not need to be installed separately (client components will need to ensure that they have the correct client-side libraries available). Information about client-side installation can be found https://phoenix.apache.org/installation.html[here].
+
+Phoenix comes bundled with a few simple scripts to verify a correct server-side installation. For example, assuming that the Phoenix dependencies have been installed to their default location of `/stackable/phoenix/bin`, we can issue the following using the supplied `psql.py` script:
+
+[source,shell]
+----
+/stackable/phoenix/bin/psql.py \
+  /stackable/phoenix/examples/WEB_STAT.sql \
+  /stackable/phoenix/examples/WEB_STAT.csv \
+  /stackable/phoenix/examples/WEB_STAT_QUERIES.sql
+----
+
+This script builds a Java command that creates, populates and queries a Phoenix table called `WEB_STAT`. Alternatively, one can use the `sqlline.py` script (which wraps the https://github.com/julianhyde/sqlline[sqlline] utility):
+
+[source,shell]
+----
+/stackable/phoenix/bin/sqlline.py [zookeeper] [sql file]
+----
+
+The script opens an SQL prompt from which one can list, query, create and generally interact with Phoenix tables. So, to query the table that was created in the previous step, start the script and enter some SQL at the prompt:
+
+image::phoenix_sqlline.png[Phoenix Sqlline]
+
+The Phoenix table `WEB_STAT` is created as an HBase table, and can be viewed normally from within the HBase UI:
+
+image::phoenix_tables.png[Phoenix Tables]
+
+The `SYSTEM.*` tables are those required by Phoenix and are created the first time that Phoenix is invoked.
+
+NOTE: Both `psql.py` and `sqlline.py` generate a Java command that calls classes from the Phoenix client library `.jar`. The ZooKeeper quorum does not need to be supplied as part of the JDBC connection string, as long as the environment variable `HBASE_CONF_DIR` is set and supplied as an element of the `-cp` classpath search: the cluster information is then extracted from `$HBASE_CONF_DIR/hbase-site.xml`.

docs/modules/hbase/pages/pod_placement.adoc renamed to docs/modules/hbase/pages/usage-guide/pod-placement.adoc

Lines changed: 3 additions & 2 deletions
@@ -1,4 +1,5 @@
-= Pod Placement
+= Pod placement
+:page-aliases: pod_placement.adoc

 You can configure Pod placement for HDFS nodes as described in xref:concepts:pod_placement.adoc[].

@@ -92,5 +93,5 @@ affinity:

 In the examples above `cluster-name` is the name of the HBase custom resource that owns this Pod. The `hdfs-cluster-name` is the name of the HDFS cluster that was configured in the `hdfsConfigMapName` property.

-NOTE: It is important that the `hdfsConfigMapName` property contains the name the HDFS cluster. You could instead configure `ConfigMap`s of specific name or data roles, but for the purpose of pod placement, this will lead to faulty behavior.
+NOTE: It is important that the `hdfsConfigMapName` property contains the name of the HDFS cluster. You could instead configure the ConfigMaps of specific namenode or datanode roles, but for the purpose of pod placement this will lead to faulty behavior.
docs/modules/hbase/pages/usage-guide/resource-requests.adoc

Lines changed: 23 additions & 0 deletions

@@ -0,0 +1,23 @@
+= Resource requests
+
+include::home:concepts:stackable_resource_requests.adoc[]
+
+If no resources are configured explicitly, the HBase operator uses the following defaults:
+
+[source,yaml]
+----
+regionServers:
+  roleGroups:
+    default:
+      config:
+        resources:
+          cpu:
+            min: '200m'
+            max: '4'
+          memory:
+            limit: '2Gi'
+----
+
+WARNING: The default values are _most likely_ not sufficient to run a proper cluster in production. Please adapt them to your requirements.
+
+For more details regarding Kubernetes CPU limits, see https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/[Assign CPU Resources to Containers and Pods].
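To check which values were actually applied, one can inspect a running Pod; a quick sketch, where the Pod name is hypothetical and follows the usual `<cluster>-<role>-<rolegroup>-<n>` pattern:

[source,shell]
----
# Print the resource requests and limits of a regionServer Pod:
kubectl get pod simple-hbase-regionserver-default-0 \
  -o jsonpath='{.spec.containers[*].resources}'
----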

0 commit comments
