Skip to content

Commit

Permalink
Multisite / multipart cluster - update documentation
Browse files Browse the repository at this point in the history
  • Loading branch information
romain-bellanger committed Aug 16, 2020
1 parent f803dac commit b839001
Show file tree
Hide file tree
Showing 2 changed files with 324 additions and 1 deletion.
51 changes: 50 additions & 1 deletion docs/Examples.md
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,55 @@ spec:
EOF
```

If different specs are needed for different parts of the indexer cluster, it is possible
to build an indexer cluster from multiple IndexerCluster resources referencing the same
cluster-master using the `indexerClusterRef` parameter.
For instance to define a size or StorageClass for the PersistentVolumes of the cluster-master
different from the indexers:

```yaml
cat <<EOF | kubectl apply -f -
apiVersion: enterprise.splunk.com/v1alpha2
kind: IndexerCluster
metadata:
name: example
finalizers:
- enterprise.splunk.com/delete-pvc
spec:
# Only create a cluster-master
replicas: 0
storageClassName: standard
varStorage: "4Gi"
---
apiVersion: enterprise.splunk.com/v1alpha2
kind: IndexerCluster
metadata:
name: example-part1
finalizers:
- enterprise.splunk.com/delete-pvc
spec:
# No cluster-master created, uses the referenced one
indexerClusterRef:
name: example
replicas: 3
storageClassName: local
varStorage: "128Gi"
EOF
```

When binding multiple IndexerCluster resources together this way, it is the part containing
the cluster-master which controls the [applications loaded](#installing-splunk-apps) to all
the parts of the indexer cluster, and the indexer services that it creates select the indexers
deployed by all the IndexerCluster parts, while the indexer services created by each other
part only select the indexers that it manages.

Separating the cluster-master from the indexers can also allow to better control
the upgrade cycle to respect the recommended order: cluster-master, then search-heads,
then indexers, by defining and updating the docker image used by each IndexerCluster part.

This solution can also be used to build a [multisite cluster](MultisiteExamples.md)
by defining a different zone affinity and site in each child IndexerCluster resource.

### Search Head Clusters

To scale search performance and provide high availability, customers will
Expand Down Expand Up @@ -657,4 +706,4 @@ spec:
secret:
secretName: splunk-cluster-master
defaultsUrl: /mnt/cluster-master/default.yml
```
```
274 changes: 274 additions & 0 deletions docs/MultisiteExamples.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,274 @@
# Configuring Splunk Enterprise Multisite Deployments

This document provides examples to configure a multisite cluster using the splunk-operator.

* [Multisite Indexer Clusters in Kubernetes](#multisite-indexer-clusters-in-kubernetes)
* [Multipart IndexerCluster](#multipart-indexercluster)
* [Single IndexerCluster resource and Downward API](#single-indexercluster-resource-and-downward-api)
* [Connecting a search-head cluster to a multisite indexer-cluster](#connecting-a-search-head-cluster-to-a-multisite-indexer-cluster)

Please refer to the [Configuring Splunk Enterprise Deployments Guide](Example.md)
for more information and examples about deploying the different Splunk resources
in single site mode.

## Multisite Indexer Clusters in Kubernetes

[Multisite indexer cluster architecture](https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Multisitearchitecture)
can be used for various purposes. However, Kubernetes clusters are generally deployed at the scale of
a region, and the main purpose of this documentation is to cover the topic of building high-available
indexer clusters allowing to control the placement of bucket replicas across multiple
[availability zones](https://kubernetes.io/docs/setup/best-practices/multiple-zones/#introduction),
to prevent the failure of a single zone from causing events to be unavailable for search or permanently lost.

Designing applications requiring high-availability to support the loss of a zone is a recommendation
from all cloud providers (e.g. [GCP](https://cloud.google.com/solutions/scalable-and-resilient-apps),
[AWS](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/#Availability_Zones),
[Azure](https://docs.microsoft.com/en-us/azure/availability-zones/az-overview#availability-zones)).
In a private datacenter, multisite indexer clusters can be used to support the loss of a room or rack.
In the case of dedicated hardware with local storage used for Splunk (or various datastores),
a multisite indexer cluster allows to support regular maintenance (e.g. OS upgrades) in the case of
multiple indexer pods scheduled on the same host.

## Multipart IndexerCluster

Description: connect multiple IndexerCluster resources, each constrained to run within a dedicated zone and
configured with a hardcoded site.

Advantages:
- some operations are performed per site which mitigates the risk of impact on the whole cluster (e.g. Splunk upgrades, scaling up resources)
- specific indexer services are created per site allowing to send events to the indexers located in the same zone, avoiding possible cost of cross-zone traffic. Indexer discovery from cluster-master can do this for forwarders, but this solution also covers http/HEC traffic

#### Deploy the cluster-master

Note: the image version is defined in these resources as this allows to control the upgrade cycle

```yaml
cat <<EOF | kubectl apply -f -
---
kind: IndexerCluster
metadata:
name: example
finalizers:
- enterprise.splunk.com/delete-pvc
spec:
replicas: 0
image: "splunk/splunk:8.0.4"
defaults: |-
splunk:
site: site1
multisite_master: localhost
all_sites: site1,site2,site3
multisite_replication_factor_origin: 1
multisite_replication_factor_total: 2
multisite_search_factor_origin: 1
multisite_search_factor_total: 2
idxc:
search_factor: 2
replication_factor: 2
# Apps defined here are deployed to the indexers of all the sites
apps_location:
- "https://example.com/splunk-apps/app3.tgz"
EOF
```

#### Deploy the indexer sites

Site / zone mapping must be adapted for each site.

```yaml
cat <<EOF | kubectl apply -f -
---
kind: IndexerCluster
metadata:
name: example-site1
finalizers:
- enterprise.splunk.com/delete-pvc
spec:
replicas: 2
image: "splunk/splunk:8.0.4"
indexerClusterRef:
name: example
defaults: |-
splunk:
multisite_master: splunk-example-cluster-master-service
site: site1
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: failure-domain.beta.kubernetes.io/zone
operator: In
values:
- zone-1a
EOF
```

## Single IndexerCluster resource and Downward API

Description: indexers automatically identify the zone in which they are deployed and map it to a Splunk site.

Advantages:
- fewer Kubernetes resources to maintain / operate
- autoscaling easier to control and balance

Drawbacks:
- heavier configuration

#### Service account configuration

The default service account of the namespace running the cluster must be granted node-reader ClusterRole and pod-reader Role (Splunk CRDs currently don't allow to specify a different service account).

**Note:** the pod-reader role is needed as the IndexerCluster CRD doesn't provide a way to define environment variables on pods, which is the [only option to expose spec.nodeName](https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#capabilities-of-the-downward-api) through the Kubernetes downward API.

```yaml
cat <<EOF | kubectl apply -f -
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["pods"]
verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: splunk-node-reader-view-pods
namespace: splunk
subjects:
- kind: ServiceAccount
name: default
namespace: splunk
roleRef:
kind: Role
name: pod-reader
---
# Requires cluster admin priviledges
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: splunk-node-reader
subjects:
- kind: ServiceAccount
name: default
namespace: splunk
roleRef:
kind: ClusterRole
name: system:node-reader
EOF
```

#### Installing the openshift python module in the docker image

This is the client library used by the [ansible k8s lookup](https://docs.ansible.com/ansible/latest/plugins/lookup/k8s.html#requirements) (lookup not specific to openshift).

```
FROM splunk/splunk:8.0.4
USER root
RUN pip2 install openshift
USER ansible
```

#### IndexerCluster definition

The namespace and name properties of the pods are exposed through volumes using the Kubernetes downward API (as spec.nodeName can only be exposed through environment variable, which is not yet supported by the IndexerCluster CRD).

Note: multipart cluster could still be used there to control the cluster-master independently from the indexers.

```yaml
cat <<EOF | kubectl apply -f -
---
apiVersion: enterprise.splunk.com/v1alpha2
kind: IndexerCluster
metadata:
name: example
finalizers:
- enterprise.splunk.com/delete-pvc
spec:
# The custom image built at previous step
image: "splunk/splunk:8.0.4_openshift_client"
volumes:
- name: podinfo
downwardAPI:
# type has to be defined because of bug in OCP 3.11: https://github.com/kubernetes/kubernetes/issues/68466
# Because of the same bug IndexerCluster deletion also ends up in error when downwardAPI is used and finalizer
# enterprise.splunk.com/delete-pvc is defined. Removing finalizer allows deleting the idxc.
type: array
# WARNING: reconciliation permanently fails if defaultMode or apiVersion in items not specified
# default value are automatically added in k8s, which causes the operator to always find diff
defaultMode: 420
items:
- path: pod
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- path: namespace
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
defaults: |-
# The mapping must be configured with the names of the zones or your Kubernetes cluster
zone_site_map:
zone-1a: site1
zone-1b: site2
zone-1c: site3
k8s_token: "{{ lookup('file', '/var/run/secrets/kubernetes.io/serviceaccount/token') }}"
k8s_namespace: "{{ lookup('file', '/mnt/podinfo/namespace') }}"
k8s_pod_name: "{{ lookup('file', '/mnt/podinfo/pod') }}"
k8s_node_name: "{{ lookup('k8s', api_key=k8s_token, kind='Pod', namespace=k8s_namespace, resource_name=k8s_pod_name).spec.nodeName }}"
# In latest version this label is being replaced with topology.kubernetes.io/zone
k8s_zone: "{{ lookup('k8s', api_key=k8s_token, kind='Node', resource_name=k8s_node_name).metadata.labels['failure-domain.beta.kubernetes.io/zone'] }}"
splunk:
# As this configuration is shared between master and indexers:
# The cluster-master pod can't reach it's own API through the cluster-master Service before the pod is flagged as ready
# so at setup time, ansible must use localhost on cluster-master
multisite_master: "{{ (lookup('env', 'SPLUNK_ROLE') == 'splunk_cluster_master') | ternary('localhost', 'splunk-example-cluster-master-service') }}"
site: "{{ zone_site_map[k8s_zone] }}"
all_sites: site1,site2,site3
multisite_replication_factor_origin: 1
multisite_replication_factor_total: 2
multisite_search_factor_origin: 1
multisite_search_factor_total: 2
idxc:
search_factor: 2
replication_factor: 2
EOF
```

## Connecting a search-head cluster to a multisite indexer-cluster

[Search head clusters do not have site awareness](
https://docs.splunk.com/Documentation/Splunk/latest/DistSearch/DeploymultisiteSHC#Search_head_clusters_do_not_have_site_awareness)
for artifact replication, so mapping Splunk sites to Kubernetes zones is not relevant in that context.

SearchHeadCluster resources can be connected to a multisite indexer cluster the same way as for single site.
The name of the IndexerCluster part containing the cluster master must be referenced in parameter `indexerClusterRef`.

Additional ansible default parameters must be set to activate multisite:
* `multisite_master`: which should reference the cluster-master service of the target indexer cluster
* `site`: which should in general be set to `site: site0` to disable search affinity ([documentation for more details]
(https://docs.splunk.com/Documentation/Splunk/latest/DistSearch/DeploymultisiteSHC#Integrate_a_search_head_cluster_with_a_multisite_indexer_cluster))

```yaml
cat <<EOF | kubectl apply -f -
---
kind: SearchHeadCluster
metadata:
name: example
finalizers:
- enterprise.splunk.com/delete-pvc
spec:
replicas: 3
image: "splunk/splunk:8.0.4"
indexerClusterRef:
name: example
defaults: |-
splunk:
multisite_master: splunk-example-cluster-master-service
site: site0
EOF
```

0 comments on commit b839001

Please sign in to comment.