- Overview
- Supported Cloud Platforms
- Sample Cluster Pool
- Sample Cluster Claim
- Managing admins for Cluster Pools
- Install Config Template
- Time-based scaling of Cluster Pool
- ClusterPool Deletion
- Troubleshooting
Hive exposes a ClusterPool API which allows users to maintain a pool of "hot"
precreated ClusterDeployments, ready to be claimed when needed. The pool size
can be configured and Hive will attempt to maintain that set number of
clusters.
When a user needs a cluster, they create a ClusterClaim resource, which will be
filled with details on where to find their cluster as soon as one is available
from the pool and running. Once claimed, a cluster is removed from the pool and
a new one is created to replace it. Claimed clusters never return to the pool;
they are intended to be destroyed when no longer needed.
ClusterClaim.Spec.Namespace will be populated once the claim has been filled,
and the ClusterDeployment will be present in that namespace with an identical
name.
ClusterPools are a namespaced resource and can be used to centralize billing
by using a namespace limited to a team via Kubernetes RBAC. All clusters in the
pool will use the same set of cloud credentials specified in the platform for
the pool. ClusterClaims must be created in the same namespace as their
ClusterPool, but each actual ClusterDeployment is given its own namespace.
The user who claims a cluster can be given RBAC to their cluster's namespace to
prevent anyone else from being able to access it.
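As a rough sketch of that last point, a standard Kubernetes RoleBinding in the cluster's namespace could give the claiming user access. The namespace reuses the example value from the sample ClusterClaim later in this document. The binding name, the user jdoe, and the choice of the built-in admin ClusterRole are assumptions made for illustration, not something Hive prescribes.

```yaml
# Sketch only: grants one hypothetical user access to a claimed cluster's namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: claim-owner-access                      # hypothetical name
  namespace: openshift-46-aws-us-east-1-j495p   # value of ClusterClaim.Spec.Namespace
subjects:
- kind: User
  apiGroup: rbac.authorization.k8s.io
  name: jdoe                                    # hypothetical user
roleRef:
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
  name: admin                                   # built-in role, chosen for illustration
```

Because the binding lives in the per-cluster namespace, it does not grant access to other clusters claimed from the same pool.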
By default, once a ClusterDeployment is ready, it will be hibernated
automatically. Once claimed, it will be automatically resumed, meaning that the
typical time to claim a cluster and be ready to go is in the 2-5 minute range
while the cluster starts up. You can keep a subset of clusters active by
setting ClusterPool.Spec.RunningCount; such clusters will be ready immediately
when claimed.
When done with a cluster, users can just delete their ClusterClaim and the
ClusterDeployment will be automatically deprovisioned. An optional
ClusterClaim.Spec.Lifetime can be specified, after which the cluster claim will
automatically be deleted. The namespace created for each cluster will
eventually be cleaned up once deprovision has finished.
Note that at present, the shared credentials used for a pool will be visible in-cluster. This may improve in the future for some clouds.
ClusterPool
currently supports the following cloud platforms:
- AWS
- Azure
- GCP
- OpenStack (with inventory)
apiVersion: hive.openshift.io/v1
kind: ClusterPool
metadata:
  name: openshift-46-aws-us-east-1
  namespace: my-project
spec:
  baseDomain: new-installer.openshift.com
  imageSetRef:
    name: openshift-4.6
  platform:
    aws:
      credentialsSecretRef:
        name: hive-team-aws-creds
      region: us-east-1
  pullSecretRef:
    name: hive-team-pull-secret
  runningCount: 1
  size: 3
apiVersion: hive.openshift.io/v1
kind: ClusterClaim
metadata:
  name: dgood46
  namespace: my-project
spec:
  clusterPoolName: openshift-46-aws-us-east-1
  lifetime: 8h
  namespace: openshift-46-aws-us-east-1-j495p # populated by Hive once claim is filled and should not be set by the user on creation
status:
  conditions:
  - lastProbeTime: "2020-11-05T14:49:26Z"
    lastTransitionTime: "2020-11-05T14:49:26Z"
    message: Cluster claimed
    reason: ClusterClaimed
    status: "False"
    type: Pending
Role bindings in the namespace of a ClusterPool that bind to the ClusterRole hive-cluster-pool-admin
are used to give their subjects the same permissions in the namespaces created for the various ClusterProvisions of the cluster pool.
This allows operators to define administrators for a ClusterPool, giving them visibility into all the resources created for it. This is
most useful for debugging ClusterProvisions associated with the pool that have failed and therefore cannot be claimed.
NOTE: You can only define such administrators for the entire namespace and not for a specific ClusterPool.
To make a User or Group a hive-cluster-pool-admin for a namespace, you can run
oc -n <namespace> adm policy add-role-to-user hive-cluster-pool-admin <user>
or
oc -n <namespace> adm policy add-role-to-group hive-cluster-pool-admin <group>
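For setups that manage RBAC declaratively, the same grant can be written as a RoleBinding manifest. The following is a sketch of what the group variant would look like in the my-project namespace used elsewhere in this document; the binding name pool-admins and the group name hive-team are hypothetical.

```yaml
# Sketch only: binds a hypothetical group to hive-cluster-pool-admin
# in the namespace that holds the ClusterPool.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pool-admins                  # hypothetical name
  namespace: my-project              # namespace of the ClusterPool
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: hive-team                    # hypothetical group
roleRef:
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
  name: hive-cluster-pool-admin
```

To bind a single user instead, change the subject kind to User and supply the user name.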
To control parts of the cluster deployments that are not directly supported by Hive, such as controlPlane nodes and their types, you can load a valid install-config.yaml,
which will be passed directly to the openshift-installer; Hive only updates metadata.name
and baseDomain.
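To give a sense of what such a template might contain, here is an abbreviated sketch that customizes the control plane on AWS. All values are illustrative assumptions, including the instance types and replica counts. The fragment is deliberately incomplete: fields such as pullSecret and sshKey are omitted, so check what your installer version requires before using it.

```yaml
# install-config.yaml (abbreviated sketch; values are assumptions)
apiVersion: v1
baseDomain: example.com        # overwritten with the pool's baseDomain
metadata:
  name: placeholder            # overwritten per cluster
controlPlane:
  name: master
  replicas: 3
  platform:
    aws:
      type: m5.2xlarge         # hypothetical control plane instance type
compute:
- name: worker
  replicas: 3
  platform:
    aws:
      type: m5.xlarge          # hypothetical worker instance type
platform:
  aws:
    region: eu-west-1
```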
Load the install-config.yaml template as a secret (assuming that the install-config.yaml
you want to use as a template is in the current directory):
kubectl -n my-project create secret generic my-install-config-template --from-file=install-config.yaml=./install-config.yaml
With this secret created, you can create a pool that references the install config secret template. The pool and secret must be in the same namespace.
apiVersion: hive.openshift.io/v1
kind: ClusterPool
metadata:
  name: openshift-46-aws-us-east-1
  namespace: my-project
spec:
  baseDomain: hive.mytests.io
  imageSetRef:
    name: openshift-v4.5.13
  installConfigSecretTemplateRef:
    name: my-install-config-template
  skipMachinePools: true
  platform:
    aws:
      credentialsSecretRef:
        name: global-aws-creds
      region: eu-west-1
  size: 1
Note: When using ClusterPools, Hive by default creates a MachinePool for the worker nodes of any ClusterDeployment that is a child of a ClusterPool. When you use an installConfigSecretTemplateRef whose install config deviates from the MachinePool defaults, you will most likely want to disable MachinePools by setting spec.skipMachinePools on the ClusterPool, so that Hive does not reconcile away from the machine config specified in install-config.yaml.
You can use Kubernetes CronJobs to scale ClusterPools on a defined schedule.
The following YAML sets up the required permissions: a Role, a RoleBinding, and a ServiceAccount. The Role grants permission to get a clusterpool and to patch the clusterpool's scale subresource.
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: my-project
  name: scale-clusterpool
rules:
- apiGroups:
  - hive.openshift.io
  resources:
  - clusterpools
  verbs:
  - get
- apiGroups:
  - hive.openshift.io
  resources:
  - clusterpools/scale
  verbs:
  - patch
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: scale-clusterpool
  namespace: my-project
subjects:
- kind: ServiceAccount
  name: sa-scale-clusterpool
  namespace: my-project
roleRef:
  kind: Role
  name: scale-clusterpool
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-scale-clusterpool
  namespace: my-project
Below is a sample CronJob that scales a clusterpool up to size 10 at 6:00 AM every day. It uses the ServiceAccount sa-scale-clusterpool
created above.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-clusterpool
  namespace: my-project
spec:
  schedule: "0 6 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-scale-clusterpool
          containers:
          - name: scale-clusterpool-size-10
            image: quay.io/openshift/origin-cli:latest
            command:
            - /bin/sh
            - -c
            - oc scale clusterpool openshift-46-aws-us-east-1 -n my-project --replicas=10
          restartPolicy: OnFailure
Below is a sample CronJob that scales a clusterpool down to size 0 at 20:00 (8:00 PM) every day. It uses the ServiceAccount sa-scale-clusterpool
created above.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-clusterpool
  namespace: my-project
spec:
  schedule: "0 20 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-scale-clusterpool
          containers:
          - name: scale-clusterpool-size-0
            image: quay.io/openshift/origin-cli:latest
            command:
            - /bin/sh
            - -c
            - oc scale clusterpool openshift-46-aws-us-east-1 -n my-project --replicas=0
          restartPolicy: OnFailure
The CronJob's spec.schedule field sets the exact time at which the clusterpool is scaled. The schedule is a cron expression made of five fields, in this order: minute (0 - 59), hour (0 - 23), day of the month (1 - 31), month (1 - 12), and day of the week (0 - 6). In the example CronJob that scales up a clusterpool, the schedule is set to 0 6 * * *,
which is 6:00 AM every day. The CronJob controller interprets the schedule in the time zone of the kube-controller-manager container.
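To illustrate how the five fields combine, the fragment below restricts the scale-up to weekdays. It is only a sketch of the schedule line; the rest of the CronJob would stay as in the samples above, and the weekday-only policy is an assumption chosen for the example.

```yaml
# Fragment of a CronJob spec: only the schedule line changes.
spec:
  # fields: minute hour day-of-month month day-of-week
  schedule: "0 6 * * 1-5"      # 06:00, Monday through Friday
  # other possibilities:
  #   "30 20 * * *"            # 20:30 every day
  #   "0 6 1 * *"              # 06:00 on the first day of each month
```

Pairing a weekday-only scale-up with a daily scale-down keeps the pool empty over the weekend.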
The CronJob's container image (spec.jobTemplate.spec.template.spec.containers[].image) must provide the oc
binary. We have tested with the quay.io/openshift/origin-cli image. You can also create your own image.
A ClusterPool can be deleted in the usual way (oc delete or the API equivalent).
When a ClusterPool is deleted, Hive will automatically initiate deletion of all unclaimed clusters in the pool.
No new clusters will be created, and any new ClusterClaims will not be fulfilled.
However, existing claimed clusters will not be affected, and the ClusterPool
itself will be held extant until those clusters have been deprovisioned through the normal means, i.e. by deleting their ClusterClaims.
See this doc.