delphidigital/bare-metal-cluster-manager

Hetzner Bare Metal k8s Cluster

The scripts in this repository set up and maintain one or more Kubernetes clusters consisting of dedicated Hetzner servers. Each cluster is also provisioned to operate as a node in the THORChain network.

Executing the scripts in combination with some manual procedures will get you highly available, secure clusters on bare metal.

Preparations

Servers

Acquire a couple of servers as the basis for a cluster (AX41-NVMe servers work well, for instance). Visit the admin panel and name the servers appropriately, for example:

tc-k8s-node1
tc-k8s-node2
tc-k8s-node3
...

tc-k8s-master1
tc-k8s-master2
tc-k8s-worker1
tc-k8s-worker2
tc-k8s-worker3
...

Refer to the reset procedure to properly initialize them.

vSwitch

Create a vSwitch and order an appropriate subnet (it may take a while to show up after the order). Give the vSwitch a name (e.g. tc-k8s-net) and assign it to the servers.

Check out the docs for help.

Usage

Clone this repository, cd into it and download kubespray.

git submodule init && git submodule update
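A quick check after the submodule step can catch an empty checkout before Ansible does. This is a minimal sketch, assuming kubespray is checked out at `kubespray/` (the path used by the playbook commands in this README); `kubespray_ready` is a hypothetical helper name, not part of this repository.

```shell
# Hypothetical helper: succeed only if the kubespray submodule looks populated.
# Optional argument: repository root (defaults to the current directory).
kubespray_ready() {
  local root="${1:-.}"
  # kubespray/cluster.yml is the playbook invoked later in this README.
  [ -f "$root/kubespray/cluster.yml" ]
}
```

It could be used as `kubespray_ready || git submodule update --init`, which initializes and updates the submodule in one step.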

Create a Python virtual environment or similar.

# Optional
virtualenv -p python3 venv
# Activate it so the pip install below goes into the environment
source venv/bin/activate

Install the dependencies required by Python and Ansible Galaxy.

pip install -r requirements.python.txt
ansible-galaxy install -r requirements.ansible.yml

Note: Mitogen does not work with Ansible collections, so the strategy must be changed (i.e. strategy: linear).

Provisioning

Create a deployment environment inventory file for each cluster you want to manage.

cp hosts.example inventory/production.yml
cp hosts.example inventory/test.yml
cp hosts.example inventory/environment.yml
...

cp hosts.example inventory/production-01.yml
cp hosts.example inventory/production-02.yml
...

cp hosts.example inventory/production-helsinki.yml
cp hosts.example inventory/whatever.yml
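The copy step above can be scripted when several environments are created at once. The following is a sketch only: `create_inventories` is a hypothetical helper, and it assumes it is run from the repository root where `hosts.example` lives. It never overwrites an inventory file that already exists.

```shell
# Hypothetical helper: create inventory/<name>.yml from hosts.example
# for every environment name given, skipping files that already exist.
create_inventories() {
  local template="hosts.example" env target
  mkdir -p inventory
  for env in "$@"; do
    target="inventory/${env}.yml"
    if [ -e "$target" ]; then
      echo "skip ${target} (already exists)"
    else
      cp "$template" "$target"
      echo "created ${target}"
    fi
  done
}
```

For example, `create_inventories production test` would create `inventory/production.yml` and `inventory/test.yml` if they are missing.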

Edit the inventory file with your server IPs and network information, and customize everything to your needs.

# Manage a cluster
ansible-playbook cluster.init.yml -i inventory/environment.yml
ansible-playbook --become --become-user=root kubespray/cluster.yml -i inventory/environment.yml
ansible-playbook cluster.finish.yml -i inventory/environment.yml

# Run custom playbooks
ansible-playbook private-cluster.yml -i inventory/environment.yml
ansible-playbook private-test-cluster.yml -i inventory/environment.yml
ansible-playbook private-whatever-cluster.yml -i inventory/environment.yml
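The three "Manage a cluster" commands are meant to run in order, so a small wrapper that stops at the first failure may be convenient. This is a sketch under assumptions: `provision_cluster` and the `DRY_RUN` variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical wrapper: run the three cluster playbooks in sequence,
# stopping at the first one that fails.
# With DRY_RUN=1 the commands are printed instead of executed.
provision_cluster() {
  local inventory="${1:-}" runner=""
  if [ -z "$inventory" ]; then
    echo "usage: provision_cluster inventory/<environment>.yml" >&2
    return 2
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    runner="echo"
  fi
  # $runner is left unquoted on purpose: when empty it expands to nothing.
  $runner ansible-playbook cluster.init.yml -i "$inventory" &&
    $runner ansible-playbook --become --become-user=root kubespray/cluster.yml -i "$inventory" &&
    $runner ansible-playbook cluster.finish.yml -i "$inventory"
}
```

Running `DRY_RUN=1 provision_cluster inventory/test.yml` first shows the exact commands before anything touches the servers.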

Check this out for more playbooks on cluster management.

THORChain

In order for the cluster to operate as a node in the THORChain network, deploy as instructed here. You can also refer to the node-launcher repository, if necessary, or the THORChain documentation as a whole.

Resetting the bare metal servers

This will install Ubuntu 20.04 on only one of the two internal NVMe drives. The unused drive will be used for persistent storage with Ceph/Rook. You can check the internal drive setup with lsblk and, when necessary, change the target drive (-d nvme0n1) in the installimage command shown below.

Manually

Visit the console and put each server of the cluster into rescue mode. Then execute the following command, replacing hostname with the server's name.

installimage -a -r no -i images/Ubuntu-2004-focal-64-minimal.tar.gz -p /:ext4:all -d nvme0n1 -f yes -t yes -n hostname
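Since the hostname and the target drive differ per server, the command above can be generated rather than edited by hand. A minimal sketch follows; `installimage_cmd` and `nvme_disks` are hypothetical helper names, and the second one only filters text in the two-column form produced by `lsblk -dn -o NAME,TYPE`.

```shell
# Hypothetical helper: print the installimage command for one server.
# Arguments: hostname, and optionally the target drive (default nvme0n1).
installimage_cmd() {
  local hostname="${1:-}" drive="${2:-nvme0n1}"
  if [ -z "$hostname" ]; then
    echo "usage: installimage_cmd <hostname> [drive]" >&2
    return 2
  fi
  echo "installimage -a -r no -i images/Ubuntu-2004-focal-64-minimal.tar.gz -p /:ext4:all -d ${drive} -f yes -t yes -n ${hostname}"
}

# Hypothetical helper: read "NAME TYPE" lines on stdin and print the
# names of whole disks whose name starts with "nvme".
nvme_disks() {
  awk '$2 == "disk" && $1 ~ /^nvme/ { print $1 }'
}
```

In rescue mode, `lsblk -dn -o NAME,TYPE | nvme_disks` would list the candidate drives, and `installimage_cmd tc-k8s-node1` would print the command for that server.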

Automatically

Create a pristine state by running the playbooks in sequence.

ansible-playbook server.rescue.yml -i inventory/environment.yml
ansible-playbook server.bootstrap.yml -i inventory/environment.yml
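Because these two playbooks return the servers to a pristine state, a confirmation step in front of them may prevent an accidental reset of the wrong environment. The sketch below is an assumption about how one might guard the sequence; `reset_servers` is a hypothetical name and the prompt wording is illustrative.

```shell
# Hypothetical guard: ask for the inventory path to be typed again
# before running the rescue and bootstrap playbooks in order.
reset_servers() {
  local inventory="${1:-}" answer=""
  if [ -z "$inventory" ]; then
    echo "usage: reset_servers inventory/<environment>.yml" >&2
    return 2
  fi
  printf 'Type "%s" to confirm the reset: ' "$inventory" >&2
  read -r answer || true
  if [ "$answer" != "$inventory" ]; then
    echo "aborted" >&2
    return 1
  fi
  ansible-playbook server.rescue.yml -i "$inventory" &&
    ansible-playbook server.bootstrap.yml -i "$inventory"
}
```

Anything other than an exact match of the inventory path aborts without running a playbook.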

Instantiation

Instantiate the servers.

ansible-playbook server.instantiate.yml -i inventory/environment.yml

Administration

Adding worker nodes to the cluster

Register the new node(s) in your existing inventory file and run the scale command.

Keep your network limitations in mind. You can use a maximum of 4 nodes for a /29 subnet. Prepare accordingly if necessary.

node4:
  ansible_host: 44.55.66.77
  etcd_member_name: node4
  ip: 10.10.10.14
node5:
  ansible_host: 55.66.77.88
  etcd_member_name: node5
  ip: 10.10.10.15

kube-node:
  hosts:
    node1: {}
    node2: {}
    node3: {}
    node4: {}
    node5: {}

ansible-playbook --become --become-user=root kubespray/scale.yml -i inventory/environment.yml
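The subnet limit mentioned above can be checked with a little arithmetic before ordering more servers. A /29 contains 2^(32-29) = 8 addresses, so the stated maximum of 4 nodes implies that 4 addresses are unavailable to nodes. That reserved count is an assumption inferred from the stated limit, not a documented figure; `max_nodes` is a hypothetical helper.

```shell
# Hypothetical helper: addresses in an IPv4 subnet of the given prefix
# length, minus a number of reserved addresses.
# The default of 4 reserved addresses is an ASSUMPTION chosen to match the
# "4 nodes for a /29" limit stated in this README; adjust it as needed.
max_nodes() {
  local prefix="$1" reserved="${2:-4}"
  local total=$(( 1 << (32 - prefix) ))
  echo $(( total - reserved ))
}
```

Under that assumption, `max_nodes 29` prints 4 and `max_nodes 28` prints 12, so a five-node cluster would need at least a /28.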

Removing worker nodes from the cluster

ansible-playbook --become --become-user=root -e node=node4 kubespray/remove-node.yml -i inventory/environment.yml
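When more than one worker is removed, the kubespray documentation describes passing a comma-separated list in the `node` variable. Whether that applies depends on the kubespray version pinned by the submodule, so treat the following as a sketch; `remove_nodes_cmd` is a hypothetical helper that only prints the command for review.

```shell
# Hypothetical helper: print the remove-node command for one or more nodes.
# Arguments: inventory file, then one or more node names.
remove_nodes_cmd() {
  if [ "$#" -lt 2 ]; then
    echo "usage: remove_nodes_cmd <inventory> <node> [node...]" >&2
    return 2
  fi
  local inventory="$1" nodes
  shift
  # Join the remaining arguments with commas (first character of IFS).
  nodes=$(IFS=,; echo "$*")
  echo "ansible-playbook --become --become-user=root -e node=${nodes} kubespray/remove-node.yml -i ${inventory}"
}
```

Remember to delete the removed nodes from the inventory file afterwards so that later runs do not try to reach them.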

File system

Deploy, use and remove the Rook Toolbox.

ansible-playbook cluster.toolbox.yml -i inventory/environment.yml

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash

ansible-playbook cluster.toolbox.yml -e state=absent -i inventory/environment.yml
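For one-off checks it can be easier to run a single command in the toolbox than to open an interactive shell. The sketch below assumes the toolbox is deployed in the `rook-ceph` namespace with the `app=rook-ceph-tools` label, as in the command above; `toolbox_exec` is a hypothetical helper.

```shell
# Hypothetical helper: run a command inside the Rook toolbox pod.
# All arguments are passed through as the command to execute.
toolbox_exec() {
  local pod
  pod=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" \
    -o jsonpath='{.items[0].metadata.name}') || return 1
  if [ -z "$pod" ]; then
    echo "no rook-ceph-tools pod found; deploy the toolbox first" >&2
    return 1
  fi
  kubectl -n rook-ceph exec "$pod" -- "$@"
}
```

For example, `toolbox_exec ceph status` would print the Ceph cluster status while the toolbox is deployed.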

About

Cluster manager for Thornode
