This repo utilizes the Always Free tier of Oracle Cloud to provision a Kubernetes cluster. In its current state, I just pay a few cents for DNS management (which you might get for free on Cloudflare).
The Oracle Kubernetes Engine (OKE) control plane is free to use; you only pay for the worker nodes if you surpass the Always Free tier (which we don't).
You get 4 OCPUs and 24 GB of memory, which are split into two worker instances (VM.Standard.A1.Flex), allowing good resource utilization.
The boot partitions are 100 GB each, so Longhorn can use around 60 GB as in-cluster storage.
For the ingress class we use nginx with the Oracle flexible LB (10 Mbps), because that's free as well.
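To double-check how those Always Free resources are split across the node pool, the shape config can be read via the oci CLI. A small sketch (the compartment id is a placeholder; the cluster id is taken from the infra outputs):

# in the infra folder; compartment id is a placeholder
oci ce node-pool list --compartment-id ocid1.compartment.xxx --cluster-id $(terraform output --raw k8s_cluster_id)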
Currently it might be tricky to get a free-tier account, but there are several guides on Reddit to overcome the account restrictions.
The initial infra setup is inspired by this great tutorial: https://arnoldgalovics.com/free-kubernetes-oracle-cloud/
⚠️ This project uses ARM instances, not x86 architecture.
This repo hosts my personal stuff and is a playground for my Kubernetes tooling.
Tip
In case you want to reproduce my OKE setup, you might find this guide by my coworker more helpful.
- K8s control plane
- Worker nodes
- Ingress via the nginx-ingress controller
- cert-manager with Let's Encrypt
- ExternalDNS with sync to the OCI DNS management
- Dex as OIDC provider with GitHub as IdP
- ArgoCD with Dex login
- Storage with Longhorn (rook/ceph & piraeus didn't work out)
- Grafana with Dex login
- kube-prometheus/Alertmanager stack
- Prometheus metrics adapter
- Kyverno and image signing
Note
I've recently updated the backend.s3 config to work with Terraform 1.6.
This setup uses Terraform to manage the OCI and Kubernetes parts. You need:
- terraform
- the oci CLI binary
The Terraform state is pushed to Oracle Object Storage (free as well). For that we have to create a bucket initially:
$ oci os bucket create --name terraform-states --versioning Enabled --compartment-id xxx
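The state backend itself is Terraform's s3 backend pointed at that bucket's S3-compatible endpoint, authenticated with an OCI customer secret key. A minimal sketch of initializing it from the repo root (the key values are placeholders; the actual backend arguments live in the repo's backend.s3 config):

# hedged sketch: the s3-compatible backend reads an OCI customer secret key via the usual AWS env vars
export AWS_ACCESS_KEY_ID=<customer-secret-key-id>
export AWS_SECRET_ACCESS_KEY=<customer-secret-key>
terraform -chdir=infra init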
- The infrastructure (everything up to a usable k8s API endpoint) is managed by Terraform in infra
- The k8s modules (OCI-specific config for DNS/secrets etc.) are managed by Terraform in config
These components are independent from each other, but obviously the infra should be created first.
For the config part, we need to add a private *.tfvars file:
compartment_id = "ocid1.tenancy.zzz"
Running the config section, you need more variables, which either get output by the infra run or have to be extracted from the web UI. As I've switched to flux, you also need a personal GH access token in there.
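To see which values the infra run actually exposes, listing its outputs is the quickest way; a minimal sketch, run from the repo root:

# hedged sketch: list the infra outputs that feed the config *.tfvars
terraform -chdir=infra output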
As I'm opposed to storing any secrets in git (encrypted or not), I rely on external-secrets to propagate them to the cluster.
To generate a Secret with the auth information for the Oracle Vault, we have to run:
# inside infra
k --kubeconfig ~/.kube/oci.kubeconfig -n external-secrets create secret generic oracle-vault --from-literal=privateKey="$(terraform output --raw external_secrets_api_private_key)" --from-literal=fingerprint="$(terraform output --raw external_secrets_fingerprint)"
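Afterwards you can check that external-secrets picks the credentials up; a sketch, assuming the store towards the Oracle Vault is created as a (Cluster)SecretStore by the config run:

# hedged sketch: the stores/secrets should report Ready once the vault credentials work
k --kubeconfig ~/.kube/oci.kubeconfig get clustersecretstores.external-secrets.io
k --kubeconfig ~/.kube/oci.kubeconfig get externalsecrets.external-secrets.io -A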
Regarding the values in the config *.tfvars:
- The first & second values are outputs from the infra Terraform run
- The third & fourth values are extracted from the web UI
With the following command we get the kubeconfig for Terraform/direct access:
# in the infra folder
oci ce cluster create-kubeconfig --cluster-id $(terraform output --raw k8s_cluster_id) --file ~/.kube/oci.kubeconfig --region eu-frankfurt-1 --token-version 2.0.0 --kube-endpoint PUBLIC_ENDPOINT
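A quick sanity check that the kubeconfig works (nothing repo-specific):

# hedged sketch: both A1.Flex workers should show up as Ready
k --kubeconfig ~/.kube/oci.kubeconfig get nodes -o wide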
To create a Teleport user, tctl is run inside the auth deployment:
k --kubeconfig ~/.kube/oci.kubeconfig exec -n teleport -ti deployment/teleport-cluster-auth -- tctl users add nce --roles=access,editor,auditor
I mostly skipped 1.27.2 & 1.28.2 (on the workers) and went straight for the 1.29 release. As the UI didn't prompt for a direct upgrade path of the control plane, I upgraded the k8s version in Terraform to the prompted next release, ran the upgrade, and continued with the next version.
The worker nodes remained at 1.26.7 during the OKE upgrade, which worked because with 1.28 the new skew policy allows worker nodes to be three minor versions behind.
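In practice that stepwise bump is just repeated Terraform runs with a new version value; a minimal sketch, assuming the version is exposed as a variable (the variable name and versions below are hypothetical, the real definition lives in infra):

# hedged sketch: bump one minor release at a time, as prompted by the UI (variable name is hypothetical)
terraform -chdir=infra apply -var 'kubernetes_version=v1.28.2'
terraform -chdir=infra apply -var 'kubernetes_version=v1.29.1'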
PSPs had to be removed first.
- Upgrade the nodepool & cluster version by setting the k8s variable; run terraform (takes ~10min)
- Cordon/drain worker01 (see the commands sketched below)
- Go to the UI; delete worker01 from the nodepool
- Scale the nodepool back to 2 (takes ~10min)
- Wait for Longhorn to sync (no volume in state degraded); repeat steps 2-5 for the second node
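The drain and the Longhorn check from steps 2 and 5 roughly look like this; a sketch with a placeholder node name, assuming Longhorn runs in the longhorn-system namespace:

# hedged sketch: evict workloads from the node that gets replaced (node name is a placeholder)
k --kubeconfig ~/.kube/oci.kubeconfig cordon <worker01>
k --kubeconfig ~/.kube/oci.kubeconfig drain <worker01> --ignore-daemonsets --delete-emptydir-data
# once the replacement node is up, wait until no Longhorn volume reports degraded robustness
k --kubeconfig ~/.kube/oci.kubeconfig -n longhorn-system get volumes.longhorn.io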
The 1.23.4 -> 1.24.1 Kubernetes upgrade went pretty smoothly, but was done by hand.
I followed the official guides:
- https://docs.oracle.com/en-us/iaas/Content/ContEng/Tasks/contengupgradingk8smasternode.htm
- https://docs.oracle.com/en-us/iaas/Content/ContEng/Tasks/contengupgradingk8sworkernode.htm
Longhorn synced all volumes after the new node got ready. No downtime experienced.