Prerequisites (Okteto Cloud demo): kubectl installed
- Register an account at www.okteto.com
- Install `okteto-cli`: https://www.okteto.com/docs/getting-started/#installing-okteto-cli
- Configure `okteto-cli`: https://www.okteto.com/docs/getting-started/#configuring-okteto-cli-with-okteto-cloud
- Get the Okteto kubeconfig: https://www.okteto.com/docs/getting-started/#configuring-okteto-cli-with-okteto-cloud
- Set the namespace value to your Okteto namespace name in `overlays/demo-okteto/kustomization.yaml`
- Run `kubectl apply -k overlays/demo-okteto`
Once the deployment is done, the public components will be available at the following URLs:
https://alertmanager-ui-{{ your namespace name }}.cloud.okteto.net/
https://grafana-{{ your namespace name }}.cloud.okteto.net/
Prerequisites (any other cluster / on-premises): kubectl installed
- Add a namespace manifest to the resources
- Make sure the namespace is specified in your `kustomization.yaml`
- Get your cluster kubeconfig
- Override the grafana and alertmanager-ui Service manifests to expose them to consumers in whatever way suits your environment (e.g. NodePort); a minimal overlay sketch follows after this list
- Run `kubectl apply -k overlays/your-overlay-folder`
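For illustration, such an overlay could look roughly like the sketch below (three files shown in one block; the folder name, the namespace name, and the assumption that the base Service is called `grafana` are placeholders, not guaranteed to match the repo):

```yaml
# overlays/your-overlay-folder/kustomization.yaml (hypothetical layout)
namespace: observability-demo        # placeholder: your namespace name
resources:
  - ../../base/
  - namespace.yaml                   # the namespace manifest mentioned above
patchesStrategicMerge:
  - grafana-service-nodeport.yaml    # expose grafana via NodePort
---
# overlays/your-overlay-folder/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: observability-demo
---
# overlays/your-overlay-folder/grafana-service-nodeport.yaml
# Strategic-merge patch; assumes the base Service is named "grafana".
apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  type: NodePort    # a nodePort is auto-assigned unless pinned explicitly
```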
Components:
- grafana: UI, alerting
- mysql8: Grafana database
- prometheus: metric collection, alerting
- prometheus-alertmanager: alerting
- karma-ui: alertmanager ui service
- loki: log collection
- grafana-agent: metrics collection, log forwarding, log collection
- vector: metrics collection, log forwarding, log collection
Known issues: due to Okteto restrictions, ClusterRoles can't be created, so log scraping from k8s pod stdout won't work; this is something to customize for an on-premises setup. The general guideline would be to replace the collector Role defined in `base/roles/prometheus-k8s-sd.yaml` with a ClusterRole and create a ClusterRoleBinding for the grafana-agent or vector ServiceAccounts; the Kubernetes logs config can then be enabled in those services.
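A rough sketch of that replacement is shown below; the ClusterRole name, the namespace, and the exact rule list are assumptions and should be aligned with the original Role in `base/roles/prometheus-k8s-sd.yaml`:

```yaml
# ClusterRole replacing the namespaced collector Role, so pods across all
# namespaces can be discovered and their stdout scraped.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: log-collector
rules:
  - apiGroups: [""]
    resources: ["nodes", "nodes/proxy", "pods", "services", "endpoints"]
    verbs: ["get", "list", "watch"]
---
# Bind it to the log collector's ServiceAccount (grafana-agent shown here,
# the same applies to vector); the namespace is a placeholder.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: log-collector
subjects:
  - kind: ServiceAccount
    name: grafana-agent
    namespace: observability-demo
roleRef:
  kind: ClusterRole
  name: log-collector
  apiGroup: rbac.authorization.k8s.io
```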
All components are deployed as single instances, because there's a limit on the number of pods allowed to run on the demo plan.
Since the goal was a bit abstract, I struggled to find a balance between execution time/effort and demonstration smoothness: many quick solutions would render themselves barely portable and could run into numerous environment issues when reproduced by the other party. So my initial goal was to keep it simple and portable. One of the biggest roadblocks I expected to run into was platform integration: how do we authenticate and execute the deployment pipeline against the platform of choice? I was looking for something with direct access to the k8s control plane, and, as I expected, there were not too many free or even trial options (which I still didn't want to subscribe to). If there had been none, I might have had to learn some platform-specific APIs to fulfill the task. Luckily, there was an option, so I could cut the prerequisites down to pretty much having kubectl installed and a kubeconfig present on the machine to execute the deployment against a k8s cluster.
Once I got that out of the way, I could focus on configuring and wiring the platform components. The main idea was to bring up a standard LGTM stack (minus the tracing part, since it was not mentioned in the task description) and demonstrate the main components provisioned and operating. So there was a list of things to do:
- Configure Grafana with provisioned datasources, demo dashboard and custom administrator password;
- Configure prometheus with some predefined stack alerts;
- Configure prometheus scrapes for pods based on annotations;
- Configure alertmanager and alertmanager-ui as a slight metamonitoring touch: set up an always-firing alert rule; if it's not there, prometheus is down (see the rule sketch after this list);
- Configure grafana to send logs to loki via the vector syslog forwarder, because we're not permitted to read node files and we still want to see logs in our feed;
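The always-firing rule is usually just `vector(1)`; a minimal sketch of such a rule file (rule and label names are illustrative, not the exact ones shipped in `base/`):

```yaml
# Prometheus rule file: this alert fires permanently by design. If it ever
# disappears from Alertmanager, the Prometheus -> Alertmanager pipeline
# (or Prometheus itself) is broken.
groups:
  - name: metamonitoring
    rules:
      - alert: AlwaysFiring
        expr: vector(1)
        labels:
          severity: none
        annotations:
          summary: "Dead man's switch: this alert should always be firing"
```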
This stack brings a good half of "automatic" observability to every pod that exposes metrics and carries the required annotations. If we had access to the Node API scope, we would also get automatic log collection for all pods.
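For illustration, assuming the conventional `prometheus.io/*` annotation keys (the actual keys depend on the scrape config in `base/`), a workload only needs something like this in its pod template to be scraped automatically:

```yaml
# Hypothetical workload: the annotations below are what an annotation-driven
# scrape config typically keys on; port and path are app-specific placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels: { app: my-app }
  template:
    metadata:
      labels: { app: my-app }
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: my-app
          image: example/my-app:latest
          ports:
            - containerPort: 8080
```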
The project is implemented with Kustomize. It covers the main customization points in the context of the demo:
- configure namespace to deploy to;
- configure grafana admin password;
Additionally, it allows overriding arbitrary configuration parts, for example:
- grafana db credentials;
- prometheus alert rule files;
```yaml
namespace: equinsuocha # set your namespace name here before deploying the project
secretGenerator:
  - name: grafana-db-grafana-credentials
    literals:
      - username=grafana
      - password=grafana-secret
  - name: grafana-admin-credentials
    literals:
      - username=admin # set grafana administrator username here
      - password=what$0ever # set grafana administrator password here
resources:
  - ../../base/
configMapGenerator:
  - name: prometheus-alert-rules
    behavior: replace
    files: # list your alert rule files here
      - ./app-config/alertrules.yaml
```
Kustomize works for most scenarios; for example, for an on-prem deployment the Roles and RoleBindings could be replaced with ClusterRoles and ClusterRoleBindings relatively easily to enable automatic log collection.
However, the tradeoff behind this solution is that it does not embed any config templating, which results in some wonky things, like specifying plain-text db credentials in the mysql-init file. This could have been solved by invoking a templating engine before Kustomize, but I found it unnecessary for the sake of the demo, as it would require me to either move that wonkiness to another level (passing credentials in from elsewhere) or provide a centralized secret store for this project, which is completely decoupled from the target environment.