Kanister is a framework that enables application-level data management on Kubernetes. It allows domain experts to capture application specific data management tasks via Blueprints, which can be easily shared and extended. The framework takes care of the tedious details surrounding execution on Kubernetes and presents a homogeneous operational experience across applications at scale.
The design of Kanister was driven by the following main goals:
- Application-Centric: Given the increasingly complex and distributed nature of cloud-native data services, there is a growing need for data management tasks to be performed at the application level. Experts who possess domain knowledge of a specific application's needs should be able to capture these needs when performing data operations on that application.
- API Driven: Data management tasks for each specific application may vary widely, and these tasks should be encapsulated by a well-defined API so as to provide a uniform data management experience. Each application expert can provide an application-specific pluggable implementation that satisfies this API, thus enabling a homogeneous data management experience of diverse and evolving data services.
- Extensible: Any data management solution capable of managing a diverse set of applications must be flexible enough to capture the needs of custom data services running in a variety of environments. Such flexibility can only be provided if the solution itself can easily be extended.
This README provides the basic set of information to get up and running with Kanister. For further information, please refer to the Kanister Documentation.
The following commands install Kanister and a Kanister-enabled MySQL application, then back up that application's data to an AWS S3 bucket.
# Add Kanister Charts
helm repo add kanister http://charts.kanister.io
# Install the Kanister Controller
helm install --name myrelease --namespace kanister kanister/kanister-operator
# Configure access to an S3 Bucket
helm install kanister/profile \
    --name profile --namespace kanister \
    --set defaultProfile=true \
    --set s3.bucket="my-kanister-bucket" \
    --set s3.region="us-west-2" \
    --set s3.accessKey="${AWS_ACCESS_KEY_ID}" \
    --set s3.secretKey="${AWS_SECRET_ACCESS_KEY}"
# Install MySQL and configure its Kanister Blueprint.
helm install kanister/kanister-mysql \
    --name mysql-release --namespace mysql-ns \
    --set kanister.controller_namespace=kanister
# Perform a backup by creating an ActionSet
cat << EOF | kubectl create -f -
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  name: mysql-backup-june-1
  namespace: kanister
spec:
  actions:
  - name: backup
    blueprint: mysql-release-kanister-mysql-blueprint
    object:
      kind: Deployment
      name: mysql-release-kanister-mysql
      namespace: mysql-ns
    profile:
      name: default-profile
      namespace: kanister
EOF
# Restore the Backup we just created
$ kanctl --namespace kanister create actionset --action restore --from mysql-backup-june-1
In order to use Kanister, you will need to have the following set up:
- Kubernetes version 1.8 or higher
- kubectl
- Helm
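If you want to sanity-check these prerequisites, you can print the client and server versions (output will vary with your setup):
$ kubectl version --short
$ helm version --short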
Kanister is based on the operator pattern. The first step to using Kanister is to deploy the Kanister controller. The Kanister controller can be configured and installed using Helm.
In order to get the latest version of the Kanister Controller, we recommend installing it from our charts repo: http://charts.kanister.io.
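For example:
# Add the Kanister repo if you haven't already
helm repo add kanister http://charts.kanister.io
# Install the controller from the Kanister repo
helm install --name myrelease --namespace kanister kanister/kanister-operator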
In addition to our own repo, we've added the controller to stable, the default Helm repo configured when Helm is installed. This version of the chart is found here and can be installed by specifying the stable chart prefix:
helm install --name myrelease --namespace kanister stable/kanister-operator --set image.tag=0.12.0
Warning: The Kanister chart found in stable may be behind the one in the Kanister repo. We therefore recommend using the Kanister repo.
If you wish to build and deploy the controller from source, instructions to do so can be found here.
We can see the installed components, as well as their status, using Helm.
$ helm status myrelease
LAST DEPLOYED: Wed Mar 21 16:40:43 2018
NAMESPACE: kanister
STATUS: DEPLOYED

RESOURCES:
==> v1/ServiceAccount
NAME                         SECRETS  AGE
myrelease-kanister-operator  1        9s

==> v1beta1/ClusterRole
NAME                                      AGE
myrelease-kanister-operator-cluster-role  9s

==> v1beta1/ClusterRoleBinding
NAME                                   AGE
myrelease-kanister-operator-edit-role  9s
myrelease-kanister-operator-cr-role    9s

==> v1beta1/Deployment
NAME                         DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
myrelease-kanister-operator  1        1        1           1          9s

==> v1/Pod(related)
NAME                                          READY  STATUS   RESTARTS  AGE
myrelease-kanister-operator-1484730505-2s279  1/1    Running  0         9s
...
To check the status of the controller's pod:
# Check the pod's status.
$ kubectl --namespace kanister get pod -l app=kanister-operator
NAME                                 READY     STATUS    RESTARTS   AGE
kanister-operator-2733194401-l79mg   1/1       Running   1          12m
The Kanister controller will create CRDs on startup if they don't already exist. We can verify that they exist:
$ kubectl get crd
NAME                        AGE
actionsets.cr.kanister.io   30m
blueprints.cr.kanister.io   30m
profiles.cr.kanister.io     30m
As shown above, three custom resources are defined: Blueprints, ActionSets, and Profiles. A Blueprint specifies a set of actions that can be executed on an application. An ActionSet provides the necessary runtime information to trigger taking an action on the application. A Profile captures the configuration needed to access external services, such as an S3 bucket, and is covered below.
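To give a sense of a Blueprint's shape before diving into the examples, here is a minimal, illustrative sketch; the action, function arguments, and template values below are placeholders rather than a working Blueprint (complete ones ship with the example charts and in ./examples):
apiVersion: cr.kanister.io/v1alpha1
kind: Blueprint
metadata:
  name: example-blueprint
  namespace: kanister
actions:
  backup:
    # The kind of Kubernetes object this action is run against
    type: Deployment
    phases:
    # Each phase invokes a Kanister function; KubeExec, for example, runs
    # a command inside a pod. All values below are placeholders.
    - func: KubeExec
      name: takeBackup
      args:
        namespace: "{{ .Deployment.Namespace }}"
        pod: "{{ index .Deployment.Pods 0 }}"
        container: app-container
        command:
        - echo
        - hello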
Since Kanister follows the operator pattern, other useful kubectl commands work with the Kanister controller as well, such as fetching the logs:
$ kubectl --namespace kanister logs -l app=kanister-operator
In addition to the Kubernetes controller, Kanister also includes a command-line tool called kanctl that helps manage Kanister CRDs. The kanctl binary can be downloaded from the release page. Alternatively, you can install kanctl from source if you have Go set up locally:
$ go install -v github.com/kanisterio/kanister/cmd/kanctl
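Once installed, you can confirm kanctl is on your PATH and list its subcommands with the built-in help:
$ kanctl --help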
Kanister's access to external services is configured through Profiles. Like the controller, Profiles are installed through Helm from the Kanister chart repo. To install a Profile to configure access to an S3 bucket called my-kanister-bucket, run the following Helm command:
helm install kanister/profile \
    --name profile --namespace kanister \
    --set defaultProfile=true \
    --set s3.bucket="my-kanister-bucket" \
    --set s3.region="us-west-2" \
    --set s3.accessKey="${AWS_ACCESS_KEY_ID}" \
    --set s3.secretKey="${AWS_SECRET_ACCESS_KEY}"
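Profiles are themselves custom resources, so you can verify that the Profile was created using kubectl:
# List the Profiles in the kanister namespace
$ kubectl --namespace kanister get profiles.cr.kanister.io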
In addition to the Kanister controller and Profile charts, the Kanister Helm repo also contains charts of example stateful applications updated to include Kanister Blueprints. The source for these charts can be found here. These applications can be easily backed up and restored.
# Install MySQL and configure its Kanister Blueprint.
helm install kanister/kanister-mysql \
    --name mysql-release --namespace mysql-ns \
    --set kanister.controller_namespace=kanister
To back up this application's data, we create a Kanister ActionSet. The command to create an ActionSet is rendered in the Helm notes, which can be displayed with helm status mysql-release.
$ cat << EOF | kubectl create -f -
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: mysql-backup-
  namespace: kanister
spec:
  actions:
  - name: backup
    blueprint: mysql-release-kanister-mysql-blueprint
    object:
      kind: Deployment
      name: mysql-release-kanister-mysql
      namespace: mysql-ns
EOF
actionset "mysql-backup-qgx06" created
We can now restore this backup by chaining a restore off the ActionSet we just created using kanctl.
$ kanctl --namespace kanister create actionset --action restore --from mysql-backup-qgx06
actionset restore-mysql-backup-qgx06-bd4mq created
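As with any ActionSet, the restore's progress can be inspected with kubectl, using the name printed by kanctl above:
# View the status of the restore ActionSet
$ kubectl --namespace kanister get actionset restore-mysql-backup-qgx06-bd4mq -oyaml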
To get a more detailed overview of Kanister's components, let's walk through a non-Helm example of using Kanister to backup and restore MongoDB. In this example, we will deploy MongoDB with a sidecar container. This sidecar container will include the necessary tools to store protected data from MongoDB into an S3 bucket in AWS. Note that a sidecar container is not required to use Kanister, but is just one of several ways to access tools needed to protect the application.
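For illustration, a sidecar is simply an additional container in the pod template. A minimal sketch of the idea, assuming a tools image such as kanisterio/kanister-tools, is shown below; the authoritative spec is in ./examples/mongo-sidecar/mongo-cluster.yaml.
# Illustrative fragment of the StatefulSet pod spec
containers:
- name: mongo
  image: mongo
# Sidecar carrying the tooling (e.g. mongodump, an S3 client) the Blueprint uses
- name: kanister-sidecar
  image: kanisterio/kanister-tools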
The following command deploys the example MongoDB application in the default namespace:
$ kubectl apply -f ./examples/mongo-sidecar/mongo-cluster.yaml
configmap "mongo-cluster" created
service "mongo-cluster" created
statefulset "mongo-cluster" created
Once MongoDB is running, you can populate it with some data. Let's add a collection called "restaurants" to a test database:
# Connect to MongoDB by running a shell inside MongoDB's pod
$ kubectl exec -i -t mongo-cluster-0 -- bash -l
# From inside the shell, use the mongo CLI to insert some data into the test database
$ mongo test --quiet --eval "db.restaurants.insert({'name' : 'Roys', 'cuisine' : 'Hawaiian', 'id' : '8675309'})"
WriteResult({ "nInserted" : 1 })
# View the restaurants data in the test database
$ mongo test --quiet --eval "db.restaurants.find()"
{ "_id" : ObjectId("5a1dd0719dcbfd513fecf87c"), "name" : "Roys", "cuisine" : "Hawaiian", "id" : "8675309" }
Next, create a Blueprint which describes how backup and restore actions can be executed on this application. The Blueprint for this application can be found at ./examples/mongo-sidecar/blueprint.yaml. Notice that the backup action of the Blueprint references the S3 location specified in the ConfigMap in ./examples/mongo-sidecar/s3-location-configmap.yaml. In order for this example to work, you should update the path field of s3-location-configmap.yaml to point to an S3 bucket to which you have access. You should also update secrets.yaml to include AWS credentials that have read/write access to the S3 bucket: provide them by setting the data values for aws_access_key_id and aws_secret_access_key in secrets.yaml, encoded using base64. The following commands will create a ConfigMap, Secrets, and a Blueprint in the controller's namespace:
# Get base64 encoded aws keys (use -n so the trailing newline is not encoded)
$ echo -n "YOUR_KEY" | base64
# Create the ConfigMap with an S3 path
$ kubectl apply -f ./examples/mongo-sidecar/s3-location-configmap.yaml
configmap "mongo-s3-location" created
# Create the secrets with the AWS credentials
$ kubectl apply -f ./examples/mongo-sidecar/secrets.yaml
secrets "aws-creds" created
# Create the Blueprint for MongoDB
$ kubectl apply -f ./examples/mongo-sidecar/blueprint.yaml
blueprint "mongo-sidecar" created
You can now take a backup of MongoDB's data using an ActionSet that defines a backup action for this application. Create the ActionSet in the same namespace as the controller:
$ kubectl --namespace kanister apply -f ./examples/mongo-sidecar/backup-actionset.yaml
actionset "mongo-backup-12046" created
$ kubectl --namespace kanister get actionsets.cr.kanister.io
NAME                 KIND
mongo-backup-12046   ActionSet.v1alpha1.cr.kanister.io
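The ActionSet in backup-actionset.yaml follows the same shape as the MySQL example above. Roughly, with values inferred from this walkthrough (see the file itself for the authoritative version):
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: mongo-backup-
  namespace: kanister
spec:
  actions:
  - name: backup
    blueprint: mongo-sidecar
    object:
      kind: StatefulSet
      name: mongo-cluster
      namespace: default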
Let's say someone with fat fingers accidentally deleted the restaurants collection using the following command:
# Drop the restaurants collection
$ mongo test --quiet --eval "db.restaurants.drop()"
true
If you try to access this data in the database, you should see that it is no longer there:
$ mongo test --quiet --eval "db.restaurants.find()"
# No entries should be found in the restaurants collection
To restore the missing data, we want to use the backup we created earlier. An easy way to do this is to leverage kanctl, a command-line tool that helps create ActionSets that depend on other ActionSets:
$ kanctl --namespace kanister create actionset --action restore --from "mongo-backup-12046"
actionset restore-mongo-backup-12046-s1wb7 created
# View the status of the ActionSet
$ kubectl --namespace kanister get actionset restore-mongo-backup-12046-s1wb7 -oyaml
You should now see that the data has been successfully restored to MongoDB!
$ mongo test --quiet --eval "db.restaurants.find()"
{ "_id" : ObjectId("5a1dd0719dcbfd513fecf87c"), "name" : "Roys", "cuisine" : "Hawaiian", "id" : "8675309" }
The artifacts created by the backup action can be cleaned up using the following command:
$ kanctl --namespace kanister create actionset --action delete --from "mongo-backup-12046"
actionset "delete-mongo-backup-12046-kf8mt" created
# View the status of the ActionSet
$ kubectl --namespace kanister get actionset delete-mongo-backup-12046-kf8mt -oyaml
The Kanister components can be cleaned up with the following commands:
$ helm delete --purge myrelease
$ kubectl delete crd {actionsets,blueprints,profiles}.cr.kanister.io
$ kubectl --namespace kanister delete actionset --all
For troubleshooting help, you can email the Kanister Google Group, reach out to us on Slack, or file an issue.
Apache License 2.0, see LICENSE.