Improve CockroachDB example
* Use an init container to eliminate a potential edge case where losing
  the first pet's data could cause it to start a second logical cluster
* Exec the cockroach binary so that it runs as PID 1 in the container
* Make some small improvements to the README
a-robinson committed Oct 31, 2016
1 parent e6b2517 commit 6b98de3
Showing 3 changed files with 108 additions and 39 deletions.
54 changes: 41 additions & 13 deletions examples/cockroachdb/README.md
@@ -12,10 +12,11 @@ a PetSet. CockroachDB is a distributed, scalable NewSQL database. Please see
Standard PetSet limitations apply: There is currently no possibility to use
node-local storage (outside of single-node tests), and so there is likely
a performance hit associated with running CockroachDB on some external storage.
Note that CockroachDB already does replication and thus should not be deployed on
a persistent volume which already replicates internally.
High-performance use cases on a private Kubernetes cluster should consider
a DaemonSet deployment.
Note that CockroachDB already does replication and thus it is unnecessary to
deploy it onto persistent volumes which already replicate internally.
For this reason, high-performance use cases on a private Kubernetes cluster
may want to consider a DaemonSet deployment until PetSets support node-local
storage (see #7562).

### Recovery after persistent storage failure

@@ -27,17 +28,25 @@ first node is special in that the administrator must manually prepopulate the
parameter. If this is not done, the first node will bootstrap a new cluster,
which will lead to a lot of trouble.

### Dynamic provisioning
### Dynamic volume provisioning

The deployment is written for a use case in which dynamic provisioning is
The deployment is written for a use case in which dynamic volume provisioning is
available. When that is not the case, the persistent volume claims need
to be created manually. See [minikube.sh](minikube.sh) for the necessary
steps.
steps. If you're on GCE or AWS, where dynamic provisioning is supported, no
manual work is needed to create the persistent volumes.
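Without dynamic provisioning, each pet needs a pre-created volume along the lines of what minikube.sh generates. The following is a hypothetical sketch of one such volume; the name, hostPath location, and capacity are illustrative assumptions, not values taken from the script:

```yaml
# Hypothetical manually provisioned volume for one pet. The name,
# hostPath, and storage size are illustrative; see minikube.sh for
# the values the example actually uses.
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv0
  labels:
    type: local
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/pv0
```

One such volume is needed per replica, so the claims created by the PetSet's volume claim template have something to bind to.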

## Testing locally on minikube

Follow the steps in [minikube.sh](minikube.sh) (or simply run that file).

## Testing in the cloud on GCE or AWS

Once you have a Kubernetes cluster running, just run
`kubectl create -f cockroachdb-petset.yaml` to create your cockroachdb cluster.
This works because GCE and AWS support dynamic volume provisioning by default,
so persistent volumes will be created for the CockroachDB pods as needed.

## Accessing the database

Along with our PetSet configuration, we expose a standard Kubernetes service
@@ -48,15 +57,27 @@ Start up a client pod and open up an interactive, (mostly) Postgres-flavor
SQL shell using:

```console
$ kubectl run -it cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- bash
root@cockroach-client # ./cockroach sql --host cockroachdb-public
$ kubectl run -it --rm cockroach-client --image=cockroachdb/cockroach --restart=Never --command -- ./cockroach sql --host cockroachdb-public
```

You can see example SQL statements for inserting and querying data in the
included [demo script](demo.sh), but you can use almost any Postgres-style SQL
commands. Some more basic examples can be found within
[CockroachDB's documentation](https://www.cockroachlabs.com/docs/learn-cockroachdb-sql.html).
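For instance, a session in that shell might look like the following. These statements are illustrative and not taken verbatim from demo.sh; the database and table names are made up:

```sql
-- Illustrative Postgres-style statements; names are hypothetical.
CREATE DATABASE IF NOT EXISTS bank;
CREATE TABLE bank.accounts (id INT PRIMARY KEY, balance DECIMAL);
INSERT INTO bank.accounts VALUES (1, 1000.50);
SELECT * FROM bank.accounts;
```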

## Accessing the admin UI

If you want to see information about how the cluster is doing, you can try
pulling up the CockroachDB admin UI by port-forwarding from your local machine
to one of the pods:

```shell
kubectl port-forward cockroachdb-0 8080
```

Once you’ve done that, you should be able to access the admin UI by visiting
http://localhost:8080/ in your web browser.

## Simulating failures

When all (or enough) nodes are up, simulate a failure like this:
@@ -77,10 +98,17 @@ database and ensuring the other replicas have all data that was written.

## Scaling up or down

Simply edit the PetSet (but note that you may need to create a new persistent
volume claim first). If you ran `minikube.sh`, there's a spare volume so you
can immediately scale up by one. Convince yourself that the new node
immediately serves reads and writes.
Simply patch the PetSet by running

```shell
kubectl patch petset cockroachdb -p '{"spec":{"replicas":4}}'
```

Note that you may need to create a new persistent volume claim first. If you
ran `minikube.sh`, there's a spare volume so you can immediately scale up by
one. If you're running on GCE or AWS, you can scale up by as many as you want
because new volumes will automatically be created for you. Convince yourself
that the new node immediately serves reads and writes.
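On a cluster without dynamic provisioning, the extra claim for the new pet can be created by hand before patching. A hypothetical sketch, assuming the PetSet's volume claim template is named `datadir` (matching the volume mount in cockroachdb-petset.yaml), which gives the fourth pet (index 3) a claim named `datadir-cockroachdb-3`:

```yaml
# Hypothetical claim for the fourth pet. The naming scheme
# <claim template name>-<petset name>-<index> and the storage size
# are assumptions for illustration.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datadir-cockroachdb-3
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```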

## Cleaning up when you're done

91 changes: 66 additions & 25 deletions examples/cockroachdb/cockroachdb-petset.yaml
@@ -23,17 +23,25 @@ spec:
apiVersion: v1
kind: Service
metadata:
# This service only exists to create DNS entries for each pet in the petset
# such that they can resolve each other's IP addresses. It does not create a
# load-balanced ClusterIP and should not be used directly by clients in most
# circumstances.
name: cockroachdb
labels:
app: cockroachdb
annotations:
# This is needed to make the peer-finder work properly and to help avoid
# edge cases where instance 0 comes up after losing its data and needs to
# decide whether it should create a new cluster or try to join an existing
# one. If it creates a new cluster when it should have joined an existing
# one, we'd end up with two separate clusters listening at the same service
# endpoint, which would be very bad.
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
# Enable automatic monitoring of all instances when Prometheus is running in the cluster.
prometheus.io/scrape: "true"
prometheus.io/path: "_status/vars"
prometheus.io/port: "8080"
# This service only exists to create DNS entries for each pet in the petset such that they can resolve
# each other's IP addresses. It does not create a load-balanced ClusterIP and should not be used
# directly by clients in most circumstances.
name: cockroachdb
labels:
app: cockroachdb
spec:
ports:
- port: 26257
@@ -52,13 +60,50 @@ metadata:
name: cockroachdb
spec:
serviceName: "cockroachdb"
replicas: 5
replicas: 3
template:
metadata:
labels:
app: cockroachdb
annotations:
pod.alpha.kubernetes.io/initialized: "true"
# Init containers are run only once in the lifetime of a pod, before
# it's started up for the first time. Each init container has to exit
# successfully before the pod's main containers are allowed to start.
# This particular init container does a DNS lookup for other pods in
# the petset to help determine whether or not a cluster already exists.
# If any other pets exist, it creates a file in the cockroach-data
# directory to pass that information along to the primary container that
# has to decide what command-line flags to use when starting CockroachDB.
# This only matters when a pod's persistent volume is empty - if it has
# data from a previous execution, that data will always be used.
pod.alpha.kubernetes.io/init-containers: '[
{
"name": "bootstrap",
"image": "cockroachdb/cockroach-k8s-init:0.1",
"args": [
"-on-start=/on-start.sh",
"-service=cockroachdb"
],
"env": [
{
"name": "POD_NAMESPACE",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.namespace"
}
}
}
],
"volumeMounts": [
{
"name": "datadir",
"mountPath": "/cockroach/cockroach-data"
}
]
}
]'
spec:
containers:
- name: cockroachdb
@@ -93,27 +138,23 @@ spec:
- |
# The use of qualified `hostname -f` is crucial:
# Other nodes aren't able to look up the unqualified hostname.
CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)")
# TODO(tschottdorf): really want to use an init container to do
# the bootstrapping. The idea is that the container would know
# whether it's on the first node and could check whether there's
# already a data directory. If not, it would bootstrap the cluster.
# We will need some version of `cockroach init` back for this to
# work. For now, just do the same in a shell snippet.
# Of course this isn't without danger - if node0 loses its data,
# upon restarting it will simply bootstrap a new cluster and smack
# it into our existing cluster.
# There are likely ways out. For example, the init container could
# query the kubernetes API and see whether any other nodes are
# around, etc. Or, of course, the admin can pre-seed the lost
# volume somehow (and in that case we should provide a better way,
# for example a marker file).
CRARGS=("start" "--logtostderr" "--insecure" "--host" "$(hostname -f)" "--http-host" "0.0.0.0")
# We only want to initialize a new cluster (by omitting the join flag)
# if we're sure that we're the first node (i.e. index 0) and that
# there aren't any other nodes running as part of the cluster that
# this is supposed to be a part of (which indicates that a cluster
# already exists and we should make sure not to create a new one).
# It's fine to run without --join on a restart if there aren't any
# other nodes.
if [ ! "$(hostname)" == "cockroachdb-0" ] || \
[ -e "/cockroach/cockroach-data/COCKROACHDB_VERSION" ]
[ -e "/cockroach/cockroach-data/cluster_exists_marker" ]
then
CRARGS+=("--join" "cockroachdb")
# We don't join cockroachdb in order to avoid a node attempting
# to join itself, which currently doesn't work
# (https://github.com/cockroachdb/cockroach/issues/9625).
CRARGS+=("--join" "cockroachdb-public")
fi
/cockroach/cockroach ${CRARGS[*]}
exec /cockroach/cockroach ${CRARGS[*]}
# No pre-stop hook is required, a SIGTERM plus some time is all that's
# needed for graceful shutdown of a node.
terminationGracePeriodSeconds: 60
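The decision the init container's on-start hook has to make can be sketched in shell. This is a hypothetical illustration of the idea, not the actual on-start.sh shipped in the cockroachdb/cockroach-k8s-init image: peer-finder hands the hook the DNS names of the petset's current peers, and if any peer other than this pod exists, a marker file is dropped into the data directory so the main container knows a cluster already exists and passes `--join`.

```shell
#!/bin/sh
# Hypothetical sketch of the init container's on-start logic.
# Reads one peer DNS name per line on stdin (as peer-finder provides),
# and creates a marker file if any peer other than this pod is found.
mark_if_cluster_exists() {
  datadir="$1"
  self="$2"          # this pod's short hostname, e.g. cockroachdb-0
  found=1
  while read -r peer; do
    # Compare only the first DNS label against our own hostname.
    if [ "${peer%%.*}" != "$self" ]; then
      touch "$datadir/cluster_exists_marker"
      found=0
    fi
  done
  return $found
}
```

The main container's startup script then only has to test for the marker file, which is exactly what the `-e .../cluster_exists_marker` check in the PetSet spec above does.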
2 changes: 1 addition & 1 deletion examples/cockroachdb/minikube.sh
@@ -35,7 +35,7 @@ kubectl delete petsets,pods,persistentvolumes,persistentvolumeclaims,services -l
# claims here manually even though that sounds counter-intuitive. For details
# see https://github.com/kubernetes/contrib/pull/1295#issuecomment-230180894.
# Note that we make an extra volume here so you can manually test scale-up.
for i in $(seq 0 5); do
for i in $(seq 0 3); do
cat <<EOF | kubectl create -f -
kind: PersistentVolume
apiVersion: v1
