Kubernetes: How to effectively backup etcd
Introduction
In order to restore a cluster successfully after a failure, we create a scheduled backup job. Kubernetes natively supports CronJobs, so we are going to use that feature for our workflow.
The workflow is divided into three stages:
- build the backup tool
- deploy the cronjob
- build a pipeline for configuration management
Let's get started…
Build Docker Image
Create a Dockerfile with the following content. Notice the "ARG" parameter, which lets us control which version of the etcdctl tool is built into the image.
FROM alpine:latest
ARG ETCD_VERSION=v3.4.13
ENV ETCDCTL_ENDPOINTS "https://127.0.0.1:2379"
ENV ETCDCTL_CACERT "/etc/kubernetes/pki/etcd/ca.crt"
ENV ETCDCTL_KEY "/etc/kubernetes/pki/etcd/healthcheck-client.key"
ENV ETCDCTL_CERT "/etc/kubernetes/pki/etcd/healthcheck-client.crt"
RUN apk add --update --no-cache bash ca-certificates tzdata openssl
RUN wget https://github.com/etcd-io/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz \
&& tar xzf etcd-${ETCD_VERSION}-linux-amd64.tar.gz \
&& mv etcd-${ETCD_VERSION}-linux-amd64/etcdctl /usr/local/bin/etcdctl \
&& rm -rf etcd-${ETCD_VERSION}-linux-amd64*
ENTRYPOINT ["/bin/bash"]
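For reference, the ETCDCTL_* variables baked into the image are just the environment-variable form of etcdctl's connection flags, so the backup command used later needs no flags at all. Assuming the same certificate paths, the explicit equivalent would be something like this (the snapshot path here is only illustrative):
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  snapshot save /data/etcd-backup/etcd-snapshot.db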
Export the "ETCD_VERSION" variable, which will be used in the next step. Changing the tool version is then simply a matter of changing the variable value.
export ETCD_VERSION=v3.4.13
The command below builds our backup image.
docker build --build-arg ETCD_VERSION=$ETCD_VERSION -t etcd-backup:$ETCD_VERSION .
Verify that the built image exists:
docker images | grep -i "etcd-backup"
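As an optional sanity check, you can also run etcdctl from the freshly built image and confirm the client version (overriding the image's bash entrypoint):
docker run --rm --entrypoint etcdctl etcd-backup:$ETCD_VERSION version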
Backup Etcd Datastore Using Kubernetes Cronjob
Now that we have built the backup container, let's create the scheduled backup CronJob. For demonstration purposes we use the "default" namespace.
Create a file named etcd-backup-cronjob.yaml and paste in the content below. Notice the last line, where we mount the host timezone into our container. The manifest uses the batch/v1beta1 API, which was current at the time of writing; on Kubernetes 1.21 and later use apiVersion: batch/v1 instead.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: etcd-backup
spec:
  schedule: "*/5 * * * *"
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 5
  concurrencyPolicy: Allow
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: etcd-backup
            image: etcd-backup:v3.4.13
            env:
            - name: ETCDCTL_API
              value: "3"
            - name: ETCDCTL_ENDPOINTS
              value: "https://127.0.0.1:2379"
            - name: ETCDCTL_CACERT
              value: "/etc/kubernetes/pki/etcd/ca.crt"
            - name: ETCDCTL_CERT
              value: "/etc/kubernetes/pki/etcd/healthcheck-client.crt"
            - name: ETCDCTL_KEY
              value: "/etc/kubernetes/pki/etcd/healthcheck-client.key"
            command: ["/bin/bash","-c"]
            args: ["etcdctl snapshot save /data/etcd-backup/etcd-snapshot-$(date +%Y-%m-%dT%H:%M).db"]
            volumeMounts:
            - mountPath: /etc/kubernetes/pki/etcd
              name: etcd-certs
              readOnly: true
            - mountPath: /data/etcd-backup
              name: etcd-backup
            - mountPath: /etc/localtime
              name: local-timezone
          restartPolicy: OnFailure
          hostNetwork: true
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
          tolerations:
          - key: node-role.kubernetes.io/master
            effect: NoSchedule
            operator: Exists
          - key: node.kubernetes.io/memory-pressure
            effect: NoSchedule
            operator: Exists
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
              type: Directory
          - name: etcd-backup
            hostPath:
              path: /data/etcd-backup
              type: DirectoryOrCreate
          - name: local-timezone
            hostPath:
              path: /usr/share/zoneinfo/Europe/Tallinn # mount host timezone to container
Deploy the CronJob to the cluster.
kubectl apply -f etcd-backup-cronjob.yaml
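If the manifest is accepted, kubectl confirms the creation with output along the lines of:
cronjob.batch/etcd-backup created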
Verify that the CronJob exists in the cluster with the command below.
kubectl get cronjobs
The output should look something like this:
NAME          SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
etcd-backup   */5 * * * *   False     0        <none>          4s
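If you do not want to wait for the first scheduled run, you can optionally trigger a one-off job from the CronJob (the job name etcd-backup-manual is just an example):
kubectl create job etcd-backup-manual --from=cronjob/etcd-backup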
We scheduled our backup job to run every 5 minutes. After 5 minutes have passed, we can see that our first scheduled job has completed successfully. To list completed jobs, use "kubectl get jobs".
NAME                     COMPLETIONS   DURATION   AGE
etcd-backup-1614619800   1/1           2s         47s
On the host machine we can see that our backup file has been successfully created in the /data/etcd-backup directory.
ls -l /data/etcd-backup/
total 30224
-rw------- 1 root root 30945312 märts 1 19:30 etcd-snapshot-2021-03-01T19:30.db
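Optionally, the integrity of a snapshot can be checked with etcdctl's snapshot status subcommand, run from the backup image or any other etcdctl v3 binary:
etcdctl snapshot status /data/etcd-backup/etcd-snapshot-2021-03-01T19:30.db --write-out=table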
Congratulations, we have successfully scheduled the etcd backup!
Manage Cronjob Through GitLab CI/CD
Now that we have successfully created the etcd backup, it's time to create a GitLab workflow for configuration management.
Requirements:
- GitLab up and running
- GitLab Runner (Docker executor)
In GitLab, create a new group and name it Kubernetes. Under the Kubernetes group, create a project named etcd-backup.
Create a KUBECONFIG CI/CD variable at the group level and make sure the variable type is "File". The reason for adding the variable under the group is that future projects in the group can automatically access the cluster API.
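Because GitLab writes a "File" type variable to disk and puts its path into the KUBECONFIG environment variable, kubectl inside a pipeline job picks the kubeconfig up automatically. A quick way to verify cluster access from a job is, for example:
kubectl get nodes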
Add the pipeline definition to the project's .gitlab-ci.yml file. Some lines may differ, depending on your gitlab-runner setup.
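As a rough orientation, a minimal .gitlab-ci.yml for this workflow could look like the sketch below. It assumes a privileged Docker executor runner (for Docker-in-Docker builds), the project container registry, and the group-level KUBECONFIG file variable created above; the job names, image tags and the envsubst-based deploy step are assumptions, not the original pipeline.
stages:
  - build
  - deploy

variables:
  ETCD_VERSION: "v3.4.13"
  ETCD_CRONJOB_TIME: "*/5 * * * *"

build-backup-image:
  stage: build
  image: docker:20.10
  services:
    - docker:20.10-dind
  script:
    # build the backup image and push it to the project registry
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build --build-arg ETCD_VERSION=$ETCD_VERSION -t $CI_REGISTRY_IMAGE:$ETCD_VERSION .
    - docker push $CI_REGISTRY_IMAGE:$ETCD_VERSION

deploy-backup-cronjob:
  stage: deploy
  image: alpine:3.13
  before_script:
    # install envsubst (gettext) and kubectl (version here is only an example);
    # kubectl reads the KUBECONFIG file variable automatically
    - apk add --no-cache curl gettext
    - curl -LO https://dl.k8s.io/release/v1.20.4/bin/linux/amd64/kubectl
    - chmod +x kubectl && mv kubectl /usr/local/bin/
  script:
    # substitute the placeholders in the manifest, then apply it
    - envsubst '$ETCD_CRONJOB_TIME $CI_REGISTRY_IMAGE $ETCD_VERSION' < etcd-backup-cronjob.yaml | kubectl apply -f -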
Create the etcd-backup-cronjob.yaml file. Its content is almost the same as the previously created file; only two lines are modified (the schedule value is quoted so that the substituted cron expression remains valid YAML):
schedule: "$ETCD_CRONJOB_TIME"
image: $CI_REGISTRY_IMAGE:$ETCD_VERSION
Congratulations, we have successfully created a pipeline for configuration management!
For more complex and advanced pipelines, please refer to https://docs.gitlab.com/13.7/ee/ci/yaml/
Cheers :)