Kubernetes: How to effectively backup etcd

LinuxSkillShare
4 min read · Mar 1, 2021

Introduction

In order to restore a cluster successfully after a failure, we need a scheduled backup job. Kubernetes natively supports CronJobs, so we are going to use that feature for our workflow.

The workflow is divided into three stages:

  1. Build the backup tool
  2. Deploy the CronJob
  3. Build a pipeline for configuration management

Let's get started…

Build Docker Image

Create a Dockerfile with the following content. Notice the “ARG” parameter, which lets us control the version of the etcdctl tool built into the image.

FROM alpine:latest

# etcdctl version baked into the image; override at build time with --build-arg
ARG ETCD_VERSION=v3.4.13

# default etcdctl connection settings (can be overridden at runtime)
ENV ETCDCTL_ENDPOINTS "https://127.0.0.1:2379"
ENV ETCDCTL_CACERT "/etc/kubernetes/pki/etcd/ca.crt"
ENV ETCDCTL_KEY "/etc/kubernetes/pki/etcd/healthcheck-client.key"
ENV ETCDCTL_CERT "/etc/kubernetes/pki/etcd/healthcheck-client.crt"

RUN apk add --update --no-cache bash ca-certificates tzdata openssl

# download the etcd release and keep only the etcdctl binary
RUN wget https://github.com/etcd-io/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz \
&& tar xzf etcd-${ETCD_VERSION}-linux-amd64.tar.gz \
&& mv etcd-${ETCD_VERSION}-linux-amd64/etcdctl /usr/local/bin/etcdctl \
&& rm -rf etcd-${ETCD_VERSION}-linux-amd64*

ENTRYPOINT ["/bin/bash"]

Export the “ETCD_VERSION” variable, which will be used in the next step. Changing the tool version is then simply a matter of changing this variable's value.

export ETCD_VERSION=v3.4.13

The command below builds our backup image.

docker build --build-arg ETCD_VERSION=$ETCD_VERSION -t etcd-backup:$ETCD_VERSION .

Verify that the built image exists:

docker images | grep -i "etcd-backup"
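
As an optional sanity check (not strictly part of the workflow), you can run etcdctl from the freshly built image and confirm that the expected version was baked in:

# override the entrypoint so the container runs etcdctl directly
docker run --rm --entrypoint etcdctl etcd-backup:$ETCD_VERSION version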

Back Up the etcd Datastore Using a Kubernetes CronJob

Now that we have built the backup container, let's create the scheduled backup CronJob. For demonstration purposes we use the “default” namespace.

Create a file named etcd-backup-cronjob.yaml and paste in the content below. Notice the last line, where we mount the host timezone into our container.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: etcd-backup
spec:
  schedule: "*/5 * * * *"
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 5
  concurrencyPolicy: Allow
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: etcd-backup
            image: etcd-backup:v3.4.13
            env:
            - name: ETCDCTL_API
              value: "3"
            - name: ETCDCTL_ENDPOINTS
              value: "https://127.0.0.1:2379"
            - name: ETCDCTL_CACERT
              value: "/etc/kubernetes/pki/etcd/ca.crt"
            - name: ETCDCTL_CERT
              value: "/etc/kubernetes/pki/etcd/healthcheck-client.crt"
            - name: ETCDCTL_KEY
              value: "/etc/kubernetes/pki/etcd/healthcheck-client.key"
            command: ["/bin/bash", "-c"]
            args: ["etcdctl snapshot save /data/etcd-backup/etcd-snapshot-$(date +%Y-%m-%dT%H:%M).db"]
            volumeMounts:
            - mountPath: /etc/kubernetes/pki/etcd
              name: etcd-certs
              readOnly: true
            - mountPath: /data/etcd-backup
              name: etcd-backup
            - mountPath: /etc/localtime
              name: local-timezone
          restartPolicy: OnFailure
          hostNetwork: true
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
          tolerations:
          - key: node-role.kubernetes.io/master
            effect: NoSchedule
            operator: Exists
          - key: node.kubernetes.io/memory-pressure
            effect: NoSchedule
            operator: Exists
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
              type: Directory
          - name: etcd-backup
            hostPath:
              path: /data/etcd-backup
              type: DirectoryOrCreate
          - name: local-timezone
            hostPath:
              path: /usr/share/zoneinfo/Europe/Tallinn # mount host timezone into the container

Deploy the CronJob to the cluster.

kubectl apply -f etcd-backup-cronjob.yaml

Verify that the CronJob exists in the cluster with the command below.

kubectl get cronjobs

The output should look something like this:

NAME          SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
etcd-backup   */5 * * * *   False     0        <none>          4s

We scheduled our backup job to run every 5 minutes. Once 5 minutes have passed, we can see that our first scheduled job has completed successfully. To list completed jobs we can use the following command: “kubectl get jobs”.

NAME                     COMPLETIONS   DURATION   AGE
etcd-backup-1614619800   1/1           2s         47s
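
To see what a finished job actually did, you can also read the logs of its pod via the job-name label that Kubernetes adds automatically (the job name here is taken from the output above):

kubectl logs -l job-name=etcd-backup-1614619800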

On the host machine we can see that the backup file was successfully created in the /data/etcd-backup directory.

ls -l /data/etcd-backup/
total 30224
-rw------- 1 root root 30945312 märts 1 19:30 etcd-snapshot-2021-03-01T19:30.db
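
A backup is only worth keeping if it can be read back. As an optional extra step, you can check the snapshot's integrity with etcdctl snapshot status, for example by reusing the backup image; the file name below is the one from the listing above:

docker run --rm -v /data/etcd-backup:/data/etcd-backup --entrypoint etcdctl etcd-backup:v3.4.13 snapshot status /data/etcd-backup/etcd-snapshot-2021-03-01T19:30.db -w table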

Congratulations, we have successfully scheduled the etcd backup!

Manage the Cronjob Through GitLab CI/CD

Now that we have successfully created the etcd backup, it's time to create a GitLab workflow for configuration management.

Requirements:

  1. GitLab up and running
  2. GitLab Runner (Docker executor)

In GitLab, create a new group and name it Kubernetes. Under the Kubernetes group, create a project named etcd-backup.

Create a KUBECONFIG variable and make sure its type is “File”. The reason for adding the variable at group level is that future projects can automatically access the cluster API.

Add a .gitlab-ci.yml file to the project. Some lines may differ depending on your GitLab Runner setup.
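
A minimal sketch of such a pipeline might look like the one below. The job images, the docker-in-docker service, and the envsubst step are assumptions, not fixed requirements; adapt them to your runner and registry. kubectl picks up the group-level KUBECONFIG file variable automatically, because GitLab exposes its path in the KUBECONFIG environment variable.

stages:
  - build
  - deploy

variables:
  ETCD_VERSION: "v3.4.13"

build-image:
  stage: build
  image: docker:20.10                 # assumption: the runner allows docker-in-docker builds
  services:
    - docker:20.10-dind
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build --build-arg ETCD_VERSION=$ETCD_VERSION -t $CI_REGISTRY_IMAGE:$ETCD_VERSION .
    - docker push $CI_REGISTRY_IMAGE:$ETCD_VERSION

deploy-cronjob:
  stage: deploy
  image:
    name: bitnami/kubectl:latest      # assumption: any image providing kubectl and envsubst works
    entrypoint: [""]
  script:
    # substitute $ETCD_CRONJOB_TIME, $CI_REGISTRY_IMAGE and $ETCD_VERSION in the manifest
    - envsubst < etcd-backup-cronjob.yaml | kubectl apply -f -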

Create the etcd-backup-cronjob.yaml file in the project. Its content is almost the same as in the previously created file; only two lines are modified:

schedule: "$ETCD_CRONJOB_TIME"

image: $CI_REGISTRY_IMAGE:$ETCD_VERSION
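
The schedule value is kept quoted so that the expanded cron expression remains a valid YAML string. Before committing, you can render the manifest locally to confirm the substitution works; the variable values below, including the registry path, are only illustrative, and envsubst (from the gettext package) is assumed to be installed:

export ETCD_CRONJOB_TIME="*/5 * * * *"
export CI_REGISTRY_IMAGE=registry.example.com/kubernetes/etcd-backup   # hypothetical registry path
export ETCD_VERSION=v3.4.13
envsubst < etcd-backup-cronjob.yaml | kubectl apply --dry-run=client -f -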

Congratulations, we have successfully created a pipeline for configuration management!

For more complex and advanced pipelines, please refer to https://docs.gitlab.com/13.7/ee/ci/yaml/

Cheers :)
