This article is a step-by-step guide to protecting stateful applications running on Kubernetes, and their data, using Kubera Director's DMaaS feature. We will be focusing on applications that use RWX (ReadWriteMany access mode) persistent volumes.
We are using an NFS server to provision an RWX volume. This volume is consumed by a WordPress deployment in a 3-replica HA setup. The NFS server is backed by an OpenEBS volume; however, this guide works with any Kubernetes persistent volume -- OpenEBS or otherwise.
Click here to read more about configuring an NFS server provisioner using OpenEBS.
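For reference, the RWX claim shared by the WordPress replicas looks roughly like the sketch below. It is reconstructed from the kubectl output shown later in this guide, with 'wp-nfs-sc' being the StorageClass created by the NFS provisioner:
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-pvc
  namespace: wordpress
spec:
  accessModes:
    - ReadWriteMany          # RWX: all three WordPress replicas mount the same volume
  storageClassName: wp-nfs-sc
  resources:
    requests:
      storage: 8Gi
EOF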
DMaaS (Data Migration as a Service) is used to create backups of stateful (and stateless) Kubernetes applications. These backups can be conveniently restored into a remote Kubernetes cluster. DMaaS is a part of Kubera Director. To learn more about DMaaS, click here.
Prerequisites
You will need a Kubera Standard or Kubera Enterprise subscription plan to use DMaaS. To know more about different Kubera subscription plans, click here.
If you already have a Kubera account, you can check for the Kubera subscription plan you are using at director.mayadata.io/subscriptions.
DMaaS backup and restore requires a 'source cluster' and a 'destination cluster'; the same Kubernetes cluster may act as both.
DMaaS backup and restore involves 2 major operations:
- Creating a backup schedule: You will create a backup schedule for the application. You will have to decide on the frequency of backup operations, the object store you want to use for the backup files (AWS S3/GCP Cloud Storage/MinIO/Cloudian HyperStore), and the number of full backups you want to maintain.
- Restoring from a backup schedule: You will restore a 'Completed' backup linked to your previously-created backup schedule from your object store.
Creating DMaaS schedules
STEP 1
Sign in to Kubera and connect your cluster at director.mayadata.io.
Note:
- If you are behind a proxy server, you can configure Kubera Director for it using the 'Advanced' option while adding a cluster. Click here for detailed instructions.
- If you are using Red Hat OpenShift Container Platform or OpenShift Enterprise, you might have to allow the restic DaemonSet to run privileged, using the command:
oc patch ds/restic --namespace maya-system --type json -p '[{"op":"add","path":"/spec/template/spec/containers/0/securityContext","value": { "privileged": true}}]'
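You can verify that the patch has taken effect (the DaemonSet name and namespace are the ones used in the patch command above):
oc -n maya-system get ds restic
oc -n maya-system get ds restic -o jsonpath='{.spec.template.spec.containers[0].securityContext}'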
STEP 2
Select your cluster under the 'Clusters' option in your Project pane.
STEP 3
Click on the 'Applications' option in your Cluster pane.
STEP 4
Click on the name of the application that you want to create a backup schedule for. In our case, we will go with the 'wordpress-mariadb-master' StatefulSet. We will create schedules for the other Deployments/StatefulSets in the steps below.
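If you prefer the CLI, the same workloads can be listed on the source cluster (assuming the application lives in the 'wordpress' namespace, as in our setup):
kubectl -n wordpress get deployment,statefulset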
STEP 5
Select the DMaaS tab and click on 'New schedule'.
STEP 6
In this step, you will create the backup schedule.
- Select your object store of choice -- AWS S3 / GCP Cloud Storage / MinIO / Cloudian HyperStore
- Select the credentials for the object store from the 'Provider credentials' drop-down list.
- Set the backup frequency from the drop-down lists under 'Time Interval'.
- In the 'Backup retention count' box, enter the number of full backups you would like to maintain (the backups in between are incremental).
- Click on 'Schedule now'.
In our case, we have selected AWS S3 object store, a half-hour backup frequency, and have chosen to maintain 5 full backups.
Note: If you don't have credentials listed in the drop-down, you'll have to add them. Click on 'Add Credentials' to add the credentials for your data store (e.g. add the Access Key ID and the Secret Access Key of your AWS account, if you want to use your S3 bucket as the data store for your DMaaS backups).
STEP 7
Follow steps 3 through 6 to create backup schedules for your remaining applications/micro-services. In our case, we have created schedules for the 'wordpress-mariadb-master' StatefulSet, the 'wordpress-mariadb-slave' StatefulSet, and the 'wordpress' Deployment.
You can check the status of the backup schedules by clicking on the 'DMaaS' option in the Project pane.
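Optionally, you can also confirm that backup data is landing in your object store. With AWS S3, for example, you could list the bucket contents (the bucket name below is a placeholder; use the bucket you configured for the schedule):
aws s3 ls s3://<your-dmaas-bucket>/ --recursive --human-readable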
Restoring from DMaaS schedules
Before continuing with the restoration, please make sure that your destination cluster satisfies the requirements in the 'Prerequisites for destination cluster' section.
STEP 1
Connect the destination cluster to Kubera Director at director.mayadata.io.
STEP 2
Click on the 'DMaaS' option in the Project pane.
STEP 3
Click on the restore option next to the name of the schedule that you want to restore from. In our case, we will go with the 'wordpress-mariadb-master' schedule first.
STEP 4
Select the name of the destination cluster from the drop-down list and click on 'Start restore'. We have selected 'cluster-2-tmva8'.
STEP 5
Click on the 'Restore' link in the pop-up to check the status of the ongoing restore.
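While the restore is in progress, you can also watch the application objects appear on the destination cluster from the CLI (assuming, as in our case, that the restore recreates the 'wordpress' namespace):
kubectl -n wordpress get pods -w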
STEP 6
Follow steps 2 through 5 again to restore the other applications/micro-services. After restoring from the schedule for 'wordpress-mariadb-master', we have restored from the schedules for 'wordpress-mariadb-slave' and 'wordpress'.
Conclusion
We can go to the 'Applications' section in cluster-2 and see the WordPress Deployment and the MariaDB StatefulSets.
We can verify this through the CLI as well.
niladri@cluster-2-master:~$ kubectl -n wordpress get deployment,statefulset,pod,service,secret,ingress,persistentvolumeclaim -o wide
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                            SELECTOR
deployment.apps/wordpress   3/3     3            3           8h    wordpress    docker.io/bitnami/wordpress:5.4.2-debian-10-r26   app.kubernetes.io/instance=wordpress,app.kubernetes.io/name=wordpress

NAME                                        READY   AGE   CONTAINERS   IMAGES
statefulset.apps/wordpress-mariadb-master   1/1     8h    mariadb      docker.io/bitnami/mariadb:10.3.23-debian-10-r44
statefulset.apps/wordpress-mariadb-slave    2/2     8h    mariadb      docker.io/bitnami/mariadb:10.3.23-debian-10-r44

NAME                             READY   STATUS    RESTARTS   AGE   IP             NODE               NOMINATED NODE   READINESS GATES
pod/wordpress-5f75df99f-d2p76    1/1     Running   3          8h    192.168.1.54   cluster-2-node-1   <none>           <none>
pod/wordpress-5f75df99f-s9zf9    1/1     Running   6          8h    192.168.2.59   cluster-2-node-2   <none>           <none>
pod/wordpress-5f75df99f-v5f2s    1/1     Running   7          8h    192.168.3.59   cluster-2-node-3   <none>           <none>
pod/wordpress-mariadb-master-0   1/1     Running   0          8h    192.168.2.69   cluster-2-node-2   <none>           <none>
pod/wordpress-mariadb-slave-0    1/1     Running   0          8h    192.168.1.69   cluster-2-node-1   <none>           <none>
pod/wordpress-mariadb-slave-1    1/1     Running   0          8h    192.168.3.68   cluster-2-node-3   <none>           <none>

NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE   SELECTOR
service/wordpress                 ClusterIP   10.96.72.225     <none>        80/TCP,443/TCP   8h    app.kubernetes.io/instance=wordpress,app.kubernetes.io/name=wordpress
service/wordpress-mariadb         ClusterIP   10.107.188.170   <none>        3306/TCP         8h    app=mariadb,component=master,release=wordpress
service/wordpress-mariadb-slave   ClusterIP   10.111.245.140   <none>        3306/TCP         8h    app=mariadb,component=slave,release=wordpress

NAME                               TYPE                                  DATA   AGE
secret/default-token-vx9j9         kubernetes.io/service-account-token   3      4d14h
secret/wordpress                   Opaque                                1      8h
secret/wordpress-mariadb           Opaque                                3      8h
secret/wordpress.mayalabs.io-tls   kubernetes.io/tls                     2      8h

NAME                            CLASS    HOSTS                   ADDRESS                               PORTS     AGE
ingress.extensions/wordpress    <none>   wordpress.mayalabs.io   10.63.10.51,10.63.10.52,10.63.10.53   80, 443   8h

NAME                                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE   VOLUMEMODE
persistentvolumeclaim/data-wordpress-mariadb-master-0    Bound    pvc-c312202f-e75d-4810-b84f-d3b2f8f93fca   8Gi        RWO            openebs-hostpath   8h    Filesystem
persistentvolumeclaim/data-wordpress-mariadb-slave-0     Bound    pvc-52f80751-6b39-4f36-8820-ecb8f6538e2a   8Gi        RWO            openebs-hostpath   8h    Filesystem
persistentvolumeclaim/data-wordpress-mariadb-slave-1     Bound    pvc-aacb68cd-a10f-433d-af1c-0101bdb39e7a   8Gi        RWO            openebs-hostpath   8h    Filesystem
persistentvolumeclaim/wordpress-pvc                      Bound    pvc-68985f30-4d43-4dce-90cc-8555886f9f3e   8Gi        RWX            wp-nfs-sc          8h    Filesystem
Also, we can see that the MinIO application in the source cluster's 'bucket' namespace is absent from the destination cluster, since we did not create a backup schedule for it.
niladri@cluster-2-master:~$ kubectl -n bucket get pods
No resources found in bucket namespace.
niladri@cluster-2-master:~$ kubectl get ns
NAME              STATUS   AGE
default           Active   9d
ingress-nginx     Active   5d1h
kube-node-lease   Active   9d
kube-public       Active   9d
kube-system       Active   9d
maya-system       Active   9h
metallb-system    Active   9d
nfs               Active   5d1h
openebs           Active   9d
wordpress         Active   4d14h
Hosting both separately
The two WordPress services are exposed using the same domain name.
Source cluster:
niladri@cluster-1-master:~$ kubectl get ingress --all-namespaces
NAMESPACE   NAME        CLASS    HOSTS                   ADDRESS                               PORTS     AGE
bucket      minio       <none>   minio.mayalabs.io       10.63.10.55,10.63.10.56,10.63.10.57   80, 443   4d12h
wordpress   wordpress   <none>   wordpress.mayalabs.io   10.63.10.55,10.63.10.56,10.63.10.57   80, 443   5d2h
niladri@cluster-1-master:~$ kubectl get service -n ingress-nginx
NAME                                               TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-ingress-controller                   LoadBalancer   10.109.21.135   10.63.10.58   80:30324/TCP,443:30714/TCP   6d
ingress-nginx-ingress-controller-default-backend   ClusterIP      10.111.57.166   <none>        80/TCP                       6d
niladri@cluster-1-master:~$ kubectl get secret -n wordpress wordpress.mayalabs.io-tls
NAME                        TYPE                DATA   AGE
wordpress.mayalabs.io-tls   kubernetes.io/tls   2      5d2h
Destination cluster:
niladri@cluster-2-master:~$ kubectl get ingress --all-namespaces
NAMESPACE   NAME        CLASS    HOSTS                   ADDRESS                               PORTS     AGE
wordpress   wordpress   <none>   wordpress.mayalabs.io   10.63.10.51,10.63.10.52,10.63.10.53   80, 443   9h
niladri@cluster-2-master:~$ kubectl get service -n ingress-nginx
NAME                                               TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-ingress-controller                   LoadBalancer   10.107.127.98   10.63.10.59   80:31248/TCP,443:32178/TCP   5d1h
ingress-nginx-ingress-controller-default-backend   ClusterIP      10.103.103.84   <none>        80/TCP                       5d1h
niladri@cluster-2-master:~$ kubectl get secret -n wordpress wordpress.mayalabs.io-tls
NAME                        TYPE                DATA   AGE
wordpress.mayalabs.io-tls   kubernetes.io/tls   2      10h
If the services are hosted in isolated networks (or they use different DNS), then the WordPress webpages in the two clusters will be accessible separately within their separate networks. No changes are required for this use-case.
If both the services are in the same network (or use the same DNS), you will have to assign a unique domain name to one of them for them to be separately reachable. In our use-case, both of our clusters are inside the same subnet and we are using the same DNS.
To resolve this, we will use a different domain name (https://wordpress1.mayalabs.io) for the destination cluster's service and add a DNS entry for this new domain name to our nameserver. To change the domain name, we will edit the manifest YAML of the 'wordpress' ingress on the destination cluster.
kubectl -n wordpress edit ingress wordpress
Initial:
spec:
  rules:
  - host: wordpress.mayalabs.io
    http:
      paths:
      - backend:
          serviceName: wordpress
          servicePort: http
        path: /
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - wordpress.mayalabs.io
    secretName: wordpress.mayalabs.io-tls
Final:
spec:
  rules:
  - host: wordpress1.mayalabs.io
    http:
      paths:
      - backend:
          serviceName: wordpress
          servicePort: http
        path: /
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - wordpress1.mayalabs.io
    secretName: wordpress1.mayalabs.io-tls
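If you would rather avoid the interactive editor, the same change can be applied non-interactively with a JSON patch; the paths below follow the manifest structure shown above:
kubectl -n wordpress patch ingress wordpress --type json -p '[
  {"op":"replace","path":"/spec/rules/0/host","value":"wordpress1.mayalabs.io"},
  {"op":"replace","path":"/spec/tls/0/hosts/0","value":"wordpress1.mayalabs.io"},
  {"op":"replace","path":"/spec/tls/0/secretName","value":"wordpress1.mayalabs.io-tls"}
]'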
We are using TLS encryption for the webpage. Our present TLS certificate was generated for the previous domain name (wordpress.mayalabs.io). So, we will delete the previous TLS secret...
kubectl -n wordpress delete secret wordpress.mayalabs.io-tls
... and create the new one using the new certificate (wordpress1.crt) and private key (wordpress1.key).
kubectl create secret tls wordpress1.mayalabs.io-tls --cert=wordpress1.crt --key=wordpress1.key --namespace=wordpress
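In case you don't have a certificate for the new domain name yet, a self-signed pair is enough for testing (a production setup would use a CA-issued certificate):
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout wordpress1.key -out wordpress1.crt \
  -subj "/CN=wordpress1.mayalabs.io"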
After this, we can verify that the service is available at https://wordpress1.mayalabs.io (destination cluster).
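If you want to test before the new DNS entry propagates, curl's --resolve flag lets you pin the domain name to one of the destination ingress addresses from the output above (-k skips certificate verification, which is needed with a self-signed certificate):
curl -k --resolve wordpress1.mayalabs.io:443:10.63.10.51 https://wordpress1.mayalabs.io/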
Let's add a post to the source cluster webpage...
As you can see, the service on the destination cluster is unaffected!