This article is a step-by-step guide to protecting stateful applications running on Kubernetes, along with their data. We will focus on applications that use RWX (ReadWriteMany access mode) persistent volumes, protected using Kubera Director's DMaaS feature.
We are using an NFS server to provision an RWX volume. This volume is consumed by a WordPress deployment in a 3-replica HA setup. The NFS server is backed by an OpenEBS volume; however, this guide works with any Kubernetes persistent volume -- OpenEBS or otherwise.
Click here to read more about configuring an NFS server provisioner using OpenEBS.
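For reference, the RWX claim that all three WordPress replicas mount looks roughly like the sketch below. The claim name (wordpress-pvc) and StorageClass (wp-nfs-sc) match the ones you will see in the restored cluster later in this guide; your names and sizes may differ.

# PersistentVolumeClaim shared by all WordPress replicas.
# ReadWriteMany (RWX) lets pods on different nodes mount the same volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-pvc
  namespace: wordpress
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: wp-nfs-sc   # StorageClass served by the NFS provisioner
  resources:
    requests:
      storage: 8Gi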
DMaaS (Data Migration as a Service) is used to create backups of stateful (and stateless) Kubernetes applications. These backups can be conveniently restored into a remote Kubernetes cluster. DMaaS is a part of Kubera Director. To learn more about DMaaS, click here.
You will need a Kubera Standard or Kubera Enterprise subscription plan to use DMaaS. To know more about different Kubera subscription plans, click here.
If you already have a Kubera account, you can check for the Kubera subscription plan you are using at director.mayadata.io/subscriptions.
DMaaS backup and restore requires a 'source cluster' and a 'destination cluster'. The same Kubernetes cluster may act as both the source and the destination.
DMaaS backup and restore involves 2 major operations:
- Creating a backup schedule:
You will create a backup schedule for the application. You will have to decide on the frequency of backup operations, the object store you want to use for the backup files (AWS S3/GCP Cloud Storage/MinIO/Cloudian HyperStore), and the number of full backups you want to maintain.
- Restoring from a backup schedule:
You will restore a 'Completed' backup linked to your previously-created backup schedule from your object store.
Creating DMaaS schedules
Sign in to Kubera and connect your cluster at director.mayadata.io.
- If you are behind a proxy server, you can configure Kubera Director to use it via the 'Advanced' option when adding a cluster. Click here for detailed instructions.
- If you are using Red Hat OpenShift Container Platform or OpenShift Enterprise, you might have to add the cstorpoolauto service account to the 'privileged' SCC. Use the command
oc adm policy add-scc-to-user privileged system:serviceaccount:maya-system:cstorpoolauto.
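To confirm the service account landed in the SCC's user list, you can inspect it (a quick check; output format depends on your OpenShift version):

# The 'users' field of the SCC should now include the cstorpoolauto service account
oc get scc privileged -o jsonpath='{.users}'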
Select your cluster from the 'Clusters' option from your Project pane.
Click on the 'Applications' option in your Cluster pane.
Click on the name of the application that you want to create a backup schedule for. In our case, we will go with the 'wordpress-mariadb-master' StatefulSet. We will create schedules for the other deployments/statefulsets in the steps below.
Select the DMaaS tab and click on 'New schedule'.
In this step, you will create the backup schedule.
- Select your object store of choice -- AWS S3 / GCP Cloud Storage / MinIO / Cloudian HyperStore
- Select the credentials for the object store from the 'Provider credentials' drop-down list.
- Set the backup frequency from the drop-down lists under 'Time Interval'.
- In the 'Backup retention count' box, enter the number of full backups (the remaining backups will be incremental) you would like to maintain.
- Click on 'Schedule now'.
In our case, we have selected AWS S3 object store, a half-hour backup frequency, and have chosen to maintain 5 full backups.
Note: If you don't have credentials listed in the drop-down, you'll have to add them. Click on 'Add Credentials' to add the credentials for your object store (e.g., add the Access Key ID and the Secret Access Key of your AWS account if you want to use an S3 bucket as the destination for your DMaaS backups).
Follow steps 3 through 6 to create backup schedules for your remaining applications/micro-services. In our case, we have created schedules for 'wordpress-mariadb-master' StatefulSet, 'wordpress-mariadb-slave' StatefulSet and 'wordpress' Deployment.
You can check the status of the backup schedules by clicking on the 'DMaaS' option in the Project pane.
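If you prefer the CLI, DMaaS is built on top of Velero under the hood, so you can also watch the backup objects directly on the source cluster. A minimal sketch, assuming the Velero CRDs installed by Kubera are present (resource names and namespaces may differ in your setup):

# List the Velero backup objects created by the DMaaS schedules and check their phase
kubectl get backups.velero.io --all-namespaces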
Restoring from DMaaS schedules
Before continuing with the restoration, please make sure that your destination cluster satisfies the requirements in the 'Prerequisites for destination cluster' section.
Connect the destination cluster to Kubera Director at director.mayadata.io.
Click on the 'DMaaS' option in the Project pane.
Click on the restore option next to the name of the schedule which you want to restore. In our case, we will go with the 'wordpress-mariadb-master' schedule first.
Select the name of the destination cluster from the drop-down list and click on 'Start restore'. We have selected 'cluster-2-tmva8'.
Click on the 'Restore' link on the pop-up to check the status of the ongoing restore.
Follow steps 2 through 5 again to restore the other applications/micro-services. After restoring from the schedule for 'wordpress-mariadb-master', we have restored from the schedules for 'wordpress-mariadb-slave' and 'wordpress'.
We can go to the 'Applications' section in cluster-2 and see the WordPress Deployment and the MariaDB StatefulSets.
We can verify this through the CLI as well.
niladri@cluster-2-master:~$ kubectl -n wordpress get deployment,statefulset,pod,service,secret,ingress,persistentvolumeclaim -o wide
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                            SELECTOR
deployment.apps/wordpress   3/3     3            3           8h    wordpress    docker.io/bitnami/wordpress:5.4.2-debian-10-r26   app.kubernetes.io/instance=wordpress,app.kubernetes.io/name=wordpress

NAME                                        READY   AGE   CONTAINERS   IMAGES
statefulset.apps/wordpress-mariadb-master   1/1     8h    mariadb      docker.io/bitnami/mariadb:10.3.23-debian-10-r44
statefulset.apps/wordpress-mariadb-slave    2/2     8h    mariadb      docker.io/bitnami/mariadb:10.3.23-debian-10-r44

NAME                             READY   STATUS    RESTARTS   AGE   IP             NODE               NOMINATED NODE   READINESS GATES
pod/wordpress-5f75df99f-d2p76    1/1     Running   3          8h    192.168.1.54   cluster-2-node-1   <none>           <none>
pod/wordpress-5f75df99f-s9zf9    1/1     Running   6          8h    192.168.2.59   cluster-2-node-2   <none>           <none>
pod/wordpress-5f75df99f-v5f2s    1/1     Running   7          8h    192.168.3.59   cluster-2-node-3   <none>           <none>
pod/wordpress-mariadb-master-0   1/1     Running   0          8h    192.168.2.69   cluster-2-node-2   <none>           <none>
pod/wordpress-mariadb-slave-0    1/1     Running   0          8h    192.168.1.69   cluster-2-node-1   <none>           <none>
pod/wordpress-mariadb-slave-1    1/1     Running   0          8h    192.168.3.68   cluster-2-node-3   <none>           <none>

NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE   SELECTOR
service/wordpress                 ClusterIP   10.96.72.225     <none>        80/TCP,443/TCP   8h    app.kubernetes.io/instance=wordpress,app.kubernetes.io/name=wordpress
service/wordpress-mariadb         ClusterIP   10.107.188.170   <none>        3306/TCP         8h    app=mariadb,component=master,release=wordpress
service/wordpress-mariadb-slave   ClusterIP   10.111.245.140   <none>        3306/TCP         8h    app=mariadb,component=slave,release=wordpress

NAME                               TYPE                                  DATA   AGE
secret/default-token-vx9j9         kubernetes.io/service-account-token   3      4d14h
secret/wordpress                   Opaque                                1      8h
secret/wordpress-mariadb           Opaque                                3      8h
secret/wordpress.mayalabs.io-tls   kubernetes.io/tls                     2      8h

NAME                            CLASS    HOSTS                   ADDRESS                               PORTS     AGE
ingress.extensions/wordpress    <none>   wordpress.mayalabs.io   10.63.10.51,10.63.10.52,10.63.10.53   80, 443   8h

NAME                                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE   VOLUMEMODE
persistentvolumeclaim/data-wordpress-mariadb-master-0   Bound    pvc-c312202f-e75d-4810-b84f-d3b2f8f93fca   8Gi        RWO            openebs-hostpath   8h    Filesystem
persistentvolumeclaim/data-wordpress-mariadb-slave-0    Bound    pvc-52f80751-6b39-4f36-8820-ecb8f6538e2a   8Gi        RWO            openebs-hostpath   8h    Filesystem
persistentvolumeclaim/data-wordpress-mariadb-slave-1    Bound    pvc-aacb68cd-a10f-433d-af1c-0101bdb39e7a   8Gi        RWO            openebs-hostpath   8h    Filesystem
persistentvolumeclaim/wordpress-pvc                     Bound    pvc-68985f30-4d43-4dce-90cc-8555886f9f3e   8Gi        RWX            wp-nfs-sc          8h    Filesystem
Also, we can see that the MinIO application in the source cluster's 'bucket' namespace is absent from the destination cluster.
niladri@cluster-2-master:~$ kubectl -n bucket get pods
No resources found in bucket namespace.

niladri@cluster-2-master:~$ kubectl get ns
NAME              STATUS   AGE
default           Active   9d
ingress-nginx     Active   5d1h
kube-node-lease   Active   9d
kube-public       Active   9d
kube-system       Active   9d
maya-system       Active   9h
metallb-system    Active   9d
nfs               Active   5d1h
openebs           Active   9d
wordpress         Active   4d14h
Hosting both separately
The two WordPress services are exposed using the same domain name.
niladri@cluster-1-master:~$ kubectl get ingress --all-namespaces
NAMESPACE   NAME        CLASS    HOSTS                   ADDRESS                               PORTS     AGE
bucket      minio       <none>   minio.mayalabs.io       10.63.10.55,10.63.10.56,10.63.10.57   80, 443   4d12h
wordpress   wordpress   <none>   wordpress.mayalabs.io   10.63.10.55,10.63.10.56,10.63.10.57   80, 443   5d2h

niladri@cluster-1-master:~$ kubectl get service -n ingress-nginx
NAME                                               TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-ingress-controller                   LoadBalancer   10.109.21.135   10.63.10.58   80:30324/TCP,443:30714/TCP   6d
ingress-nginx-ingress-controller-default-backend   ClusterIP      10.111.57.166   <none>        80/TCP                       6d

niladri@cluster-1-master:~$ kubectl get secret -n wordpress wordpress.mayalabs.io-tls
NAME                        TYPE                DATA   AGE
wordpress.mayalabs.io-tls   kubernetes.io/tls   2      5d2h
niladri@cluster-2-master:~$ kubectl get ingress --all-namespaces
NAMESPACE   NAME        CLASS    HOSTS                   ADDRESS                               PORTS     AGE
wordpress   wordpress   <none>   wordpress.mayalabs.io   10.63.10.51,10.63.10.52,10.63.10.53   80, 443   9h

niladri@cluster-2-master:~$ kubectl get service -n ingress-nginx
NAME                                               TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-ingress-controller                   LoadBalancer   10.107.127.98   10.63.10.59   80:31248/TCP,443:32178/TCP   5d1h
ingress-nginx-ingress-controller-default-backend   ClusterIP      10.103.103.84   <none>        80/TCP                       5d1h

niladri@cluster-2-master:~$ kubectl get secret -n wordpress wordpress.mayalabs.io-tls
NAME                        TYPE                DATA   AGE
wordpress.mayalabs.io-tls   kubernetes.io/tls   2      10h
If the services are hosted in isolated networks (or use different DNS servers), the WordPress webpages in the two clusters will each be reachable within their own network, and no changes are required for this use case.
If both services are in the same network (or use the same DNS), you will have to assign a unique domain name to one of them so that both are separately reachable. In our case, both clusters are inside the same subnet and use the same DNS.
To resolve this, we will use a different domain name (https://wordpress1.mayalabs.io) for the destination cluster's service and add a DNS entry for it to our nameserver. To change the domain name, we will edit the manifest YAML of the 'wordpress' ingress in the destination cluster; the original and edited specs are shown below.
kubectl -n wordpress edit ingress wordpress
spec:
  rules:
  - host: wordpress.mayalabs.io
    http:
      paths:
      - backend:
          serviceName: wordpress
          servicePort: http
        path: /
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - wordpress.mayalabs.io
    secretName: wordpress.mayalabs.io-tls
spec:
  rules:
  - host: wordpress1.mayalabs.io
    http:
      paths:
      - backend:
          serviceName: wordpress
          servicePort: http
        path: /
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - wordpress1.mayalabs.io
    secretName: wordpress1.mayalabs.io-tls
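We also need the DNS entry mentioned above. A minimal sketch, assuming a BIND-style zone file for mayalabs.io (syntax varies by nameserver); the record points at the destination cluster's ingress-nginx LoadBalancer IP:

; record added to the mayalabs.io zone (BIND syntax; adjust for your nameserver)
wordpress1   IN   A   10.63.10.59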
We are using TLS encryption for the webpage, and our existing TLS certificate was generated for the previous domain name (wordpress.mayalabs.io). So, we will delete the old TLS secret...
kubectl -n wordpress delete secret wordpress.mayalabs.io-tls
... and create the new one using the new certificate (wordpress1.crt) and private key (wordpress1.key).
kubectl create secret tls wordpress1.mayalabs.io-tls --cert=wordpress1.crt --key=wordpress1.key --namespace=wordpress
After this, we can verify that the service is available at https://wordpress1.mayalabs.io (destination cluster).
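A quick way to check this from any machine is curl; the --resolve variant pins the hostname to the destination ingress IP, which is handy before the DNS change propagates (add -k if the certificate is signed by a private CA that your machine does not trust):

# Once DNS resolves the new name:
curl -I https://wordpress1.mayalabs.io

# Before DNS propagates, pin the name to the destination ingress IP:
curl -I --resolve wordpress1.mayalabs.io:443:10.63.10.59 https://wordpress1.mayalabs.io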
Let's add a post to the source cluster webpage...
As you can see, the service on the destination cluster is unaffected!