Problem Statement
User is unable to create cStor storage pools using Local SSDs in GKE.
Kubernetes and OpenEBS version
Any Kubernetes version that supports Local SSDs, with OpenEBS version 0.8.1 or above.
OpenEBS Storage Engine
cStor
Possible Reason
On GKE, Local SSDs are formatted with ext4 and mounted under /mnt/disks/. NDM can detect the Local SSDs attached to the Kubernetes Nodes and create blockdevice CRs for them, but a cStor pool cannot be created on such disks.
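This can be cross-checked on the blockdevice CRs created by NDM; the detected filesystem and mount point appear under spec.filesystem. For example:
kubectl get blockdevice -n <openebs_installed_namespace>
kubectl get blockdevice <blockdevice_name> -n <openebs_installed_namespace> -o yaml
An ext4 entry similar to the following in the output indicates that the disk cannot be used for a cStor pool as-is:
spec:
  filesystem:
    fsType: ext4
    mountPoint: /mnt/disks/ssd0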
Symptoms
Following are the logs from one of the cStor pool pods.
E0714 17:51:57.626203 7 pool.go:66] Unable to import pool: exit status 1, cannot import 'cstor-8435217d-a65e-11e9-a70a-42010a800025': no such pool available
E0714 17:51:57.636423 7 pool.go:66] Unable to import pool: exit status 1, cannot import 'cstor-8435217d-a65e-11e9-a70a-42010a800025': no such pool available
I0714 17:51:57.636446 7 handler.go:393] cstor pool 'cstor-ssd-pool-g6nu': uid '8435217d-a65e-11e9-a70a-42010a800025': phase 'Pending': is_empty_status: false
I0714 17:51:57.636456 7 handler.go:400] cStorPool pending: 8435217d-a65e-11e9-a70a-42010a800025
E0714 17:51:57.641670 7 pool.go:350] Unable to clear label on blockdevice /dev/disk/by-id/scsi-0Google_EphemeralDisk_local-ssd-0: failed to check state for /dev/disk/by-id/scsi-0Google_EphemeralDisk_local-ssd-0, err = exit status 1
E0714 17:51:57.641699 7 handler.go:242] Unable to clear labels from all the blockdevices of the pool%!(EXTRA types.UID=8435217d-a65e-11e9-a70a-42010a800025)
E0714 17:51:57.646517 7 pool.go:93] Unable to create pool: /dev/disk/by-id/scsi-0Google_EphemeralDisk_local-ssd-0 is in use and contains a ext4 filesystem.
E0714 17:51:57.646560 7 handler.go:250] Pool creation failure: 8435217d-a65e-11e9-a70a-42010a800025
E0714 17:51:57.646600 7 handler.go:74] exit status 1
I0714 17:51:57.646743 7 event.go:221] Event(v1.ObjectReference{Kind:"CStorPool", Namespace:"", Name:"cstor-ssd-pool-g6nu", UID:"8435217d-a65e-11e9-a70a-42010a800025", APIVersion:"openebs.io/v1alpha1", ResourceVersion:"43175", FieldPath:""}): type: 'Warning' reason: 'FailCreate' Resource creation failed
I0714 17:51:57.667714 7 handler.go:79] cStorPool:cstor-ssd-pool-g6nu, 8435217d-a65e-11e9-a70a-42010a800025; Status: Offline
E0714 17:51:58.416014 7 controller.go:84] Zrepl/Pool is not available, Shutting down
I0714 17:51:57.603930 7 handler.go:108] Processing cStorPool added event: cstor-ssd
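These logs can be collected directly from the pool pod with a command along the following lines; cstor-pool-mgmt is the management container in a typical cStor pool pod, so adjust the container name if your deployment differs:
kubectl logs <cstor_pool_pod_name> -n <openebs_installed_namespace> -c cstor-pool-mgmt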
The SPC is created, but the pool is in Offline state. The output of kubectl get csp will be similar to the following.
NAME                   ALLOCATED   FREE   CAPACITY   STATUS    TYPE      AGE
cstor-disk-pool-5nfr                                 Offline   striped   10m
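The corresponding FailCreate events can also be seen on the CSP itself, for example with the pool name from the output above:
kubectl describe csp cstor-disk-pool-5nfr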
Solution
A cStor pool can be created only on an unclaimed blockdevice (BD) that does not contain any filesystem and is not mounted on the Node. Wipe the disks on the Node using the following steps.
- Perform the following steps on the Node:
-
Run the lsblk command. The output will be similar to the following.
sda       8:0    0    40G  0 disk
├─sda1    8:1    0  39.9G  0 part /
├─sda14   8:14   0     4M  0 part
└─sda15   8:15   0   106M  0 part /boot/efi
sdb       8:16   0   375G  0 disk /mnt/disks/ssd0
-
Unmount the disk using the following command. The following is an example; replace the mount path with the actual one in your case.
sudo umount /mnt/disks/ssd0
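Optionally, confirm that the mount is gone; assuming the disk is /dev/sdb as in the lsblk output above:
lsblk /dev/sdb
The MOUNTPOINT column should now be empty for the device.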
-
Wipe out the filesystem using the following command:
sudo wipefs -af <device_path_on_node>
Example command:
sudo wipefs -af /dev/sdb
-
Verify that no filesystem is present on the disk using the following command.
sudo fdisk /dev/sdb
Example output snippet:
Welcome to fdisk (util-linux 2.27.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
This means the disk does not contain any filesystem.
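As an alternative quick check (assuming blkid is available on the Node), blkid prints nothing for a device that carries no filesystem signatures:
sudo blkid /dev/sdb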
If only one disk from this Node is used in the StoragePoolClaim YAML spec, the pool will be created on the Node automatically with this disk after performing the above operations (a minimal SPC sketch is shown below).
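For reference, a minimal StoragePoolClaim sketch that consumes a single blockdevice could look like the following; the SPC name and blockdevice name are illustrative, and the blockDevices-based spec shown here applies to recent OpenEBS releases:
apiVersion: openebs.io/v1alpha1
kind: StoragePoolClaim
metadata:
  name: cstor-ssd-pool
spec:
  name: cstor-ssd-pool
  type: disk
  poolSpec:
    poolType: striped
  blockDevices:
    blockDeviceList:
    # illustrative blockdevice name; use the one reported by kubectl get blockdevice
    - blockdevice-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx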
- Perform the following operations on your K8s master node:
-
Identify the NDM pods running on the Nodes where Local SSDs are attached. The following command will help to get this information.
kubectl get pod -n <openebs_installed_namespace> -o wide
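If other OpenEBS pods make the NDM pods hard to spot, they can usually be narrowed down by label; the label value below assumes the standard OpenEBS operator manifests:
kubectl get pod -n <openebs_installed_namespace> -l openebs.io/component-name=ndm -o wide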
-
Delete the NDM pod scheduled on a Node where Local SSDs are attached. This can be done using the following command.
kubectl delete pod <NDM_pod_name> -n <openebs_installed_namespace>
-
Repeat the above procedure for all other NDM pods one by one. This will update the corresponding blockdevice CRs on the Nodes with the current filesystem type and mount path information.
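Once the recreated NDM pods come up, the blockdevice CRs should no longer report the ext4 filesystem or the mount path, which can be re-verified with:
kubectl get blockdevice <blockdevice_name> -n <openebs_installed_namespace> -o yaml
The spec.filesystem section should now be empty, after which the pending cStor pool should be able to come up.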