In Kubernetes, and in Docker in general, there are several types of storage, but let's focus on the basics here:
To better illustrate this, check the following pictures:
Now, let's discuss the components a little bit. We will go from left to right.
First, you have the actual storage. That storage can be in the cloud (AWS, Google, IBM Cloud, etc.) or local (iSCSI, NAS, SAN, bare metal, etc.). Kubernetes has a lot of plugins to provide access to that storage. On the other side, that storage can be replicated, hashed, mapped, encrypted, in RAID and so on. From Kubernetes' point of view, it is simply seen as storage, thanks to the plugin.
Plugins provide the entry point for Kubernetes. As described above, there are different plugins, like the EBS plugin for cloud volumes or the local plugin for local storage. You can check more about Kubernetes plugins here.
A Persistent Volume (PV) represents the volume in Kubernetes terms; in other words, the physical representation of the volume gets translated into a Persistent Volume in Kubernetes. To use that volume, of course, we have to use the next component, which is:
Both PV and PVC are first-class objects in Kubernetes, which means that, just like we can GET and DESCRIBE Pods, we can do the same for PVs and PVCs. It is important to note that once a PVC has bound to a PV, no other PVC can bind to that same PV.
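For instance, being first-class objects means the usual kubectl verbs work on them. A couple of illustrative commands (the PV/PVC names are the ones we create later in this walkthrough):

kubectl get pv,pvc          # list PersistentVolumes and PersistentVolumeClaims
kubectl describe pv ps-pv   # show the PV details, including which claim is bound to it
kubectl describe pvc ps-pvc # show the claim details and its bound volume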
Lastly, we have the Volumes, which are how the PV appears inside the Pod. Once the volume is linked to the PVC, which is linked to the PV, that PV can be used only by the containers in that pod. Other pods, or containers outside of the pod, cannot use that storage.
So, in a nutshell:
There is a BIG issue with that kind of provisioning: IT DOESN'T SCALE. Because of that, we have two types of provisioning:
So let's get going and see how it is done with each type:
Let's configure Static storage.
Firstly, as we said, we have to configure the PV. Let's take one example of a PV:
PV Specs
kind: PersistentVolume          # <- It is a first-class object, just like Pod, Deployment and so on
apiVersion: v1
metadata:
  name: ps-pv                   # <- The name is completely arbitrary, so choose whatever, but with reason :)
  labels:
    type: local                 # <- Local label
spec:
  storageClassName: ps-fast     # <- Indicates that it belongs to the "ps-fast" storage class
  capacity:
    storage: 50Gi               # <- We allocate 50 GB
  persistentVolumeReclaimPolicy: Retain
  # ^
  # |
  # What should happen after you remove the PVC?
  # Retain - Keep the PV in "protected" mode
  # Delete - DELETE the PV after the PVC is removed (default for dynamically provisioned volumes)
  accessModes:
    - ReadWriteOnce
  # ^
  # |
  # There are 3 access modes:
  # ReadWriteOnce - The volume can be mounted read-write by a single node
  # ReadWriteMany - Same as above, but it can be mounted by many nodes
  # ReadOnlyMany  - Read-only for a lot of Pods. Not all volume types support all 3 modes.
  # For example, block devices don't support ReadWriteMany, but file-based volumes (NFS, object volumes) usually do. Check your plugin docs.
  hostPath:
    path: "/home/ubuntu/volume1"
You can see the description of each important attribute above, so that covers the explanation :) How do we create it then? Well, like any other Kubernetes object, we APPLY it:
Create PV
ubuntu@k8s-master:~/volume1$ kubectl apply -f pov1.yml
persistentvolume/ps-pv created
ubuntu@k8s-master:~/volume1$ kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
ps-pv   50Gi       RWO            Retain           Available           ps-fast                 3s
ubuntu@k8s-master:~/volume1$ kubectl get pvc
No resources found in default namespace.
ubuntu@k8s-master:~/volume1$
To use it, we need a PVC to claim it, and a Pod, so we have a container to present it to.
As we already said, the PVC “claims” the PV. Without a PVC, the PV is useless. They need to exist together.
So let's check the specs for the PVC.
PVC specs
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ps-pvc
spec:
  storageClassName: ps-fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
I will not go through this description as the attributes repeat, but you get the picture. We create a claim and hope that some volume can fulfil it :) Again, since the PVC is a first-class object, we can simply “apply” it:
Create PVC
ubuntu@k8s-master:~/volume1$ kubectl apply -f pvc.yml
persistentvolumeclaim/ps-pvc created
ubuntu@k8s-master:~/volume1$ kubectl get pvc
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ps-pvc   Bound    ps-pv    50Gi       RWO            ps-fast        3s
ubuntu@k8s-master:~/volume1$ kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
ps-pv   50Gi       RWO            Retain           Bound    default/ps-pvc   ps-fast                 5m8s
ubuntu@k8s-master:~/volume1$
We see that they are bound, and we see the claim in the Persistent Volume description. It is important to note that a PVC will bind to ANY PV which offers the SAME amount of storage as requested, or MORE. For example, if the PVC requests 20 GB and we have a 50 GB PV, it will bind.
However, if our PVC requests 50 GB and we only have a 20 GB PV, then we are out of luck and the PVC won't bind. Congrats, we now have a PVC to present to any pod we want to have storage.
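As a hypothetical illustration (not part of the original setup), a claim like the following would stay Pending against our single 50Gi PV, because no volume is large enough to satisfy it:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: too-big-pvc        # hypothetical claim, only for illustration
spec:
  storageClassName: ps-fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi       # larger than the 50Gi ps-pv, so no PV can satisfy it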
So let's create the Pod
As always, here is an example of the Pod YAML:
Pod Specs
apiVersion: v1
kind: Pod
metadata:
  name: first-pod
spec:
  volumes:
    - name: fast50g
      persistentVolumeClaim:
        claimName: ps-pvc
  containers:
    - name: ctr1
      image: ubuntu:latest
      command:
        - /bin/bash
        - "-c"
        - "sleep 60m"
      volumeMounts:
        - mountPath: "/data"
          name: fast50g
Again, I think an explanation is unnecessary here as things are self-explanatory, and as always we can just create it:
Create Pod
ubuntu@k8s-master:~/volume1$ kubectl apply -f pod.yml
pod/first-pod created
ubuntu@k8s-master:~/volume1$ kubectl get pods
NAME        READY   STATUS    RESTARTS   AGE
first-pod   1/1     Running   0          99s
ubuntu@k8s-master:~/volume1$
We installed that pod on a Kubernetes cluster with 1 master and 2 workers. It didn't have to end up on the master; in fact, it ended up on Worker 1 :) So let's check it there. On Worker 1, we can list all the containers as usual and connect to ours.
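If you prefer to confirm the node from the master first, the wide output of kubectl get pods includes a NODE column (this step is not part of the original transcript):

kubectl get pods -o wide   # the NODE column shows the worker where first-pod was scheduled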
List all pods
root@node-1:~# docker container ls
CONTAINER ID   IMAGE                     COMMAND                  CREATED         STATUS         PORTS   NAMES
ca7f18335d32   ubuntu                    "/bin/bash -c 'sleep…"   4 minutes ago   Up 4 minutes           k8s_ctr1_first-pod_default_d5fc44c3-81c7-4a63-9e00-cf9650c26c58_0
d2e5fddd1d2d   k8s.gcr.io/pause:3.2      "/pause"                 5 minutes ago   Up 5 minutes           k8s_POD_first-pod_default_d5fc44c3-81c7-4a63-9e00-cf9650c26c58_0
c6694424e858   andonovj/httpserverdemo   "dotnet HttpServerDe…"   2 hours ago     Up 2 hours             k8s_hello-pod_hello-deploy-7f44bd8b96-qz6cr_default_6fb3b168-ab19-47e9-be9c-f016f214f092_1
60ea2607bd11   andonovj/httpserverdemo   "dotnet HttpServerDe…"   2 hours ago     Up 2 hours             k8s_hello-pod_hello-deploy-7f44bd8b96-4c76j_default_4b9f70f8-0d5d-4a19-aea0-8f393412f939_1
3ea2eeed5344   andonovj/httpserverdemo   "dotnet HttpServerDe…"   2 hours ago     Up 2 hours             k8s_hello-pod_hello-deploy-7f44bd8b96-7tvcs_default_f4dc2924-7a87-44c9-bb5b-c3010b0451be_1
4742768bace2   andonovj/httpserverdemo   "dotnet HttpServerDe…"   2 hours ago     Up 2 hours             k8s_hello-pod_hello-deploy-7f44bd8b96-9lnrm_default_821d3183-7c5e-413e-99a1-144bb13caff4_1
fd7678069cdd   k8s.gcr.io/pause:3.2      "/pause"                 2 hours ago     Up 2 hours             k8s_POD_hello-deploy-7f44bd8b96-qz6cr_default_6fb3b168-ab19-47e9-be9c-f016f214f092_1
598a580a0ab0   k8s.gcr.io/pause:3.2      "/pause"                 2 hours ago     Up 2 hours             k8s_POD_hello-deploy-7f44bd8b96-4c76j_default_4b9f70f8-0d5d-4a19-aea0-8f393412f939_1
8cc487d0c45e   andonovj/httpserverdemo   "dotnet HttpServerDe…"   2 hours ago     Up 2 hours             k8s_hello-pod_hello-deploy-7f44bd8b96-gnvwr_default_f467d445-34c7-4dc9-ac37-e483be950d72_1
a64e7f2c167c   k8s.gcr.io/pause:3.2      "/pause"                 2 hours ago     Up 2 hours             k8s_POD_hello-deploy-7f44bd8b96-7tvcs_default_f4dc2924-7a87-44c9-bb5b-c3010b0451be_1
97da605cd3c7   k8s.gcr.io/pause:3.2      "/pause"                 2 hours ago     Up 2 hours             k8s_POD_hello-deploy-7f44bd8b96-9lnrm_default_821d3183-7c5e-413e-99a1-144bb13caff4_1
e7c8dcebe1be   k8s.gcr.io/pause:3.2      "/pause"                 2 hours ago     Up 2 hours             k8s_POD_hello-deploy-7f44bd8b96-gnvwr_default_f467d445-34c7-4dc9-ac37-e483be950d72_1
71c9f4548392   0d40868643c6              "/usr/local/bin/kube…"   2 hours ago     Up 2 hours             k8s_kube-proxy_kube-proxy-sbtcp_kube-system_f627745a-760f-442a-a371-27350c3e638d_3
420f51aa6c56   k8s.gcr.io/pause:3.2      "/pause"                 2 hours ago     Up 2 hours             k8s_POD_kube-proxy-sbtcp_kube-system_f627745a-760f-442a-a371-27350c3e638d_3
b9cc5715f753   3610c051aa19              "start_runit"            2 hours ago     Up 2 hours             k8s_calico-node_calico-node-5rqvv_kube-system_add6c10e-7693-41b0-953a-0ed2c3e2f671_3
9b86b11b3b61   k8s.gcr.io/pause:3.2      "/pause"                 2 hours ago     Up 2 hours             k8s_POD_calico-node-5rqvv_kube-system_add6c10e-7693-41b0-953a-0ed2c3e2f671_3
root@node-1:~# docker exec -it ca7f18335d32 /bin/bash
root@first-pod:/# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         9.7G  3.9G  5.8G  40% /
tmpfs            64M     0   64M   0% /dev
tmpfs           730M     0  730M   0% /sys/fs/cgroup
/dev/sda1       9.7G  3.9G  5.8G  40% /data
shm              64M     0   64M   0% /dev/shm
tmpfs           730M   12K  730M   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           730M     0  730M   0% /proc/acpi
tmpfs           730M     0  730M   0% /proc/scsi
tmpfs           730M     0  730M   0% /sys/firmware
root@first-pod:/# cd /data
root@first-pod:/data# ls -alrt
total 8
drwxr-xr-x 2 root root 4096 May 22 15:23 .
drwxr-xr-x 1 root root 4096 May 22 15:23 ..
root@first-pod:/data# pwd
/data
root@first-pod:/data# touch test
root@first-pod:/data# ls -alrt
total 8
drwxr-xr-x 1 root root 4096 May 22 15:23 ..
-rw-r--r-- 1 root root    0 May 22 15:29 test
drwxr-xr-x 2 root root 4096 May 22 15:29 .
So we have created a simple text file in the pod, under the mount “/data”. According to our logic, that file should be available on the host server under the defined volume path:
Check the Volume
root@node-1:/home/ubuntu/volume1# hostname
node-1
root@node-1:/home/ubuntu/volume1# ls -lart
total 8
drwxr-xr-x 4 ubuntu ubuntu 4096 May 22 15:23 ..
-rw-r--r-- 1 root   root      0 May 22 15:29 test
drwxr-xr-x 2 root   root   4096 May 22 15:29 .
root@node-1:/home/ubuntu/volume1# pwd
/home/ubuntu/volume1
root@node-1:/home/ubuntu/volume1#
Lo and behold, the simple text file is on persistent storage and won't be affected if, for example, the container crashes. It will stay there, safe and sound, on the host server.
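A quick way to convince yourself (not run in the original walkthrough) is to delete and recreate the pod; because the data lives on the PV, the file survives the pod's lifecycle:

kubectl delete pod first-pod             # remove the pod; the PV and its data stay behind
kubectl apply -f pod.yml                 # recreate the pod from the same manifest
kubectl exec -it first-pod -- ls /data   # the "test" file should still be there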
As we mentioned, that kind of persistent storage allocation DOESN'T SCALE. Let's see why:
You see that, mapping and mapping and mapping :) Let's see what we can do about it.
Let's configure Dynamic storage. The idea here is that we (as administrators) care only about the PVC, NOT the PV. We create the PVC and the provisioner creates the PV itself.
Now, for dynamic provisioning with NFS I had to re-configure the cluster. In a nutshell, make sure that the API server advertise IP which you give when you initialize the cluster is in the same subnet as the pod network.
For example:
Initiate Cluster for NFS
kubeadm init --ignore-preflight-errors=NumCPU --apiserver-advertise-address=192.168.50.10 --pod-network-cidr=192.168.50.0/24
Calico by default uses 192.168.0.0/16, so I modified it to 192.168.50.0/24 so that it matches the network of the API advertise IP.
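For reference, the change is a single environment variable in the calico-node DaemonSet inside calico.yaml; the exact layout depends on the Calico version you downloaded, so treat this as a sketch:

# Excerpt from calico.yaml (calico-node DaemonSet, env section):
- name: CALICO_IPV4POOL_CIDR
  value: "192.168.50.0/24"   # must match the --pod-network-cidr passed to kubeadm init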
So let's get going. In the beginning, I had something like this:
Overview
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE     IP                NODE         NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-77c5fc8d7f-fcnh7   1/1     Running   0          3m56s   192.168.235.195   k8s-master   <none>           <none>
kube-system   calico-node-94qkt                          1/1     Running   0          66s     192.168.50.12     node-2       <none>           <none>
kube-system   calico-node-j54sq                          1/1     Running   0          2m18s   192.168.50.11     node-1       <none>           <none>
kube-system   calico-node-rc4t6                          1/1     Running   0          3m56s   192.168.50.10     k8s-master   <none>           <none>
kube-system   coredns-66bff467f8-d7hr5                   1/1     Running   0          10m     192.168.235.193   k8s-master   <none>           <none>
kube-system   coredns-66bff467f8-jmwk7                   1/1     Running   0          10m     192.168.235.194   k8s-master   <none>           <none>
kube-system   etcd-k8s-master                            1/1     Running   0          10m     192.168.50.10     k8s-master   <none>           <none>
kube-system   kube-apiserver-k8s-master                  1/1     Running   0          10m     192.168.50.10     k8s-master   <none>           <none>
kube-system   kube-controller-manager-k8s-master         1/1     Running   0          10m     192.168.50.10     k8s-master   <none>           <none>
kube-system   kube-proxy-8td28                           1/1     Running   0          66s     192.168.50.12     node-2       <none>           <none>
kube-system   kube-proxy-bljr8                           1/1     Running   0          10m     192.168.50.10     k8s-master   <none>           <none>
kube-system   kube-proxy-dcnqt                           1/1     Running   0          2m18s   192.168.50.11     node-1       <none>           <none>
kube-system   kube-scheduler-k8s-master                  1/1     Running   0          10m     192.168.50.10     k8s-master   <none>           <none>
ubuntu@k8s-master:~$
So we have to create the following:
You can see the Deployment, Service and ServiceAccount YAML below:
Components YML
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-provisioner
---
kind: Service
apiVersion: v1
metadata:
  name: nfs-provisioner
  labels:
    app: nfs-provisioner
spec:
  ports:
    - name: nfs
      port: 2049
    - name: nfs-udp
      port: 2049
      protocol: UDP
    - name: nlockmgr
      port: 32803
    - name: nlockmgr-udp
      port: 32803
      protocol: UDP
    - name: mountd
      port: 20048
    - name: mountd-udp
      port: 20048
      protocol: UDP
    - name: rquotad
      port: 875
    - name: rquotad-udp
      port: 875
      protocol: UDP
    - name: rpcbind
      port: 111
    - name: rpcbind-udp
      port: 111
      protocol: UDP
    - name: statd
      port: 662
    - name: statd-udp
      port: 662
      protocol: UDP
  selector:
    app: nfs-provisioner
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-provisioner
spec:
  selector:
    matchLabels:
      app: nfs-provisioner
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-provisioner
    spec:
      serviceAccount: nfs-provisioner
      containers:
        - name: nfs-provisioner
          image: quay.io/kubernetes_incubator/nfs-provisioner:latest
          ports:
            - name: nfs
              containerPort: 2049
            - name: nfs-udp
              containerPort: 2049
              protocol: UDP
            - name: nlockmgr
              containerPort: 32803
            - name: nlockmgr-udp
              containerPort: 32803
              protocol: UDP
            - name: mountd
              containerPort: 20048
            - name: mountd-udp
              containerPort: 20048
              protocol: UDP
            - name: rquotad
              containerPort: 875
            - name: rquotad-udp
              containerPort: 875
              protocol: UDP
            - name: rpcbind
              containerPort: 111
            - name: rpcbind-udp
              containerPort: 111
              protocol: UDP
            - name: statd
              containerPort: 662
            - name: statd-udp
              containerPort: 662
              protocol: UDP
          securityContext:
            capabilities:
              add:
                - DAC_READ_SEARCH
                - SYS_RESOURCE
          args:
            - "-provisioner=example.com/nfs"
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: SERVICE_NAME
              value: nfs-provisioner
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          imagePullPolicy: "IfNotPresent"
          volumeMounts:
            - name: export-volume
              mountPath: /export
      volumes:
        - name: export-volume
          hostPath:
            path: /srv

ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ kubectl create -f deployment.yaml
serviceaccount/nfs-provisioner created
service/nfs-provisioner created
deployment.apps/nfs-provisioner created
Then let's create the RBAC, which creates the cluster roles and binds them, and of course the storage class:
Create RBAC & Storage Class
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get"]
  - apiGroups: ["extensions"]
    resources: ["podsecuritypolicies"]
    resourceNames: ["nfs-provisioner"]
    verbs: ["use"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-provisioner
  apiGroup: rbac.authorization.k8s.io

ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ cat class.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: example-nfs
provisioner: example.com/nfs
mountOptions:
  - vers=4.1
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ kubectl create -f rbac.yaml
clusterrole.rbac.authorization.k8s.io/nfs-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-nfs-provisioner created
role.rbac.authorization.k8s.io/leader-locking-nfs-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-provisioner created
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ kubectl create -f class.yaml
storageclass.storage.k8s.io/example-nfs created
With dynamic provisioning, we DON'T create the volume, we create ONLY the claim; the volume is created automatically by the provisioner. That is the MAIN difference.
Create Claim
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ cat claim.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs
  annotations:
    volume.beta.kubernetes.io/storage-class: "example-nfs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Mi
ubuntu@k8s-master:~/
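The claim itself is created like any other object (the original transcript omits this step, but it is a plain kubectl create against the file above):

kubectl create -f claim.yaml   # the provisioner then creates and binds a PV automatically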
We can verify the configuration as follows:
Verify the Claim
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ kubectl get pv,pvc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM         STORAGECLASS   REASON   AGE
persistentvolume/pvc-9a8aa090-7c73-4e64-94eb-dcc7805828dd   10Mi       RWX            Delete           Bound    default/nfs   example-nfs             21m

NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/nfs   Bound    pvc-9a8aa090-7c73-4e64-94eb-dcc7805828dd   10Mi       RWX            example-nfs    21m
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$
Finally, we have a bound PVC using dynamic provisioning. There is one very good Git repository with all these files:
Configure it with GIT
ubuntu@k8s-master:~$ git clone https://github.com/kubernetes-incubator/external-storage.git^C
ubuntu@k8s-master:~$
ubuntu@k8s-master:~$ ls -lart
total 32
-rw-r--r--  1 ubuntu ubuntu 3771 Aug 31  2015 .bashrc
-rw-r--r--  1 ubuntu ubuntu  220 Aug 31  2015 .bash_logout
-rw-r--r--  1 ubuntu ubuntu  655 Jul 12  2019 .profile
drwxr-xr-x  4 root   root   4096 May 25 10:51 ..
drwx------  2 ubuntu ubuntu 4096 May 25 10:51 .ssh
-rw-r--r--  1 ubuntu ubuntu    0 May 25 11:21 .sudo_as_admin_successful
drwxrwxr-x  4 ubuntu ubuntu 4096 May 25 11:22 .kube
drwxr-xr-x  5 ubuntu ubuntu 4096 May 25 11:27 .
drwxrwxr-x 17 ubuntu ubuntu 4096 May 25 11:27 external-storage   <---- This one
We can, of course, create a pod which will use the NFS storage. Let's create an NGINX pod, for example:
Create NGINX Pod
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nfs-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: nfs
          persistentVolumeClaim:
            claimName: nfs         # same name as the PVC that was created
      containers:
        - image: nginx
          name: nginx
          volumeMounts:
            - name: nfs            # name of the volume should match the volume defined above
              mountPath: mydata2   # mount path inside of the container
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ kubectl create -f nginx.yml
deployment.apps/nfs-nginx created
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$ kubectl get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP              NODE     NOMINATED NODE   READINESS GATES
nfs-nginx-6b4db6f57-4mczr         1/1     Running   0          2m51s   192.168.247.2   node-2   <none>           <none>
nfs-provisioner-7795cf6f4-d7m2l   1/1     Running   0          67m     192.168.247.1   node-2   <none>           <none>
ubuntu@k8s-master:~/external-storage/nfs/deploy/kubernetes$
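To double-check that the NFS share is actually mounted inside the nginx container, you can exec into the pod and look at the mounted filesystems (not shown in the original transcript; substitute your own pod name):

kubectl exec -it nfs-nginx-6b4db6f57-4mczr -- df -h   # the NFS export should appear among the mounts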
Or even an Ubuntu pod:
ubuntu pod
ubuntu@k8s-master:~$ cat pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: first-pod
spec:
  volumes:
    - name: fast10m
      persistentVolumeClaim:
        claimName: nfs
  containers:
    - name: ctr1
      image: ubuntu:latest
      command:
        - /bin/bash
        - "-c"
        - "sleep 60m"
      volumeMounts:
        - mountPath: "/data"
          name: fast10m
ubuntu@k8s-master:~$ kubectl get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
first-pod                         1/1     Running   0          24s     192.168.84.131   node-1   <none>           <none>
nfs-nginx-6b4db6f57-4mczr         1/1     Running   0          4m16s   192.168.247.2    node-2   <none>           <none>
nfs-provisioner-7795cf6f4-d7m2l   1/1     Running   0          69m     192.168.247.1    node-2   <none>           <none>
ubuntu@k8s-master:~$
Eureka, finally we are done with both types of provisioning!