14. Self-Healing
Get pod details
$ kubectl get pods -o wide
Get the first nginx pod and delete it - one of the nginx pods should be in ‘Terminating’ status
$ NGINX_POD=$(kubectl get pods -l app=nginx --output=jsonpath="{.items[0].metadata.name}")
$ kubectl delete pod $NGINX_POD; kubectl get pods -l app=nginx -o wide
$ sleep 10
Get pod details - one nginx pod should be freshly started
$ kubectl get pods -l app=nginx -o wide
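The fixed `sleep 10` above is a guess at how long the replacement pod takes to start. As a rough alternative, a small polling helper can re-run a check until it passes. This is only a sketch: `wait_for` and `nginx_available` are hypothetical names introduced here, the replica count of 3 is taken from the AVAILABLE=3/3 output mentioned later in this section, and `local` assumes a bash-compatible shell.

```shell
# wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL seconds until it succeeds or roughly
# TIMEOUT seconds have elapsed. Returns 0 on success, 1 on timeout.
wait_for() {
  local timeout=$1 interval=$2 elapsed=0
  shift 2
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Hypothetical check: succeeds once the Deployment reports 3 available
# replicas (3 is an assumption based on this lab's nginx-deployment).
nginx_available() {
  [ "$(kubectl get deployment nginx-deployment \
        -o jsonpath='{.status.availableReplicas}')" = "3" ]
}

# Example (needs a running cluster): give up after about 60 seconds.
# wait_for 60 2 nginx_available && kubectl get pods -l app=nginx -o wide
```

Polling returns as soon as the condition holds, so the lab does not wait longer than needed on a fast cluster and does not continue too early on a slow one.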
Get deployment details and check the events for recent changes
$ kubectl describe deployment nginx-deployment
Halt one of the nodes (node2)
$ vagrant halt node2
$ sleep 30
Get node details - node2 Status=NotReady
$ kubectl get nodes
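To check the node state from a script rather than by eye, the STATUS column can be extracted with `awk`. The helper below is a minimal sketch; `node_status` is a hypothetical name, and it assumes the default `kubectl get nodes` column order (NAME first, STATUS second).

```shell
# node_status NODE
# Prints the STATUS column (for example Ready or NotReady) for NODE,
# read from the first two columns of `kubectl get nodes --no-headers`.
node_status() {
  kubectl get nodes --no-headers | awk -v node="$1" '$1 == node { print $2 }'
}

# Example (needs a running cluster):
# [ "$(node_status node2)" = "NotReady" ] && echo "node2 is down"
```

Note that a cordoned node reports a combined status such as `Ready,SchedulingDisabled`, so an exact string comparison may need adjusting in that case.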
Get pod details - everything still looks fine, because pods on the halted node are only evicted after 5 minutes
$ kubectl get pods -o wide
Pods are not evicted until their node has been NotReady or unreachable for 5 minutes (see Tolerations in ‘describe pod’). This prevents Kubernetes from spinning up new containers when it is not necessary, for example during a short node outage
$ NGINX_POD=$(kubectl get pods -l app=nginx --output=jsonpath="{.items[0].metadata.name}")
$ kubectl describe pod $NGINX_POD | grep -A1 Tolerations
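The 5-minute delay comes from the `tolerationSeconds` value (300 by default) on the pod's not-ready and unreachable tolerations. As a sketch, the value can also be read directly with a JSONPath filter instead of grepping the `describe` output. The `eviction_delay` name is hypothetical, and the taint key `node.kubernetes.io/not-ready` is an assumption that holds on current Kubernetes releases; older clusters used a different key, in which case the function prints nothing.

```shell
# eviction_delay POD
# Prints the tolerationSeconds of POD's toleration for the
# node.kubernetes.io/not-ready taint (assumed key, see the note above).
eviction_delay() {
  kubectl get pod "$1" \
    -o jsonpath='{.spec.tolerations[?(@.key=="node.kubernetes.io/not-ready")].tolerationSeconds}'
}

# Example (needs a running cluster):
# eviction_delay "$NGINX_POD"
```

If the printed value differs from 300, the `sleep 300` in the next step should be adjusted to match.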
Sleeping for 5 minutes
$ sleep 300
Get pod details - pods on the halted node show Status=Unknown/NodeLost and a new container was started
$ kubectl get pods -o wide
Get deployment details - again AVAILABLE=3/3
$ kubectl get deployments -o wide
Power on node2
$ vagrant up node2
$ sleep 70
Get node details - node2 should be Ready again
$ kubectl get nodes
Get pod details - ‘Unknown’ pods were removed
$ kubectl get pods -o wide