Kubernetes - Autoscaling

Autoscaling is one of the key features in Kubernetes cluster. It is a feature in which the cluster is capable of increasing the number of nodes as the demand for service response increases and decrease the number of nodes as the requirement decreases. This feature of auto scaling is currently supported in Google Cloud Engine (GCE) and Google Container Engine (GKE) and will start with AWS pretty soon.

In order to set up scalable infrastructure in GCE, we need to first have an active GCE project with features of Google cloud monitoring, google cloud logging, and stackdriver enabled.

First, we will set up the cluster with few nodes running in it. Once done, we need to set up the following environment variable.

Environment Variable

export NUM_NODES = 2
export KUBE_AUTOSCALER_MIN_NODES = 2
export KUBE_AUTOSCALER_MAX_NODES = 5
export KUBE_ENABLE_CLUSTER_AUTOSCALER = true

Once done, we will start the cluster by running kube-up.sh. This will create a cluster together with cluster auto-scalar add on.

./cluster/kube-up.sh

On creation of the cluster, we can check our cluster using the following kubectl command.

$ kubectl get nodes
NAME                             STATUS                       AGE
kubernetes-master                Ready,SchedulingDisabled     10m
kubernetes-minion-group-de5q     Ready                        10m
kubernetes-minion-group-yhdx     Ready                        8m

Now, we can deploy an application on the cluster and then enable the horizontal pod autoscaler. This can be done using the following command.

$ kubectl autoscale deployment <Application Name> --cpu-percent = 50 --min = 1 --
max = 10

The above command shows that we will maintain at least one and maximum 10 replica of the POD as the load on the application increases.

We can check the status of autoscaler by running the $kubclt get hpa command. We will increase the load on the pods using the following command.

$ kubectl run -i --tty load-generator --image = busybox /bin/sh
$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

We can check the hpa by running $ kubectl get hpa command.

$ kubectl get hpa
NAME         REFERENCE                     TARGET CURRENT
php-apache   Deployment/php-apache/scale    50%    310%

MINPODS  MAXPODS   AGE
  1        20      2m
  
$ kubectl get deployment php-apache
NAME         DESIRED    CURRENT    UP-TO-DATE    AVAILABLE   AGE
php-apache      7          7           7            3        4m

We can check the number of pods running using the following command.

jsz@jsz-desk2:~/k8s-src$ kubectl get pods
php-apache-2046965998-3ewo6 0/1        Pending 0         1m
php-apache-2046965998-8m03k 1/1        Running 0         1m
php-apache-2046965998-ddpgp 1/1        Running 0         5m
php-apache-2046965998-lrik6 1/1        Running 0         1m
php-apache-2046965998-nj465 0/1        Pending 0         1m
php-apache-2046965998-tmwg1 1/1        Running 0         1m
php-apache-2046965998-xkbw1 0/1        Pending 0         1m

And finally, we can get the node status.

$ kubectl get nodes
NAME                             STATUS                        AGE
kubernetes-master                Ready,SchedulingDisabled      9m
kubernetes-minion-group-6z5i     Ready                         43s
kubernetes-minion-group-de5q     Ready                         9m
kubernetes-minion-group-yhdx     Ready                         9m