From ed2dddaac092780897a0e420a726c4a49794a616 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Wed, 27 Dec 2023 11:33:16 +0530
Subject: [PATCH 01/17] autoscaler start

---
 Autoscaler101/what-are-autoscalers.md | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 Autoscaler101/what-are-autoscalers.md

diff --git a/Autoscaler101/what-are-autoscalers.md b/Autoscaler101/what-are-autoscalers.md
new file mode 100644
index 00000000..78c91946
--- /dev/null
+++ b/Autoscaler101/what-are-autoscalers.md
@@ -0,0 +1,3 @@
+# Autoscalers
+
+You likely already know what autoscalers do, since they are a core part of the Kubernetes architecture. Central to them are two scaling methods: vertical scaling and horizontal scaling. In this section, we will dive deep into each type of scaling, take a hands-on look at how each one functions, and weigh the benefits and drawbacks of each method.
\ No newline at end of file

From 204795ea2c3518bd1c7d9e174ec259f57658c418 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Thu, 28 Dec 2023 12:05:18 +0530
Subject: [PATCH 02/17] autoscaler cont.

---
 Autoscaler101/what-are-autoscalers.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Autoscaler101/what-are-autoscalers.md b/Autoscaler101/what-are-autoscalers.md
index 78c91946..1c1116e1 100644
--- a/Autoscaler101/what-are-autoscalers.md
+++ b/Autoscaler101/what-are-autoscalers.md
@@ -1,3 +1,7 @@
 # Autoscalers
 
-You likely already know what autoscalers do, since they are a core part of the Kubernetes architecture. Central to them are two scaling methods: vertical scaling and horizontal scaling. In this section, we will dive deep into each type of scaling, take a hands-on look at how each one functions, and weigh the benefits and drawbacks of each method.
\ No newline at end of file
+You likely already know what autoscalers do, since they are a core part of the Kubernetes architecture. Central to them are two scaling methods: vertical scaling and horizontal scaling. In this section, we will dive deep into each type of scaling, take a hands-on look at how each one functions, and weigh the benefits and drawbacks of each method.
+
+## Vertical pod autoscaler
+
+A vertical pod autoscaler (VPA) works by collecting metrics (using the metrics server) and analyzing those metrics over a period of time to understand the resource requirements of the running pods. It considers factors such as historical usage patterns, spikes in resource consumption, and the configured target utilization levels. Once this analysis is complete, the VPA controller generates recommendations for adjusting the resource requests (CPU and memory) of the pods. It may recommend increasing or decreasing resource requests to better match the observed usage.
\ No newline at end of file
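(Editor's aside: the per-pod figures the VPA analyzes are the same ones the metrics server exposes. As a quick illustration of the raw data involved, assuming the metrics server is installed in the cluster, you can view these metrics yourself; the pod name and numbers below are hypothetical.)

```
$ kubectl top pods
NAME                                CPU(cores)   MEMORY(bytes)
nginx-deployment-66b6c48dd5-8tkx7   3m           12Mi
```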
From 8ad5f7bbc314f60fdcc4608396b337d89822fccd Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Fri, 29 Dec 2023 12:19:57 +0530
Subject: [PATCH 03/17] autoscaler cont.

---
 Autoscaler101/what-are-autoscalers.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Autoscaler101/what-are-autoscalers.md b/Autoscaler101/what-are-autoscalers.md
index 1c1116e1..bbd535ac 100644
--- a/Autoscaler101/what-are-autoscalers.md
+++ b/Autoscaler101/what-are-autoscalers.md
@@ -4,4 +4,6 @@ You likely already know what autoscalers do, since they are a core pa
 
 ## Vertical pod autoscaler
 
-A vertical pod autoscaler (VPA) works by collecting metrics (using the metrics server) and analyzing those metrics over a period of time to understand the resource requirements of the running pods. It considers factors such as historical usage patterns, spikes in resource consumption, and the configured target utilization levels. Once this analysis is complete, the VPA controller generates recommendations for adjusting the resource requests (CPU and memory) of the pods. It may recommend increasing or decreasing resource requests to better match the observed usage.
\ No newline at end of file
+A vertical pod autoscaler (VPA) works by collecting metrics (using the metrics server) and analyzing those metrics over a period of time to understand the resource requirements of the running pods. It considers factors such as historical usage patterns, spikes in resource consumption, and the configured target utilization levels. Once this analysis is complete, the VPA controller generates recommendations for adjusting the resource requests (CPU and memory) of the pods. It may recommend increasing or decreasing resource requests to better match the observed usage. This is the basis of how a VPA works, but it is not the end of the job: the VPA keeps monitoring in a feedback loop, regularly adjusting pod resources based on the latest metrics.
+
+As you might already know, these steps are largely performed by the horizontal pod autoscaler (HPA) as well. What differentiates the VPA from the HPA is how the scaling itself is carried out. When the VPA decides a pod's resource requests should change, it does not resize the pod in place. Instead, it evicts the pod, and when the pod's controller (the Deployment or StatefulSet) recreates it, the VPA's admission controller applies the updated resource requests. The effect is a rolling replacement in which the old pod with insufficient resources is swapped for a new pod with the required resource allocation.
\ No newline at end of file
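(Editor's aside: how aggressively the VPA acts on its recommendations is governed by the `updatePolicy.updateMode` field of the VPA resource. A minimal sketch of a recommendation-only VPA, which publishes recommendations without ever evicting pods, useful for observing what the VPA would do before letting it act; the `example-vpa` and `example-deployment` names are placeholders.)

```
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example-deployment
  updatePolicy:
    updateMode: "Off"  # publish recommendations only; never evict or resize pods
```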
From a7e57df599c7b2923464fa9e48a50405946adabf Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Tue, 2 Jan 2024 14:29:13 +0530
Subject: [PATCH 04/17] autoscaler cont.

---
 Autoscaler101/what-are-autoscalers.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/Autoscaler101/what-are-autoscalers.md b/Autoscaler101/what-are-autoscalers.md
index bbd535ac..87ffe7e5 100644
--- a/Autoscaler101/what-are-autoscalers.md
+++ b/Autoscaler101/what-are-autoscalers.md
@@ -6,4 +6,12 @@
 A vertical pod autoscaler (VPA) works by collecting metrics (using the metrics server) and analyzing those metrics over a period of time to understand the resource requirements of the running pods. It considers factors such as historical usage patterns, spikes in resource consumption, and the configured target utilization levels. Once this analysis is complete, the VPA controller generates recommendations for adjusting the resource requests (CPU and memory) of the pods. It may recommend increasing or decreasing resource requests to better match the observed usage. This is the basis of how a VPA works, but it is not the end of the job: the VPA keeps monitoring in a feedback loop, regularly adjusting pod resources based on the latest metrics.
 
-As you might already know, these steps are largely performed by the horizontal pod autoscaler (HPA) as well. What differentiates the VPA from the HPA is how the scaling itself is carried out. When the VPA decides a pod's resource requests should change, it does not resize the pod in place. Instead, it evicts the pod, and when the pod's controller (the Deployment or StatefulSet) recreates it, the VPA's admission controller applies the updated resource requests. The effect is a rolling replacement in which the old pod with insufficient resources is swapped for a new pod with the required resource allocation.
\ No newline at end of file
+As you might already know, these steps are largely performed by the horizontal pod autoscaler (HPA) as well. What differentiates the VPA from the HPA is how the scaling itself is carried out. When the VPA decides a pod's resource requests should change, it does not resize the pod in place. Instead, it evicts the pod, and when the pod's controller (the Deployment or StatefulSet) recreates it, the VPA's admission controller applies the updated resource requests. The effect is a rolling replacement in which the old pod with insufficient resources is swapped for a new pod with the required resource allocation.
+
+Scaling down happens in the same way, with the VPA dynamically adjusting the resource specifications of existing pods. When scaling down, it may reduce the requested CPU or memory if historical metrics indicate that the pod consistently uses less than it initially requested. As with scaling up, the change is applied indirectly, through a controlled rolling replacement in which new pods carrying the lower resource requests phase out the old ones.
+
+## Horizontal pod autoscaler
+
+A horizontal pod autoscaler works much like a VPA for the most part. It continuously monitors specified metrics, such as CPU utilization or custom metrics, for the pods it is scaling. You define a target value for the chosen metric; for example, a target CPU utilization percentage. Based on the observed metrics and the defined target, the HPA decides to either increase or decrease the number of pod replicas. The amount of resources allocated to each pod stays the same; instead, the number of pods grows to accommodate the influx of load. If there is a service associated with the pods, it will automatically load balance across the replicas without any intervention on your side.
+
+Scaling down is handled in roughly the same way: the HPA reduces the number of pod replicas, terminating existing pods to bring the replica count in line with the configured target metric. The scaling decision is based on comparing the observed metric with the target value. The HPA never modifies the resource specifications (CPU and memory requests/limits) of individual pods. Instead, it adjusts the number of replicas to match the desired metric target.
\ No newline at end of file
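(Editor's aside: the replica count the HPA aims for follows the rule of thumb documented for the controller: scale the current replica count by the ratio of observed metric to target metric, rounding up. A worked sketch with illustrative numbers.)

```
desiredReplicas = ceil( currentReplicas * currentMetricValue / targetMetricValue )

# Example: 3 replicas averaging 90% CPU utilization against an 80% target
#   ceil( 3 * 90 / 80 ) = ceil( 3.375 ) = 4 replicas
```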
From 50d97bcb86aeeab1906ec39c9c6864e71272c2e8 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Wed, 3 Jan 2024 12:10:43 +0530
Subject: [PATCH 05/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md       | 5 +++++
 Autoscaler101/what-are-autoscalers.md | 6 +++++-
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 Autoscaler101/autoscaler-lab.md

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
new file mode 100644
index 00000000..ee991788
--- /dev/null
+++ b/Autoscaler101/autoscaler-lab.md
@@ -0,0 +1,5 @@
+# Lab
+
+## Requirements
+
+You will need a Kubernetes cluster. A single-node [Minikube cluster](https://minikube.sigs.k8s.io/docs/start/) will do just fine.
\ No newline at end of file
diff --git a/Autoscaler101/what-are-autoscalers.md b/Autoscaler101/what-are-autoscalers.md
index 87ffe7e5..cf4c5e91 100644
--- a/Autoscaler101/what-are-autoscalers.md
+++ b/Autoscaler101/what-are-autoscalers.md
@@ -14,4 +14,8 @@
 A horizontal pod autoscaler works much like a VPA for the most part. It continuously monitors specified metrics, such as CPU utilization or custom metrics, for the pods it is scaling. You define a target value for the chosen metric; for example, a target CPU utilization percentage. Based on the observed metrics and the defined target, the HPA decides to either increase or decrease the number of pod replicas. The amount of resources allocated to each pod stays the same; instead, the number of pods grows to accommodate the influx of load. If there is a service associated with the pods, it will automatically load balance across the replicas without any intervention on your side.
 
-Scaling down is handled in roughly the same way: the HPA reduces the number of pod replicas, terminating existing pods to bring the replica count in line with the configured target metric. The scaling decision is based on comparing the observed metric with the target value. The HPA never modifies the resource specifications (CPU and memory requests/limits) of individual pods. Instead, it adjusts the number of replicas to match the desired metric target.
\ No newline at end of file
+Scaling down is handled in roughly the same way: the HPA reduces the number of pod replicas, terminating existing pods to bring the replica count in line with the configured target metric. The scaling decision is based on comparing the observed metric with the target value. The HPA never modifies the resource specifications (CPU and memory requests/limits) of individual pods. Instead, it adjusts the number of replicas to match the desired metric target.
+
+Now that we have thoroughly explored both types of autoscalers, let's move on to a lab where we will look at each of them in more detail.
+
+[Next: Autoscaler lab](../Autoscaler101/autoscaler-lab.md)
\ No newline at end of file
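(Editor's aside: if you need to create the cluster first, a minimal sketch, assuming Minikube and kubectl are already installed locally.)

```
minikube start
kubectl get nodes   # the single node should report a Ready status
```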
From d6f0e906d2f743014db5a505754fb43f77b93862 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Fri, 5 Jan 2024 11:30:54 +0530
Subject: [PATCH 06/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 123 +++++++++++++++++++++++++++++++-
 1 file changed, 121 insertions(+), 2 deletions(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index ee991788..a544465b 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -1,5 +1,124 @@
 # Lab
 
-## Requirements
-
-You will need a Kubernetes cluster. A single-node [Minikube cluster](https://minikube.sigs.k8s.io/docs/start/) will do just fine.
\ No newline at end of file
+You will need a Kubernetes cluster. A single-node [Minikube cluster](https://minikube.sigs.k8s.io/docs/start/) will do just fine. Once the cluster is set up, we can jump right into the lab, since almost everything needed to get an autoscaler up and running is already present within Kubernetes itself (the VPA controller is the one exception, as noted below).
+
+```
+apiVersion: autoscaling.k8s.io/v1
+kind: VerticalPodAutoscaler
+metadata:
+  name: example-vpa
+spec:
+  targetRef:
+    apiVersion: "apps/v1"
+    kind: Deployment
+    name: example-deployment
+  updatePolicy:
+    updateMode: "Auto"
+  resourcePolicy:
+    containerPolicies:
+      - containerName: "*" # Apply policies to all containers in the pod
+        minAllowed:
+          cpu: 50m
+          memory: 64Mi
+        maxAllowed:
+          cpu: 500m
+          memory: 512Mi
+```
+
+```
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: nginx-deployment
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: nginx
+  template:
+    metadata:
+      labels:
+        app: nginx
+    spec:
+      containers:
+        - name: nginx-container
+          image: nginx:1.21.5
+          resources:
+            requests:
+              cpu: 100m
+              memory: 128Mi
+            limits:
+              cpu: 200m
+              memory: 256Mi
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: nginx-service
+spec:
+  selector:
+    app: nginx
+  ports:
+    - protocol: TCP
+      port: 80
+      targetPort: 80
+```
+
+```
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: nginx-deployment
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: nginx
+  template:
+    metadata:
+      labels:
+        app: nginx
+    spec:
+      containers:
+        - name: nginx-container
+          image: nginx:1.21.5
+          resources:
+            requests:
+              cpu: 100m
+              memory: 128Mi
+            limits:
+              cpu: 200m
+              memory: 256Mi
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: nginx-service
+spec:
+  selector:
+    app: nginx
+  ports:
+    - protocol: TCP
+      port: 80
+      targetPort: 80
+---
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: nginx-hpa
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: nginx-deployment
+  minReplicas: 2
+  maxReplicas: 5
+  metrics:
+    - type: Resource
+      resource:
+        name: cpu
+        target:
+          type: Utilization
+          averageUtilization: 80
+```
\ No newline at end of file
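(Editor's aside: one caveat to the manifests above. The `VerticalPodAutoscaler` resource is not part of core Kubernetes, so its CRDs and controllers need to be installed before a VPA manifest will do anything. A sketch of the standard install from the kubernetes/autoscaler project:)

```
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh   # installs the VPA CRDs, recommender, updater, and admission controller
```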
From 69619e8b66c2d7c6a50d01704a88367372ec32b5 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Sat, 6 Jan 2024 12:23:18 +0530
Subject: [PATCH 07/17] Added deployment

---
 Autoscaler101/autoscaler-lab.md | 37 +++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index a544465b..1b56d053 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -2,6 +2,43 @@
 
 You will need a Kubernetes cluster. A single-node [Minikube cluster](https://minikube.sigs.k8s.io/docs/start/) will do just fine. Once the cluster is set up, we can jump right into the lab, since almost everything needed to get an autoscaler up and running is already present within Kubernetes itself (the VPA controller is the one exception, as noted below).
 
+We will start with a base application on which the scaling will be performed. In this case, we will use a sample nginx deployment. Create a file `nginx-deployment.yaml` and paste the contents below into it:
+
+```
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: nginx-deployment
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: nginx
+  template:
+    metadata:
+      labels:
+        app: nginx
+    spec:
+      containers:
+        - name: nginx-container
+          image: nginx:1.21.5
+          resources:
+            requests:
+              cpu: 100m
+              memory: 128Mi
+            limits:
+              cpu: 200m
+              memory: 256Mi
+```
+
+This will start nginx containers that request at least 100m CPU and 128Mi of memory, and are limited to no more than 200m CPU and 256Mi of memory. Deploy this application onto your Kubernetes cluster:
+
+```
+kubectl apply -f nginx-deployment.yaml
+```
+
+Now, when the application reaches the CPU or memory limit, application performance will suffer, since the pods are not allowed to go beyond it. So let's introduce the autoscaler.
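(Editor's aside: before introducing the autoscaler, it's worth confirming that the three replicas came up. The output below is illustrative; the pod name hashes will differ in your cluster.)

```
$ kubectl get pods -l app=nginx
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-66b6c48dd5-8tkx7   1/1     Running   0          30s
nginx-deployment-66b6c48dd5-kq2vn   1/1     Running   0          30s
nginx-deployment-66b6c48dd5-zl9wc   1/1     Running   0          30s
```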
From dfee5fb83775f7bbb04fba81d6a37f96ecafc98f Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Sun, 7 Jan 2024 10:44:37 +0530
Subject: [PATCH 08/17] autoscaler cont

---
 Autoscaler101/autoscaler-lab.md | 57 ++++++++++-----------------------
 1 file changed, 17 insertions(+), 40 deletions(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index 1b56d053..8f6d3b4f 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -5,6 +5,7 @@
 
 We will start with a base application on which the scaling will be performed. In this case, we will use a sample nginx deployment. Create a file `nginx-deployment.yaml` and paste the contents below into it:
 
+
 ```
 apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: nginx-deployment
 spec:
   replicas: 3
   selector:
     matchLabels:
       app: nginx
   template:
     metadata:
       labels:
         app: nginx
     spec:
       containers:
         - name: nginx-container
           image: nginx:1.21.5
           resources:
             requests:
               cpu: 100m
               memory: 128Mi
             limits:
               cpu: 200m
               memory: 256Mi
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: nginx-service
+spec:
+  selector:
+    app: nginx
+  ports:
+    - protocol: TCP
+      port: 80
+      targetPort: 80
 ```
 
-This will start nginx containers that request at least 100m CPU and 128Mi of memory, and are limited to no more than 200m CPU and 256Mi of memory. Deploy this application onto your Kubernetes cluster:
+This will start nginx containers that request at least 100m CPU and 128Mi of memory, and are limited to no more than 200m CPU and 256Mi of memory. It will also start a service that points to this deployment on port 80. Deploy this application onto your Kubernetes cluster:
 
 ```
 kubectl apply -f nginx-deployment.yaml
 ```
 
-Now, when the application reaches the CPU or memory limit, application performance will suffer, since the pods are not allowed to go beyond it. So let's introduce the autoscaler.
+Now, when the application reaches the CPU or memory limit, application performance will suffer, since the pods are not allowed to go beyond it. So let's introduce the autoscaler. We will start with the vertical pod autoscaler. Create a new file called `nginx-vpa.yaml` and paste the contents of the script below into it:
 
 ```
 apiVersion: autoscaling.k8s.io/v1
 kind: VerticalPodAutoscaler
 metadata:
   name: example-vpa
 spec:
   targetRef:
     apiVersion: "apps/v1"
     kind: Deployment
     name: example-deployment
   updatePolicy:
     updateMode: "Auto"
   resourcePolicy:
     containerPolicies:
       - containerName: "*" # Apply policies to all containers in the pod
         minAllowed:
           cpu: 50m
           memory: 64Mi
         maxAllowed:
           cpu: 500m
           memory: 512Mi
 ```
 
-```
-apiVersion: apps/v1
-kind: Deployment
-metadata:
-  name: nginx-deployment
-spec:
-  replicas: 3
-  selector:
-    matchLabels:
-      app: nginx
-  template:
-    metadata:
-      labels:
-        app: nginx
-    spec:
-      containers:
-        - name: nginx-container
-          image: nginx:1.21.5
-          resources:
-            requests:
-              cpu: 100m
-              memory: 128Mi
-            limits:
-              cpu: 200m
-              memory: 256Mi
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: nginx-service
-spec:
-  selector:
-    app: nginx
-  ports:
-    - protocol: TCP
-      port: 80
-      targetPort: 80
-```
+The resource itself is fairly self-explanatory: a VPA named "example-vpa" will be created.
 
 ```
 apiVersion: apps/v1
 kind: Deployment
From 45f8618aaaace1c26b4cd39abb149cc0e4e9e8e6 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Tue, 9 Jan 2024 11:46:54 +0530
Subject: [PATCH 09/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 51 +++++++++++----------------------
 1 file changed, 11 insertions(+), 40 deletions(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index 8f6d3b4f..f39f4d9a 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -57,12 +57,12 @@
 apiVersion: autoscaling.k8s.io/v1
 kind: VerticalPodAutoscaler
 metadata:
-  name: example-vpa
+  name: nginx-vpa
 spec:
   targetRef:
     apiVersion: "apps/v1"
     kind: Deployment
-    name: example-deployment
+    name: nginx-deployment
   updatePolicy:
     updateMode: "Auto"
   resourcePolicy:
     containerPolicies:
       - containerName: "*" # Apply policies to all containers in the pod
         minAllowed:
           cpu: 50m
           memory: 64Mi
         maxAllowed:
           cpu: 500m
           memory: 512Mi
 ```
 
-The resource itself is fairly self-explanatory: a VPA named "example-vpa" will be created.
+The resource itself is fairly self-explanatory. The spec section contains the specifications for the VPA. The targetRef section specifies the workload the VPA targets for autoscaling; in this example, a Deployment named "nginx-deployment". The updatePolicy section configures the update mode: in "Auto" mode, the VPA automatically applies the recommended changes to pod resources without manual intervention. The resourcePolicy section sets resource policies for individual containers within the pod. Inside it, the containerPolicies section defines per-container policies; here, a wildcard ("*") applies the policies to all containers in the pod. The minAllowed section specifies the minimum allowed resources, below which the VPA will not recommend going: 50 milliCPU (50m) and 64 mebibytes of memory (64Mi). The maxAllowed section specifies the maximum allowed resources, above which the VPA will not recommend going: 500 milliCPU (500m) and 512 mebibytes of memory (512Mi).
+
+Now deploy this into the Kubernetes cluster:
+
+```
+kubectl apply -f nginx-vpa.yaml
+```
+
+Once the deployment is complete, we need to load-test it to see the VPA in action. One important thing to note: if you set the VPA's memory/CPU limits too low, pods will be evicted and resized immediately upon creation, since the limits are reached as soon as a pod comes up. This is why it is important to know your average and peak loads before implementing a VPA.
+
+To load-test the deployment, we will be using Apache Benchmark. Install it with `apt` or `yum`; you can do the installation on the Kubernetes node itself. Next, note down the URL you want to load-test.
 
-```
-apiVersion: apps/v1
-kind: Deployment
-metadata:
-  name: nginx-deployment
-spec:
-  replicas: 3
-  selector:
-    matchLabels:
-      app: nginx
-  template:
-    metadata:
-      labels:
-        app: nginx
-    spec:
-      containers:
-        - name: nginx-container
-          image: nginx:1.21.5
-          resources:
-            requests:
-              cpu: 100m
-              memory: 128Mi
-            limits:
-              cpu: 200m
-              memory: 256Mi
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: nginx-service
-spec:
-  selector:
-    app: nginx
-  ports:
-    - protocol: TCP
-      port: 80
-      targetPort: 80
----
 apiVersion: autoscaling/v2
 kind: HorizontalPodAutoscaler
 metadata:
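(Editor's aside: for reference, the `ab` binary ships with the Apache utility packages, so the install mentioned above is a one-liner on most distros.)

```
# Debian/Ubuntu
sudo apt-get install -y apache2-utils

# RHEL/CentOS
sudo yum install -y httpd-tools
```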
From 93167b6ffb8e4a75c8512ab3f099738f573dfb31 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Thu, 11 Jan 2024 13:07:07 +0530
Subject: [PATCH 10/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 40 ++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index f39f4d9a..1cff7994 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -1,6 +1,10 @@
 # Lab
 
-You will need a Kubernetes cluster. A single-node [Minikube cluster](https://minikube.sigs.k8s.io/docs/start/) will do just fine. Once the cluster is set up, we can jump right into the lab, since almost everything needed to get an autoscaler up and running is already present within Kubernetes itself (the VPA controller is the one exception, as noted below).
+You will need a Kubernetes cluster. A single-node [Minikube cluster](https://minikube.sigs.k8s.io/docs/start/) will do just fine. Once the cluster is set up, you will have to install the metrics server, since the autoscalers use it to read resource usage metrics. To do this, run:
+
+```
+kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
+```
 
 We will start with a base application on which the scaling will be performed. In this case, we will use a sample nginx deployment. Create a file `nginx-deployment.yaml` and paste the contents below into it:
@@ -28,9 +32,6 @@
           resources:
             requests:
               cpu: 100m
               memory: 128Mi
-            limits:
-              cpu: 200m
-              memory: 256Mi
 ---
 apiVersion: v1
 kind: Service
@@ -86,7 +87,36 @@
 Once the deployment is complete, we need to load-test it to see the VPA in action. One important thing to note: if you set the VPA's memory/CPU limits too low, pods will be evicted and resized immediately upon creation, since the limits are reached as soon as a pod comes up. This is why it is important to know your average and peak loads before implementing a VPA.
 
-To load-test the deployment, we will be using Apache Benchmark. Install it with `apt` or `yum`; you can do the installation on the Kubernetes node itself. Next, note down the URL you want to load-test.
+To load-test the deployment, we will be using Apache Benchmark. Install it with `apt` or `yum`; you can do the installation on the Kubernetes node itself. Next, note down the URL you want to load-test. To get this, use:
+
+```
+kubectl get svc
+```
+
+This will list all the services. Pick the nginx service from the list, copy its IP, and run Apache Benchmark against it as below:
+
+```
+ab -n 1000 -c 50 http://<service-ip>/
+```
+
+This command sends 1000 requests with a concurrency of 50 to the nginx service. You can adjust the -n (total requests) and -c (concurrency) parameters to match your load-testing requirements, and then analyze the results: Apache Benchmark provides detailed output, including requests per second (RPS), connection times, and more. For example:
+
+```
+Connection Times (ms)
+              min  mean[+/-sd] median   max
+Connect:        0    1   2.8      0      10
+Processing:   104  271 144.3    217    1184
+Waiting:      104  270 144.2    217    1184
+Total:        104  272 144.5    217    1185
+```
+
+Now it's time to check whether autoscaling has started:
+
+```
+kubectl get po -n default
+```
+
+Watch the pods, and you will see the resource limits being reached, after which a new pod with more resources is created. Keep an eye on the resource usage and you will notice that the new pods have higher limits.

From 40a44e61afd02b25790b1c0ba6276adf45af5986 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Fri, 12 Jan 2024 11:23:43 +0530
Subject: [PATCH 11/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index 1cff7994..f6f59f15 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -116,7 +116,9 @@
 kubectl get po -n default
 ```
 
-Watch the pods, and you will see the resource limits being reached, after which a new pod with more resources is created. Keep an eye on the resource usage and you will notice that the new pods have higher limits.
+Watch the pods, and you will see the resource limits being reached, after which a new pod with more resources is created. Keep an eye on the resource usage and you will notice that the new pods have higher limits. Once the requests have been handled, the pods will quickly reduce their resource consumption. However, a new pod with lower resource requirements will not immediately show up to replace the old one; in fact, if you were to push a new version of the deployment into the cluster, it would still have headroom for a large number of requests. This headroom will eventually shrink if resource consumption stays low.
+
+Now that we have taken a complete look at the vertical pod autoscaler, let's look at the HPA.
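(Editor's aside: you can also ask the VPA directly what it is currently recommending while the load test runs. A trimmed sketch of the relevant part of the output; the numbers shown are illustrative.)

```
$ kubectl describe vpa nginx-vpa
...
Status:
  Recommendation:
    Container Recommendations:
      Container Name:  nginx-container
      Lower Bound:
        Cpu:     100m
        Memory:  128Mi
      Target:
        Cpu:     250m
        Memory:  300Mi
      Upper Bound:
        Cpu:     500m
        Memory:  512Mi
```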
From 24e2c0cc1a65a574c096590ae9a15f3bb0ee35d0 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Sun, 14 Jan 2024 12:21:15 +0530
Subject: [PATCH 12/17] Autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index f6f59f15..e4aff627 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -118,7 +118,7 @@
 kubectl get po -n default
 ```
 
-Now that we have taken a complete look at the vertical pod autoscaler, let's look at the HPA.
+Now that we have taken a complete look at the vertical pod autoscaler, let's look at the HPA. Create a file `nginx-hpa.yaml` and paste the contents below into it.
 
+```
 apiVersion: autoscaling/v2
 kind: HorizontalPodAutoscaler
 metadata:
   name: nginx-hpa
 spec:
   scaleTargetRef:
     apiVersion: apps/v1
     kind: Deployment
     name: nginx-deployment
   minReplicas: 2
   maxReplicas: 5
   metrics:
     - type: Resource
       resource:
         name: cpu
         target:
           type: Utilization
           averageUtilization: 80
-```
\ No newline at end of file
+```
+
+TODO: ADD HPA description
+
+Before you deploy this file into your cluster, make sure to remove the VPA first, since having two types of autoscalers running against the same pods can cause some obvious problems. So first run:
+
+```
+kubectl delete -f nginx-vpa.yaml
+```
+
+Then deploy the HPA:
+
+```
+kubectl apply -f nginx-hpa.yaml
+```
+
+You can see the status of the HPA as it starts up using `describe`:
+
+```
+kubectl describe hpa nginx-hpa
+```
+
+You might see some errors about the HPA being unable to retrieve metrics; these can be ignored, since the issue only occurs while the HPA is starting up for the first time.
\ No newline at end of file

From e67d1cd60966e6a53c8b867d84a40a4b87f6f405 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Tue, 16 Jan 2024 11:44:33 +0530
Subject: [PATCH 13/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index e4aff627..49fd2fd5 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -140,7 +140,15 @@
           averageUtilization: 80
 ```
 
-TODO: ADD HPA description
+The above HPA definition has a lot of similarities to the VPA definition. The differences lie in the minReplicas and maxReplicas sections, which define the minimum and maximum number of pod replicas the HPA should maintain. In this case, it is set to a minimum of 2 replicas and a maximum of 5. The VPA didn't have the metrics section that the HPA has, but its resourcePolicy section is fairly similar: metrics configures the metric used for autoscaling. In this example, it is the CPU utilization metric. `type: Resource` specifies that the metric is a resource metric (in this case, CPU). The `resource` section specifies the resource metric details. `name: cpu` indicates that the metric is CPU utilization.
+
+target section:
+
+Specifies the target value for the metric.
+
+type: Utilization: Indicates that the target is based on resource utilization.
+
+averageUtilization: 80: Sets the target average CPU utilization to 80%.
 
 Before you deploy this file into your cluster, make sure to remove the VPA first, since having two types of autoscalers running against the same pods can cause some obvious problems. So first run:
From 456bba2f7221f75fe48d83e2d1e9aca843fce72f Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Wed, 17 Jan 2024 11:44:52 +0530
Subject: [PATCH 14/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 16 ++++------------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index 49fd2fd5..9f5c77d8 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -140,15 +140,7 @@
           averageUtilization: 80
 ```
 
-The above HPA definition has a lot of similarities to the VPA definition. The differences lie in the minReplicas and maxReplicas sections, which define the minimum and maximum number of pod replicas the HPA should maintain. In this case, it is set to a minimum of 2 replicas and a maximum of 5. The VPA didn't have the metrics section that the HPA has, but its resourcePolicy section is fairly similar: metrics configures the metric used for autoscaling. In this example, it is the CPU utilization metric. `type: Resource` specifies that the metric is a resource metric (in this case, CPU). The `resource` section specifies the resource metric details. `name: cpu` indicates that the metric is CPU utilization.
-
-target section:
-
-Specifies the target value for the metric.
-
-type: Utilization: Indicates that the target is based on resource utilization.
-
-averageUtilization: 80: Sets the target average CPU utilization to 80%.
+The above HPA definition has a lot of similarities to the VPA definition. The differences lie in the minReplicas and maxReplicas sections, which define the minimum and maximum number of pod replicas the HPA should maintain. In this case, it is set to a minimum of 2 replicas and a maximum of 5. The VPA didn't have the metrics section that the HPA has, but its resourcePolicy section is fairly similar: metrics configures the metric used for autoscaling. In this example, it is the CPU utilization metric. `type: Resource` specifies that the metric is a resource metric (in this case, CPU). The `resource` section specifies the resource metric details, and `name: cpu` indicates that the metric is CPU utilization. The target section specifies the target value for the metric: `type: Utilization` indicates that the target is based on resource utilization, and `averageUtilization` sets the target average CPU utilization to 80%.
@@ -168,4 +160,8 @@
 kubectl describe hpa nginx-hpa
 ```
 
-You might see some errors about the HPA being unable to retrieve metrics; these can be ignored, since the issue only occurs while the HPA is starting up for the first time.
\ No newline at end of file
+You might see some errors about the HPA being unable to retrieve metrics; these can be ignored, since the issue only occurs while the HPA is starting up for the first time. Now, let's go back to Apache Benchmark and add load to the nginx service so that we can see the HPA in action. Start it up in the same manner as before:
+
+```
+ab -n 1000 -c 50 http://<service-ip>/
+```
\ No newline at end of file
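(Editor's aside: while the benchmark runs, watching the HPA itself is a good complement to watching the pods. The TARGETS column shows observed versus target utilization; the output below is illustrative.)

```
$ kubectl get hpa nginx-hpa --watch
NAME        REFERENCE                     TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa   Deployment/nginx-deployment   38%/80%    2         5         2          2m
nginx-hpa   Deployment/nginx-deployment   142%/80%   2         5         4          3m
```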
From 1fa5f17d063c322ae7f26c3cbcef622ba1290ac2 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Thu, 18 Jan 2024 14:29:11 +0530
Subject: [PATCH 15/17] autoscaler cont.

---
 Autoscaler101/autoscaler-lab.md | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index 9f5c77d8..c9311d0a 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -129,7 +129,7 @@
   scaleTargetRef:
     apiVersion: apps/v1
     kind: Deployment
     name: nginx-deployment
-  minReplicas: 2
+  minReplicas: 1
   maxReplicas: 5
   metrics:
@@ -164,4 +164,12 @@
 ```
 ab -n 1000 -c 50 http://<service-ip>/
 ```
+
+A thousand requests should now be hitting the service. Start watching the nginx pods to see whether new replicas are being created:
+
+```
+kubectl get po -n default --watch
+```
+
+You should see the CPU utilization target being exceeded, after which the number of pods increases. This keeps happening until the pod count reaches the specified maximum (5) or the observed utilization falls back below the target.
\ No newline at end of file

From c656f817d460b943f9d3af587be22bc24bab9c06 Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Fri, 19 Jan 2024 16:00:16 +0530
Subject: [PATCH 16/17] autoscaler finished

---
 Autoscaler101/autoscaler-lab.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Autoscaler101/autoscaler-lab.md b/Autoscaler101/autoscaler-lab.md
index c9311d0a..af3eb697 100644
--- a/Autoscaler101/autoscaler-lab.md
+++ b/Autoscaler101/autoscaler-lab.md
@@ -172,4 +172,9 @@
 kubectl get po -n default --watch
 ```
 
-You should see the CPU utilization target being exceeded, after which the number of pods increases. This keeps happening until the pod count reaches the specified maximum (5) or the observed utilization falls back below the target.
\ No newline at end of file
+You should see the CPU utilization target being exceeded, after which the number of pods increases. This keeps happening until the pod count reaches the specified maximum (5) or the observed utilization falls back below the target.
+
+## Conclusion
+
+That sums up the lab on autoscalers. Here we discussed the two most commonly used built-in autoscalers, the HPA and the VPA, and took a hands-on look at how each of them works. This is just the tip of the iceberg when it comes to scaling, however; the subject of custom scalers that scale on metrics other than memory and CPU is vast. If you are interested in more advanced scaling techniques, take a look at the [KEDA section](../Keda101/what-is-keda.md) for an introduction to the KEDA autoscaler.
\ No newline at end of file
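(Editor's aside: to reset the cluster after the lab, a minimal cleanup sketch, assuming the file names used above.)

```
kubectl delete -f nginx-hpa.yaml
kubectl delete -f nginx-deployment.yaml
```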
From 2fe2dab7499396c9f2aa90e68497bb87e6125e2a Mon Sep 17 00:00:00 2001
From: Phantom-Intruder
Date: Sat, 20 Jan 2024 09:49:06 +0530
Subject: [PATCH 17/17] Finished autoscalers

---
 README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/README.md b/README.md
index a4fef8e5..d91f18be 100644
--- a/README.md
+++ b/README.md
@@ -243,6 +243,11 @@ A Curated List of Kubernetes Labs and Tutorials
   - [Fluent Bit](./Logging101/fluentdbit.md)
   - [ELK on Kubernetes](./Logging101/elk-on-kubernetes.md)
 
+## Autoscalers101
+
+ - [What are autoscalers](./Autoscaler101/what-are-autoscalers.md)
+ - [Autoscaler lab](./Autoscaler101/autoscaler-lab.md)
+
 ## Helm101
 
 - [What is Helm?](./Helm101/what-is-helm.md)