October 4, 2021
Autoscaling in Kubernetes

Kubernetes has taken the cloud-native landscape by storm. According to a 2019 Sysdig study, Kubernetes orchestrated 77 percent of containers at the companies surveyed and held an 89 percent share across Sysdig’s customer base. But what advantages make Kubernetes so popular, and why should companies consider implementing this cloud-native platform as a core part of their infrastructure?

One of the primary benefits of Kubernetes is its autoscaling features. These features ensure your applications don’t use more cloud resources than they need. They also give your applications the right number of pods and nodes, so end-users experience optimal performance while you save on cloud-computing costs.

OVHcloud offers managed Kubernetes that takes full advantage of its benefits while freeing up your DevOps teams to focus on deploying software and applications. Whether you need to manage cloud costs or improve your infrastructure’s capacity and agility, we offer various cloud services that help your business grow.

What Is Kubernetes?

Kubernetes is an open-source, cloud-native container management platform that allows DevOps teams to automate the deployment of services and applications, automate operations, and automate the scaling of their containers. It is well suited to microservice architectures, which can shorten the time to market for software and applications.

Kubernetes also improves the containerization process in the following ways:

  • Autoscaling containers when applications need more CPU or RAM
  • Self-healing containers
  • Automatic placement of containers onto nodes based on declared CPU and RAM requirements
  • Secret management for passwords, OAuth tokens, and SSH keys
  • Load balancing that distributes network traffic when request volume is high

What Is Autoscaling?

Kubernetes automatically scales your cluster up when you need more CPU and RAM. It also automatically scales down when you need less. The result is that you don’t use nodes in the cluster when you don’t need them, equating to increased savings.

To understand this better, consider the following example: you run a 24/7 service with variable loads throughout the day. During the day, it is busy in the US, but at night, it slows down considerably. The Cluster Autoscaler and the Horizontal Pod Autoscaler in Kubernetes adjust capacity to meet end-user demand. This combination results in significant cloud cost savings.

The Major Cost-Benefit of Autoscaling in Kubernetes

The most apparent benefits of autoscaling in Kubernetes are the resource and cost savings it provides. Without autoscaling, DevOps teams have to manually assign resources every time conditions change. Manual scaling inevitably results in sub-optimal resource allocation and cloud spending.

With manual scaling, you pay for resources at peak capacity to ensure availability. Additionally, your services can still fail during spikes because you can’t anticipate fluctuating needs. These types of crashes can cause substantial losses. Not only do you lose money to misallocated resources, but you also have to account for the damage done to customer relationships and your brand reputation.
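
To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python. The demand profile and the hourly node price are invented for illustration only, and the autoscaled case is idealized (capacity follows demand exactly, with no scaling delay).

```python
# Hypothetical demand: 10 nodes needed for 12 busy hours, 2 nodes for 12 quiet hours.
HOURLY_NODE_DEMAND = [10] * 12 + [2] * 12
NODE_PRICE_PER_HOUR = 0.10  # assumed price, not a real quote


def daily_cost_static(demand, price):
    """Cost when the cluster is permanently sized for the peak hour."""
    return max(demand) * len(demand) * price


def daily_cost_autoscaled(demand, price):
    """Cost when the node count follows demand hour by hour (idealized)."""
    return sum(demand) * price


static = daily_cost_static(HOURLY_NODE_DEMAND, NODE_PRICE_PER_HOUR)
scaled = daily_cost_autoscaled(HOURLY_NODE_DEMAND, NODE_PRICE_PER_HOUR)
savings = 1 - scaled / static
print(f"static: {static:.2f}, autoscaled: {scaled:.2f}, savings: {savings:.0%}")
```

With these assumed numbers, sizing for the peak costs noticeably more than following demand. Real savings depend on your own load curve and on how quickly the autoscalers react.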

Kubernetes Autoscaling Methods

Kubernetes offers autoscaling capabilities at both the application abstraction level and the infrastructure layer. At the application abstraction level, autoscaling works in two ways: horizontal autoscaling and vertical autoscaling.

Autoscaling at the Application Abstraction Level

Kubernetes uses both horizontal and vertical autoscaling to adjust the available resources for your containers. The Horizontal Pod Autoscaler (HPA) performs horizontal scaling while the Vertical Pod Autoscaler (VPA) performs vertical scaling.

What’s the Difference Between Horizontal Autoscaling and Vertical Autoscaling?

The difference between horizontal and vertical autoscaling is that horizontal autoscaling adjusts the number of pods, while vertical autoscaling adjusts the CPU and RAM allocated to each pod.

Horizontal Autoscaling

The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The HPA controller periodically analyzes metrics and adjusts the number of replicas within the bounds specified by the administrator. By default, this assessment runs every 15 seconds. However, you can change the interval with the controller manager’s horizontal-pod-autoscaler-sync-period flag.

The HPA runs as a control loop: on each sync, the controller manager compares observed resource utilization against the target metrics in the Horizontal Pod Autoscaler definition and decides what scaling measures to take. To avoid autoscaler thrashing (useless scaling in response to frequent metric fluctuations), the HPA holds off after a scaling event. Older Kubernetes releases waited three minutes after scaling up and five minutes after scaling down by default, while more recent releases apply a five-minute stabilization window to scale-downs only.
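
As an illustration of what the administrator specifies, the sketch below builds a Horizontal Pod Autoscaler manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The names web-hpa and web, the replica bounds, and the 60 percent CPU target are placeholders. The sketch assumes the autoscaling/v2 API, so older clusters may need a different apiVersion.

```python
import json


def build_hpa_manifest(name, deployment, min_replicas, max_replicas, cpu_target_percent):
    """Return a HorizontalPodAutoscaler manifest (autoscaling/v2) as a dict."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count the HPA adjusts.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Scale on average CPU utilization relative to the pods' CPU requests.
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_target_percent,
                        },
                    },
                }
            ],
        },
    }


# Placeholder values for illustration only.
manifest = build_hpa_manifest("web-hpa", "web", 2, 10, 60)
print(json.dumps(manifest, indent=2))
```

You could save the printed JSON to a file and apply it with kubectl apply -f, provided the target Deployment exists and its containers declare CPU requests.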
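
The decision the control loop makes can be sketched in a few lines of Python. The replica formula and the 10 percent tolerance follow the Kubernetes documentation for the HPA. The function names, the default bounds, and the simplified stabilization helper are assumptions made for this example and are not the controller’s actual code.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count the HPA would aim for, given one averaged metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band, the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Never go outside the bounds set by the administrator.
    return max(min_replicas, min(max_replicas, wanted))


def stabilized_replicas(recent_recommendations):
    """Simplified scale-down stabilization: act on the highest
    recommendation seen during the stabilization window."""
    return max(recent_recommendations)


# 4 replicas at 90% average CPU against a 60% target.
print(desired_replicas(4, 90, 60))
# Recommendations collected during the window, oldest first.
print(stabilized_replicas([6, 3, 4]))
```

Taking the highest recent recommendation means a brief dip in load doesn’t immediately remove pods, which is the anti-thrashing behavior described above.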

Vertical Autoscaling

Vertical autoscaling entails adjusting the CPU and memory allocated to your pods. The VPA is a relatively new tool in Kubernetes, and it utilizes shorter default periods of 10 seconds to measure metrics. The VPA is mainly useful for stateful services and for handling out-of-memory (OOM) events, but you can also use it to auto-correct the initial resource allocation of your pods.

To apply new resource values, the VPA has to restart the pods. When it does, it respects the PodDisruptionBudget (PDB), so the minimum number of required pods stays available.

The most important aspect of vertical scaling is that it lets you define upper and lower resource bounds that your pods can’t exceed. These bounds keep each pod’s CPU and memory allocation within a range you control.
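
The effect of those upper and lower bounds can be sketched as a simple clamp. In a VPA object, the bounds are set through the minAllowed and maxAllowed fields of the resource policy. The function below is a hypothetical illustration of the idea, with made-up values in millicores and MiB, not the VPA’s own implementation.

```python
def clamp_recommendation(recommended, min_allowed, max_allowed):
    """Keep each recommended resource value inside its allowed range."""
    return {
        resource: max(min_allowed[resource], min(max_allowed[resource], value))
        for resource, value in recommended.items()
    }


# Illustrative values: CPU in millicores, memory in MiB.
recommended = {"cpu_millicores": 50, "memory_mib": 4096}
min_allowed = {"cpu_millicores": 100, "memory_mib": 128}
max_allowed = {"cpu_millicores": 2000, "memory_mib": 2048}

print(clamp_recommendation(recommended, min_allowed, max_allowed))
```

Here the CPU recommendation is raised to the lower bound and the memory recommendation is capped at the upper bound.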

Infrastructure Autoscaling

Pod autoscaling is an efficient way to achieve high performance and minimize cloud costs. However, apps with heavy usage can also need more nodes than the cluster currently has. In these situations, cluster autoscaling is the most effective autoscaling method.

Cluster Autoscaling

The Cluster Autoscaler (CA) manages the number of nodes within a cluster. It watches for pods that stay in a pending state because no node has enough free capacity for them. The CA then assesses how many nodes are required and requests them from the cloud provider.

Once the provider allocates new nodes, the Kubernetes scheduler places the pending pods on them. The CA identifies and confirms pending pods within 30 seconds. For scaling down, however, the CA waits ten minutes before reducing resources. Because of this delay, you should use cluster autoscaling sparingly. It limits your downward scaling flexibility and can result in misallocated resources and unnecessary cloud costs.
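
The two decisions described above can be sketched roughly in Python. This is not the Cluster Autoscaler’s real algorithm: it packs pending pods by CPU request only, using a simple first-fit-decreasing heuristic, and it assumes every new node has the same capacity. The function names and numbers are hypothetical, and the 600-second default mirrors the ten-minute wait mentioned above.

```python
def nodes_needed(pending_requests, node_capacity):
    """Estimate how many new nodes are needed to fit the pending pods.

    pending_requests: CPU requests of pending pods, in millicores.
    node_capacity: allocatable CPU of one new node, in millicores.
    """
    free = []  # remaining capacity on each new node
    for request in sorted(pending_requests, reverse=True):
        if request > node_capacity:
            raise ValueError("pod does not fit on a node of this size")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request
                break
        else:
            # No existing new node has room, so add another one.
            free.append(node_capacity - request)
    return len(free)


def can_remove_node(unneeded_seconds, scale_down_wait_seconds=600):
    """A node is a removal candidate only after being unneeded long enough."""
    return unneeded_seconds >= scale_down_wait_seconds


print(nodes_needed([500, 500, 1500, 1000], 2000))
print(can_remove_node(300), can_remove_node(600))
```

The asymmetry is visible here: a scale-up decision follows directly from the pending pods, while a scale-down has to wait out the timer.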

Kubernetes Autoscaling Best Practices

It’s critical to understand that you need to apply these autoscaling methods in the right way to save money and resources. You shouldn’t simply turn them on and expect them to work their magic. They are tools that have specific uses.

Here are some of the most valuable ways to apply autoscaling to your applications:

  • Use the latest version of Kubernetes and the latest versions of HPA, VPA, and CA. Using the most recent version of both Kubernetes and the scaling controllers helps you avoid compatibility conflicts.
  • Don’t run horizontal and vertical autoscaling on the same set of pods when both act on CPU or memory metrics, because the two autoscalers currently conflict with each other.
  • Kubernetes recommends using the CA with homogeneous nodes because the CA expects the nodes in a group to have the same resource capacity.
  • Specify resource requests to avoid misguided autoscaling.
  • Specify a PodDisruptionBudget so the CA can’t reduce nodes below your requirement.
  • The CA has been tested on clusters of up to 1,000 nodes with 30 pods each. Beyond those levels, you risk cluster sprawl, an occurrence that results in sub-optimal resource allocation.
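
Two of these practices lend themselves to a short sketch: checking that every container declares resource requests, and defining a PodDisruptionBudget. The helper names, the web label, and the sample pod spec below are hypothetical. The PDB manifest assumes the policy/v1 API.

```python
import json


def containers_missing_requests(pod_spec):
    """Return the names of containers without both CPU and memory requests."""
    missing = []
    for container in pod_spec.get("containers", []):
        requests = container.get("resources", {}).get("requests", {})
        if "cpu" not in requests or "memory" not in requests:
            missing.append(container["name"])
    return missing


def build_pdb_manifest(name, app_label, min_available):
    """Return a PodDisruptionBudget manifest (policy/v1) as a dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


# Sample pod spec for illustration: the sidecar has no requests.
pod_spec = {
    "containers": [
        {"name": "web", "resources": {"requests": {"cpu": "250m", "memory": "256Mi"}}},
        {"name": "sidecar", "resources": {}},
    ]
}

print(containers_missing_requests(pod_spec))
print(json.dumps(build_pdb_manifest("web-pdb", "web", 2), indent=2))
```

A check like the first function could run in a CI pipeline before manifests are applied, so autoscalers always have requests to work from.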

Managed Kubernetes - The Additional Autoscaling Benefit

As you can see, autoscaling using cloud-native Kubernetes provides critical cost benefits for companies. However, using autoscaling incorrectly can cause companies to lose out on these savings. Using a managed Kubernetes provider such as OVHcloud takes the autoscaling burden off your DevOps team’s shoulders.

Using OVHcloud’s managed Kubernetes ensures you use the right amount of cloud resources at all times. We deploy autoscaling to your applications in an optimal way, maximizing your cloud-native savings.

Contact OVHcloud today to learn more about Kubernetes and its many benefits for your DevOps team. From improved productivity to increased application stability, performance, and deployment speed, Kubernetes is the wave of the future. OVHcloud can ensure your business takes full advantage of its benefits.
