How not to sacrifice user experience in pursuit of Kubernetes cost optimization

October 6, 2023

By Google Cloud Blog

The success of any application depends on the user experience it provides. It’s no secret that a positive user experience is directly linked to customer satisfaction and long-term engagement. Cost optimization is important to an application’s success too, of course. In pursuing cost optimization, Kubernetes administrators and application developers attempt to balance reliability and cost. But according to our recent State of Kubernetes Cost Optimization report, user experience can be compromised when cost optimization efforts overlook reliability. This emphasizes the critical need to prioritize user experience while considering cost-optimization measures.

“End user experience can be compromised when cost optimization efforts don’t consider reliability.” (State of Kubernetes Cost Optimization report)

In previous blog posts, we shed light on the foundational steps required to optimize applications, including setting resource requests and rightsizing workloads. In this post, we’ll explore how Kubernetes scheduling can lead to decisions that compromise user experience, especially when dealing with Pods in the BestEffort or Burstable Quality of Service (QoS) classes. By the end, you will have a better understanding of the key factors that impact user experience and how to prioritize them.

Many platform admins look for ways to lower costs when dealing with Kubernetes scheduling. Two typical moves are acting on insufficient information and aggressively scaling down nodes because bin-packing metrics show them as underutilized. However, the State of Kubernetes Cost Optimization report shows that clusters with a high number of BestEffort Pods typically exhibit low bin packing. This looks like an opportunity to scale down the cluster, but doing so results in BestEffort Pods being terminated more frequently, which can hurt the user experience.
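To see why a node full of BestEffort Pods can look underutilized, here is a minimal sketch. It assumes bin packing is measured as requested CPU divided by allocatable CPU; the Pod names, the numbers, and the helper functions are all hypothetical and only illustrate the effect.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    cpu_request_m: int  # requested CPU in millicores; 0 when no request is set
    cpu_usage_m: int    # observed CPU usage in millicores


def bin_packing_ratio(pods, allocatable_cpu_m):
    """Requested CPU divided by allocatable CPU for one node (assumed definition)."""
    return sum(p.cpu_request_m for p in pods) / allocatable_cpu_m


def utilization_ratio(pods, allocatable_cpu_m):
    """Observed CPU usage divided by allocatable CPU for one node."""
    return sum(p.cpu_usage_m for p in pods) / allocatable_cpu_m


# Hypothetical node: one Pod with a request, two BestEffort Pods without.
node_pods = [
    Pod("api", cpu_request_m=500, cpu_usage_m=450),
    Pod("batch-a", cpu_request_m=0, cpu_usage_m=1200),
    Pod("batch-b", cpu_request_m=0, cpu_usage_m=1100),
]
ALLOCATABLE_CPU_M = 4000

packing = bin_packing_ratio(node_pods, ALLOCATABLE_CPU_M)
usage = utilization_ratio(node_pods, ALLOCATABLE_CPU_M)
print(f"bin packing: {packing:.1%}, actual usage: {usage:.1%}")
```

In this made-up example the node appears lightly packed by requests while the BestEffort Pods account for most of the real usage, so a scale-down decision based on requests alone would evict work that users depend on.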

It’s essential to recognize that user satisfaction is at the heart of any successful application. Consequently, any cost optimization strategy that neglects the needs of end users is highly likely to encounter challenges, emphasizing the importance of prioritizing reliability in all cost optimization decisions.

The unpredictability factor, and its impact on customer experience

Both BestEffort and Burstable Pods offer resource-management flexibility. However, they come with their own set of challenges, especially when they aren’t configured correctly. Kubernetes doesn’t explicitly alert users when Pods are abruptly terminated due to node pressure, since it automatically creates new Pods to replace them. This behavior preserves continuity but can hide underlying issues, which makes proactive monitoring necessary.

BestEffort Pods have the following characteristics:

Because they have no explicit resource requests or limits, they have the highest chance of being abruptly killed when the node comes under resource pressure.
Their unpredictability can harm your users’ experience.

Burstable Pods, meanwhile:

Risk being abruptly killed when they consistently consume more than their specified requests.
Can frequently exceed their declared resource requests, which can lead to resource starvation that adversely affects other workloads, especially those in lower QoS classes like BestEffort.
Are unpredictable, which can harm your users’ experience.

This is why it’s critical to set resource requests for workloads that have a minimum level of required reliability. It is also worth checking how many workloads in your cluster are not setting requests.
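As a rough illustration of how a Pod ends up in one of these classes, the sketch below mirrors the documented Kubernetes rules for CPU and memory: no requests or limits anywhere means BestEffort, requests equal to limits for every container means Guaranteed, and anything else is Burstable. The function name and the dictionary layout are assumptions made for this example, and quantities are compared as plain strings, so equivalent values such as "1" and "1000m" are not treated as equal.

```python
def qos_class(containers):
    """Classify a Pod from its containers' resources.

    containers: list of dicts shaped like
        {"requests": {"cpu": "250m"}, "limits": {"memory": "256Mi"}}
    Simplification: quantities are compared as strings, not parsed.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{}]))
print(qos_class([{"requests": {"cpu": "250m"}}]))
print(qos_class([{"requests": {"cpu": "500m", "memory": "256Mi"},
                  "limits": {"cpu": "500m", "memory": "256Mi"}}]))
```

Running a check like this over exported Pod specs is one way to count the workloads that fall into BestEffort before making any scale-down decision.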

Estimating resources for smooth user experiences

Estimating the precise resources an application requires can be a challenging endeavor, especially given the dynamic nature of modern applications. However, there are tools and features to alleviate this guessing game. Google Kubernetes Engine (GKE) offers built-in Vertical Pod Autoscaler (VPA) recommendations for workloads right within its UI (even if you do not deploy VPA resources for your workload). This, coupled with Cloud Monitoring, provides valuable insights into the actual resource usage of your applications. Moreover, if you wish to have a more structured approach, there’s a guide on exporting workload recommendations as detailed in the workload rightsizing blog. After establishing and setting resource requests for all your workloads, you can further refine and scale the environment using tools such as Horizontal Pod Autoscaler, which automatically adjusts the number of instances required based on usage patterns, and Cluster Autoscaler, which scales cluster nodes as required. Both tools improve resource-allocation management while prioritizing reliability.
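The reason requests need to come first becomes clearer from the scaling rule the Kubernetes documentation gives for Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target, rounded up. The sketch below is a simplified version of that rule; the function name is hypothetical, the 10% tolerance reflects the documented default, and details such as readiness handling and stabilization windows are left out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified HPA rule: ceil(current_replicas * current_metric / target_metric).

    No change is made when the ratio is within `tolerance` of 1.0.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Average CPU utilization is expressed as a percentage of the Pods' requests.
print(desired_replicas(4, current_metric=90, target_metric=60))  # scale up
print(desired_replicas(4, current_metric=63, target_metric=60))  # within tolerance
print(desired_replicas(4, current_metric=30, target_metric=60))  # scale down
```

Because CPU utilization targets are expressed relative to requests, a workload without requests gives the autoscaler nothing meaningful to compare against.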

Setting up monitoring and alerting is another essential aspect of maintaining high-quality end-user experiences. Continuously observing your application’s performance and resource usage, along with BestEffort workloads and Burstable workloads that use more than their requested resources, provides insights that administrators can act on before issues escalate. By setting up alerts based on predefined thresholds or unusual patterns, teams can be notified of potential problems immediately and resolve them quickly. GKE Interactive Playbooks offer pre-built views to help monitor for Crashlooping Pods and Unschedulable Pods. This proactive approach minimizes disruptions to the end-user experience, even when it cannot prevent them entirely. In essence, consistent monitoring and timely alerts act as the first line of defense in safeguarding the experience that users expect.
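A threshold-based check of this kind can be sketched in a few lines. The example below assumes you already have usage and request figures per workload (for instance, exported from your monitoring system); the function name, the tuple layout, and the sample data are hypothetical and not part of any GKE or Cloud Monitoring API.

```python
def over_request(samples, threshold=1.0):
    """Flag workloads with no request or with usage above threshold * request.

    samples: iterable of (workload, usage, request) tuples in the same unit.
    """
    flagged = []
    for workload, usage, request in samples:
        if request == 0:
            flagged.append((workload, "no request set"))
        elif usage / request > threshold:
            flagged.append((workload, f"usage at {usage / request:.0%} of request"))
    return flagged


# Hypothetical CPU figures in millicores.
samples = [
    ("api", 450, 500),
    ("worker", 900, 600),
    ("batch", 300, 0),
]
alerts = over_request(samples)
for workload, reason in alerts:
    print(f"{workload}: {reason}")
```

In practice the threshold and the evaluation window would be tuned per workload, so that short bursts do not page anyone while sustained overuse does.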

Kubernetes cost optimization should not be a trade-off between reliability and cost. As highlighted in this blog post, it’s essential to prioritize user experience when making any cost-optimization decision. Nor should Kubernetes platform admins overlook reliability in pursuit of cost optimization. To strike the right balance, workloads must be rightsized, and a combination of well-optimized resource allocation and reliability features must be used. By doing so, admins can ensure that cost-optimization measures are robust and sustainable, and prioritize the user experience for long-term engagement and growth.

Remember, before attempting to scale down your cluster, it’s essential to set appropriate resource requests. Also be sure to download the State of Kubernetes Cost Optimization report, review the key findings, and stay tuned for our next blog post!
