Welcome to the Linux Foundation Forum!

Cost Optimization With Kubernetes

Over the past two years at Magalix, we have focused on building our system, introducing new features, and scaling our infrastructure and microservices. During this time, we took a look at our Kubernetes clusters' utilization and found it to be very low. We were paying for resources we didn't use, so we started a cost-saving practice to increase cluster utilization, make use of the resources we already had, and pay less to run our clusters.

In this article, I will discuss the top five techniques we used to better utilize our Kubernetes clusters on the cloud and eliminate wasted resources, thus saving money. In the end, we were able to cut our monthly bill by more than 50%!

  1. Applying Workload Right-Sizing
    Kubernetes manages and schedules pods based on container resource specs:

Resource Requests: the Kubernetes scheduler uses these to place containers on a node that has enough capacity
Resource Limits: containers are NOT allowed to use more than their resource limit

Resource requests and limits are container-scoped specs, so a multi-container pod defines separate resource specs for each container:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: magalix
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 1
            memory: 1Gi
Kubernetes schedules pods based on resource requests and other restrictions without impairing availability. The scheduler uses CPU and memory resource requests to place workloads on the right nodes, controlling which pods run on which node and whether multiple pods can be scheduled together on a single node.

Every node type has its own allocatable CPU and memory capacity. Assigning unnecessarily high CPU or memory resource requests can leave you running underutilized pods on each node, which in turn leads to underutilized nodes.

In this section, we compared resource requests and limits against actual usage, then changed the resource requests to values closer to the actual utilization, plus a small safety margin.
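The right-sizing step can be sketched as a small calculation: take observed usage samples for a container and derive a new request from a high percentile plus a safety margin. The sample data, the percentile, and the 20% margin below are illustrative assumptions, not the actual numbers used at Magalix:

```python
# Sketch: derive a right-sized CPU request from observed usage samples.
# The samples and the 20% safety margin are illustrative assumptions.

def right_size(usage_samples_millicores, percentile=0.95, margin=0.20):
    """Return a suggested CPU request (in millicores) from usage samples."""
    ordered = sorted(usage_samples_millicores)
    # Index of the chosen percentile (nearest-rank method).
    idx = min(len(ordered) - 1, int(percentile * len(ordered)))
    observed = ordered[idx]
    return int(observed * (1 + margin))

# A container requested 1000m but actually uses far less:
samples = [120, 90, 150, 110, 130, 95, 140, 100, 125, 135]
suggestion = right_size(samples)
print(suggestion)  # → 180
```

Here the container could drop its request from 1000m to roughly 180m, freeing node capacity for other pods.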

  2. Choosing The Right Worker Nodes
    Every Kubernetes cluster has its own particular workload utilization. Some clusters use more memory than CPU (e.g., database and caching workloads), while others use more CPU than memory (e.g., user-interactive and batch-processing workloads).

Cloud providers such as GCP and AWS offer various node types that you can choose from.

Choosing the wrong node size for your cluster can end up costing you. For instance, nodes with a high CPU-to-memory ratio running memory-intensive workloads can easily starve for memory, triggering node scale-ups and wasting CPUs we don't need.

Calculating the right ratio of CPU-to-memory isn’t easy; you will need to monitor and know your workloads well.
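One way to start is to aggregate the CPU and memory requests of all workloads and look at the cluster-wide memory-to-CPU ratio, then pick a machine family with a similar shape. A minimal sketch (the pod requests below are made-up examples):

```python
# Sketch: estimate the cluster-wide memory-to-CPU ratio from pod requests,
# to help pick a machine family with a similar shape. Values are examples.

pods = [
    {"cpu_millicores": 500, "memory_mib": 2048},   # cache
    {"cpu_millicores": 1000, "memory_mib": 1024},  # api
    {"cpu_millicores": 250, "memory_mib": 4096},   # db
]

total_cpu = sum(p["cpu_millicores"] for p in pods) / 1000  # in vCPUs
total_mem = sum(p["memory_mib"] for p in pods) / 1024      # in GiB

ratio = total_mem / total_cpu  # GiB of memory per vCPU
print(f"{ratio:.1f} GiB per vCPU")  # → 4.0 GiB per vCPU
```

In practice you would pull these numbers from live metrics (usage, not just requests), but the same arithmetic applies.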

For example, GCP offers general-purpose, compute-optimized, and memory-optimized machine types with various CPU and memory counts and ratios.

Just keep in mind that 1 vCPU is far more expensive than 1GB of memory. The clusters I manage have plenty of memory, so I try to make sure that when a pod is pending, it is pending on CPU (the expensive resource), so the autoscaler triggers a scale-up to add a new node.

To see the cost difference between CPU and memory, let us look at the GCP N2 machine price. GCP gives you the freedom to choose a custom machine type:

(# vCPUs x price per 1 vCPU) + (# GB memory x price per 1GB memory)

It's clear here that 1 vCPU costs 7.44 times more than 1GB of memory.
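Plugging list prices into the formula makes it concrete. The per-hour prices below are illustrative on-demand N2 numbers (they vary by region and change over time), so treat the exact ratio as an example rather than a quote:

```python
# Sketch: custom N2 machine cost and the vCPU-to-memory price ratio.
# Prices are illustrative assumptions; check GCP's pricing page for real values.

VCPU_PRICE_PER_HOUR = 0.031611   # USD per vCPU-hour (assumed)
MEM_PRICE_PER_HOUR = 0.004237    # USD per GB-hour (assumed)

def custom_machine_hourly_cost(vcpus, memory_gb):
    return vcpus * VCPU_PRICE_PER_HOUR + memory_gb * MEM_PRICE_PER_HOUR

ratio = VCPU_PRICE_PER_HOUR / MEM_PRICE_PER_HOUR
print(f"1 vCPU costs {ratio:.2f}x as much as 1GB of memory")
print(f"4 vCPU / 16 GB: ${custom_machine_hourly_cost(4, 16):.4f}/hour")
```

With these assumed prices the ratio comes out around 7.5x, in the same ballpark as the 7.44x figure above.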

To Read More: https://hubs.ly/H0zwcrJ0

Comments

  • ahannanec (Posts: 1)

    Optimizing Kubernetes costs is a common challenge, but with the right strategies, you can significantly reduce cloud spending without sacrificing performance or reliability. Below is a structured approach based on real-world implementations.

    Key Strategies for Kubernetes Cost Optimization

    • Right-size your resource requests and limits – Many teams over-provision CPU and memory “just in case.” Analyze historical usage with tools like the Vertical Pod Autoscaler (VPA) or open-source metrics (Prometheus + Grafana) to set accurate requests and limits. This alone can cut waste by 30–50%.

    • Implement multi-level autoscaling – Combine the Horizontal Pod Autoscaler (HPA) to scale pod replicas based on demand with a Cluster Autoscaler to add or remove nodes. For more advanced just-in-time provisioning, consider Karpenter, which selects optimal instance types and reduces idle node costs.

    • Leverage spot instances for fault-tolerant workloads – Spot instances (or preemptible VMs) offer 60–90% discounts. Use them for batch jobs, stateless services, or development environments. Always pair with node affinity/anti-affinity rules to avoid critical workloads being evicted.

    • Eliminate zombie and unused resources – Orphaned Persistent Volumes, load balancers, or unused namespaces can linger unnoticed. Regularly audit your cluster with tools like kubectl or commercial scanners to clean up idle resources.

    • Optimize cross-zone networking costs – In many cloud providers, data transfer between availability zones incurs charges. Keep pod-to-pod traffic within the same zone when possible, or use topology spread constraints to balance replicas without excessive cross-zone communication.

    • Implement FinOps with granular cost visibility – Use Kubecost or OpenCost to allocate costs per namespace, label, or even pod. Set up budget alerts and namespace resource quotas to prevent runaway spending. Show developers their real-time cloud bill per feature or team.
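The FinOps point can be illustrated with a minimal cost-allocation sketch: attribute each pod's hourly cost to its namespace based on its resource requests and per-unit prices. The namespaces, requests, and prices here are hypothetical; tools like Kubecost and OpenCost do this properly from live cluster and billing data:

```python
# Sketch: allocate hourly cost per namespace from pod resource requests.
# Namespaces, requests, and unit prices are hypothetical examples.
from collections import defaultdict

VCPU_PRICE = 0.0316   # USD per vCPU-hour (assumed)
MEM_PRICE = 0.0042    # USD per GB-hour (assumed)

pods = [
    {"namespace": "api", "vcpus": 2.0, "memory_gb": 4.0},
    {"namespace": "api", "vcpus": 1.0, "memory_gb": 2.0},
    {"namespace": "batch", "vcpus": 4.0, "memory_gb": 8.0},
]

costs = defaultdict(float)
for pod in pods:
    costs[pod["namespace"]] += (
        pod["vcpus"] * VCPU_PRICE + pod["memory_gb"] * MEM_PRICE
    )

for ns, cost in sorted(costs.items()):
    print(f"{ns}: ${cost:.4f}/hour")
```

Even a rough per-namespace breakdown like this is enough to set budget alerts and spot the top spenders before investing in a full FinOps stack.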

    Recommended Workflow for Sustained Savings

    1. Measure – Deploy a cost monitoring tool (e.g., Kubecost) to establish a baseline.
    2. Analyze – Identify top spenders (namespaces, workloads, nodes).
    3. Optimize – Apply rightsizing, autoscaling, and spot instances iteratively.
    4. Govern – Enforce resource quotas and budget alerts.
    5. Repeat – Review monthly as workloads evolve.

    How EaseCloud Can Help

    Implementing these strategies in-house can be time‑consuming. At EaseCloud, our Cloud Cost Optimization service provides end-to-end Kubernetes cost management – from rightsizing and autoscaling configuration to FinOps dashboard setup. We also offer specialized Kubernetes Consulting to help you containerize and orchestrate with maximum efficiency.
