Kubernetes v1.36: In-Place Vertical Scaling for Pod-Level Resources Graduates to Beta
The Kubernetes community has reached another milestone with v1.36: in-place vertical scaling for pod-level resources is now graduating to Beta and is enabled by default. If you have been following the feature's journey through earlier releases, this is a meaningful step toward stability. For those new to the concept, it addresses one of Kubernetes' long-standing operational challenges: adjusting the CPU and memory requests and limits of a running pod without forcing a restart. Previously, if you realized a pod needed more resources, you had to terminate it and redeploy it with a new specification, interrupting service. In-place vertical scaling eliminates that painful workflow.
Here's how the feature works under the hood. When you update a pod's resource requests or limits, the kubelet detects the change and applies it to the running containers without recreating the pod. The key technical challenge is doing this gracefully: the kubelet must coordinate with the container runtime to resize allocated resources while the workload keeps running. For CPU, the resize happens through cgroup updates. Memory adjustments require more care: whether a container must be restarted to apply a memory change is governed by its per-resource resize policy, and shrinking a memory limit below current usage may not take effect immediately. Either way, the pod's identity and IP address remain intact. The control plane tracks these changes so the scheduler accounts for the pod's new resource footprint in future placement decisions. This is fundamentally different from horizontal scaling (adding more pods); it is about right-sizing individual pod capacity on the fly.
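To make this concrete, here is a sketch of a pod spec that opts into in-place resizing. The pod name, container name, and image are placeholders; the per-resource `resizePolicy` field shown is the container-level knob that tells the kubelet whether applying a change requires a container restart:

```yaml
# Illustrative manifest, not a drop-in example: names are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired       # apply CPU changes in place
    - resourceName: memory
      restartPolicy: RestartContainer  # restart this container to apply memory changes
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "1"
        memory: "512Mi"
```

With `NotRequired` (the default), the kubelet updates cgroup values directly; `RestartContainer` trades a container restart for a guaranteed clean application of the new limit, while the pod itself, including its IP, stays put.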
The practical benefits become obvious when you consider real-world scenarios. Imagine a data processing pipeline where you underestimated memory requirements for a batch job. Instead of killing the job halfway through and restarting from scratch, you can now increase memory while it runs. Similarly, if you’re running interactive services and notice CPU contention, you can bump up CPU allocations during peak hours without service disruption. For teams running cost optimization initiatives, this feature enables automatic right-sizing workflows—your monitoring system can detect that a pod consistently uses only 30% of allocated resources and shrink it, freeing capacity for other workloads.
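For the scenarios above, the resize itself is just a patch against the pod's `resize` subresource. A hedged sketch, assuming a pod named `resize-demo` with a container named `app` as in a typical manifest:

```shell
# Raise CPU for a running pod in place via the resize subresource
# (pod and container names are placeholders; adjust to your workload).
kubectl patch pod resize-demo --subresource resize \
  --patch '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"800m"},"limits":{"cpu":"1500m"}}}]}}'
```

Shrinking an over-provisioned pod works the same way with smaller values, which is what an automated right-sizing controller would issue on your behalf.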
Graduating to Beta with default enablement matters because it signals stability and a path to production readiness. If you are running Kubernetes v1.36, the feature is likely already available to you. We recommend testing it in non-critical environments first: monitor how your workloads respond to resource adjustments and validate that your dashboards accurately reflect the changes. This is especially valuable if you automate resource management with tools like the Vertical Pod Autoscaler or build custom controllers that react to utilization metrics. As the feature matures toward GA, expect it to become a standard tool in your Kubernetes operational toolkit.
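When validating in a test environment, one useful check is comparing what you asked for against what the kubelet actually applied: the pod status reports the resources in effect at the runtime. A sketch, again assuming a pod named `resize-demo`:

```shell
# Desired resources (spec) vs. actually applied resources (status).
kubectl get pod resize-demo -o jsonpath='{.spec.containers[0].resources}'
kubectl get pod resize-demo -o jsonpath='{.status.containerStatuses[0].resources}'
```

If the two disagree for longer than expected, the pod's status conditions are the place to look for a pending or in-progress resize.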