Kubernetes v1.35: New level of efficiency with in-place Pod restart
• The release of Kubernetes 1.35 introduces a much-requested capability: the ability to trigger a full, in-place restart of a Pod.
• This feature, Restart All Containers (alpha in 1.35), offers a more efficient way to reset a Pod's state than the resource-intensive approach of deleting and recreating the entire Pod.
• It is especially useful for AI/ML workloads: application developers can concentrate on their core training logic while offloading complex failure handling and recovery to sidecars and declarative Kubernetes configuration.
• With RestartAllContainers and other planned enhancements, Kubernetes continues to add building blocks for flexible, robust, and efficient AI/ML platforms.
• The new functionality is available by enabling the RestartAllContainersOnContainerExits feature gate.
• This alpha feature extends the Container Restart Rules feature, which graduated to beta in Kubernetes 1.35.
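
The declarative configuration mentioned above might look roughly like the following sketch, written in Python so the manifest can be generated and checked programmatically. Only the feature gate name and the RestartAllContainers action come from the announcement. The field layout (`restartPolicyRules`, `action`, `exitCodes`) follows the Container Restart Rules API as we understand it and should be verified against the v1.35 API reference; the Pod name, container names, images, and exit code 88 are hypothetical.

```python
import json

# Feature gate named in the announcement; it must be enabled on the cluster
# before the API server accepts the rule below.
FEATURE_GATE = "RestartAllContainersOnContainerExits"


def build_pod_manifest(trigger_exit_code: int = 88) -> dict:
    """Build a Pod manifest in which a sidecar exit restarts every container.

    Assumption: the rule layout mirrors the Container Restart Rules API.
    All names, images, and the default exit code are hypothetical.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "ml-worker"},  # hypothetical name
        "spec": {
            "restartPolicy": "Never",
            "initContainers": [
                {
                    # Sidecar that watches the job and exits with a chosen
                    # code when the whole Pod should be reset in place.
                    "name": "watcher-sidecar",
                    "image": "example.com/watcher:1.0",  # hypothetical image
                    "restartPolicy": "Always",
                    "restartPolicyRules": [
                        {
                            "action": "RestartAllContainers",
                            "exitCodes": {
                                "operator": "In",
                                "values": [trigger_exit_code],
                            },
                        }
                    ],
                }
            ],
            "containers": [
                # Main workload; it carries no recovery logic of its own.
                {"name": "trainer", "image": "example.com/trainer:1.0"}
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_pod_manifest(), indent=2))
```

Saving the printed JSON to a file and passing it to `kubectl apply -f` would submit the Pod. Placing the rule on a sidecar rather than on the training container keeps the recovery trigger separate from the application code, which matches the division of labor the announcement describes.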

Article Summaries:

  • Kubernetes v1.35 introduces an alpha feature, enabled by the RestartAllContainersOnContainerExits feature gate, that lets users trigger a full in-place restart of all containers in a Pod. This offers an alternative to the costly delete-and-recreate workflow, avoiding scheduling, node allocation, and networking overhead. The change is especially valuable for AI/ML workloads, where thousands of Pods may need to reset after a node failure, reducing resource waste and operational complexity. The feature builds on Container Restart Rules, which graduated to beta in the same release, and aims to provide a native, efficient recovery path for multi-container applications.

Sources: