
Compute node reboot cadence

I was having a discussion with co-workers the other day. They were adamant that k8s nodes should be rebooted at least every quarter to alleviate application memory leaks, among other things. Their argument was that applications, including the cgroups on the Linux server, would continuously leak memory and that the only way to clean up the compute nodes was to reboot them. The question I put forth that no one could answer was: shouldn't a memory leak in the application be taken care of by the developers? Also, if cgroups were such a problem that the Linux OS needed to be rebooted because of them, why haven't I read any bulletins pointing out such a case?

So I figured I would pose the question here: is this a needed process for a k8s cluster, or is it overkill? Background on the environment: a k8s v1.18.17 cluster running on CentOS 7.9.

Comments

  • chrispokorni Posts: 2,372

    Hi @jmik618,

    A few blog posts and forum threads discuss memory leaks in cluster agents and/or applications on earlier Kubernetes releases combined with some very specific application deployments. However, they describe very specific conditions, which cannot be generalized.

    If cluster node reboots are desired, such reboots should be performed in a rolling fashion to minimize and/or eliminate any negative impact to the workload deployed to the cluster. And if control plane reboots are also planned, the control plane should be configured for HA.

    Regards,
    -Chris
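
As an illustration of the rolling approach described above, here is a minimal Python sketch that only builds the ordered list of commands for rebooting nodes one at a time; it does not run anything. The node names, the SSH-based reboot step, and the timeout values are assumptions made for the example and are not from the thread. Depending on the workload, the drain step may need additional flags (for example, for pods that use local storage).

```python
from typing import List


def rolling_reboot_plan(
    nodes: List[str],
    drain_timeout: str = "300s",   # assumed value, tune for your workloads
    ready_timeout: str = "600s",   # assumed value, tune for your hardware
) -> List[List[str]]:
    """Return the ordered commands for rebooting nodes one at a time.

    Each node is fully drained, rebooted, and uncordoned before the
    next node is touched, so only one node is out of service at once.
    """
    plan: List[List[str]] = []
    for node in nodes:
        # Drain marks the node unschedulable and evicts its pods.
        plan.append(["kubectl", "drain", node,
                     "--ignore-daemonsets", f"--timeout={drain_timeout}"])
        # Assumption: the node is reachable over SSH under its node name.
        plan.append(["ssh", node, "sudo", "systemctl", "reboot"])
        # Caveat: a node can still report Ready for a short while after
        # the reboot starts, so a real script should first wait for the
        # node to go NotReady (or simply pause) before this check.
        plan.append(["kubectl", "wait", "--for=condition=Ready",
                     f"node/{node}", f"--timeout={ready_timeout}"])
        # Allow pods to be scheduled on the node again.
        plan.append(["kubectl", "uncordon", node])
    return plan


if __name__ == "__main__":
    # Dry run: print the commands for two hypothetical worker nodes.
    for cmd in rolling_reboot_plan(["worker-1", "worker-2"]):
        print(" ".join(cmd))
```

Printing the plan first makes it easy to review the order of operations before wiring it into anything that actually executes commands.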
