LFS258 Lab11.1 and Lab11.2 Installations and disk-pressure on local environment

mfriedemann · April 2024

Hello,

Since last week I have been having multiple problems installing linkerd (and linkerd-viz) and the ingress. The installation now works (after adding more virtual disk space, thanks to the recommendation from https://forum.linuxfoundation.org/discussion/864045/recommendation-on-20gb-per-node-vm-disk-space-to-avoid-evicted-pod-death-spiral?utm_source=community-search&utm_medium=organic-search&utm_term=lab+11.1+disk+prss ), and I could work through everything up to Lab11.2.10.
For context, here are the specs of my local machine and the VirtualBox setup for the nodes, which I changed slightly from the recommended specs (the cp node now has a bit more memory, and both nodes got an additional 20GB dynamically allocated virtual drive, added after the load had already become too high. It seems these drives are not really being used, though, as their actual size has not increased yet):

I noticed recurring problems with too many pods running and/or being evicted and stuck in Pending:

The common theme was, of course, disk pressure.
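To confirm which node is actually reporting pressure, the node conditions can be read from the JSON that `kubectl get nodes -o json` returns. The following is only a minimal sketch: the helper names `nodes_under_pressure` and `fetch_nodes` and the sample data are made up for illustration, and only the standard node condition types (`DiskPressure`, `MemoryPressure`, `PIDPressure`) are assumed.

```python
import json
import subprocess

# Standard node condition types that indicate resource pressure.
PRESSURE_TYPES = {"DiskPressure", "MemoryPressure", "PIDPressure"}


def nodes_under_pressure(node_list):
    """Map node name -> sorted pressure conditions whose status is "True"."""
    result = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        active = sorted(
            cond["type"]
            for cond in conditions
            if cond.get("type") in PRESSURE_TYPES and cond.get("status") == "True"
        )
        if active:
            result[name] = active
    return result


def fetch_nodes():
    """Hypothetical helper: read the live node list through kubectl."""
    completed = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(completed.stdout)


# Illustrative sample data (not taken from the cluster in this thread).
sample = {
    "items": [
        {"metadata": {"name": "cp"},
         "status": {"conditions": [
             {"type": "DiskPressure", "status": "True"},
             {"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "worker"},
         "status": {"conditions": [
             {"type": "DiskPressure", "status": "False"}]}},
    ]
}
print(nodes_under_pressure(sample))
```

On a running cluster, `nodes_under_pressure(fetch_nodes())` would do the same against the real nodes.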

I am not sure how to really solve this, as the 8GB of memory on my local machine is already over its limit and swap rises to 1GB (swap only on my local machine; the nodes are of course set to swapoff -a and the swapfile line in /etc/fstab is commented out).

I would be thankful for any kind of feedback. Maybe the only way around this is to use GCE, but I think it is understandable that I would rather not recreate all the labs.

chrispokorni · April 2024

Hi @mfriedemann,

The ready-for.sh script is not quite relevant to the LFS258 class.

As far as the disk size goes, dynamically allocated disk space is not favored by kubelet, and a single disk size of 10 GB may no longer be sufficient to run Kubernetes and the necessary plugins for this training. The 20GB+ disk space recommendation from the discussion linked above is correct, but what it did not clearly specify is how to achieve that. From personal experience I agree with the 20GB+ recommendation (at times I go up to 50GB per VM), but I attach a single disk of 20GB+ per VM, rather than having multiple 10GB disks stacked to achieve the required size.
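A rough way to see why 10 GB gets tight: assuming the kubelet runs with its default hard-eviction thresholds on Linux (`nodefs.available<10%`, `imagefs.available<15%`) and that images and pod data share one disk, evictions can start once free space drops below roughly 15% of the disk. The sketch below only does that arithmetic; the function names are illustrative, and a cluster with customized thresholds would behave differently.

```python
import shutil

# Kubelet default hard-eviction thresholds on Linux, in percent of capacity.
# Assumption: the cluster has not overridden these in its kubelet config.
NODEFS_MIN_FREE_PCT = 10   # nodefs.available<10%
IMAGEFS_MIN_FREE_PCT = 15  # imagefs.available<15%


def min_free_gb(disk_gb, pct=IMAGEFS_MIN_FREE_PCT):
    """Free space (GB) that must remain to stay above the given threshold."""
    return disk_gb * pct / 100


def would_signal_pressure(total_bytes, free_bytes, pct=IMAGEFS_MIN_FREE_PCT):
    """True if the free share of the filesystem is below pct percent."""
    return free_bytes * 100 < total_bytes * pct


if __name__ == "__main__":
    for size_gb in (10, 20, 50):
        print(f"{size_gb} GB disk: keep at least {min_free_gb(size_gb)} GB free")
    # Check the filesystem this script runs on (run it on the node itself).
    usage = shutil.disk_usage("/")
    print("below 15% free:", would_signal_pressure(usage.total, usage.free))
```

With a 10 GB disk that leaves about 8.5 GB for the OS, container images, logs, and pod data together, which is why a single larger disk gives much more headroom.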

Hope this helps you to move past this issue.

Regards,
-Chris

mfriedemann · April 2024

Thanks for the quick response. As my own machine only has a 100GB hard drive, it might be difficult to get nodes of that size, and more CPU and memory might also be needed. But it is worth a try.

If not... onwards to GCE, I guess.
Is there a list of to-dos from earlier labs to get back to Lab 11 quicker? I noticed that some labs require pods, ReplicaSets and namespaces created in earlier labs. It is a given that one will also have to install software and redo the whole cluster upgrade from Lab 4.1 as part of those previous labs.

chrispokorni · April 2024

Hi @mfriedemann,

For a local cluster with one control plane and one worker node you'd need a 20GB disk and 2 CPUs on each node, with 8 GB RAM for the control plane and 4-8 GB RAM for the worker. If this is not possible due to physical hardware limitations, then the GCE option would be quite inexpensive. You can stop the GCE VMs between sessions to save on costs - the cluster will reconnect once the VMs are started.
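As a quick sanity check, these per-node figures can be added up and compared with the host mentioned earlier in the thread (8 GB RAM, 100 GB hard drive). This is only back-of-the-envelope arithmetic: it uses the lower bound of 4 GB for the worker, ignores what the host OS itself needs, and the names `NodeSpec`, `totals` and `shortfalls` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    name: str
    cpus: int
    ram_gb: int
    disk_gb: int


def totals(nodes):
    """Sum the resources requested by all VMs."""
    return {
        "cpus": sum(n.cpus for n in nodes),
        "ram_gb": sum(n.ram_gb for n in nodes),
        "disk_gb": sum(n.disk_gb for n in nodes),
    }


def shortfalls(nodes, host):
    """Return how much each host resource falls short of the VM totals."""
    need = totals(nodes)
    return {key: need[key] - have for key, have in host.items() if need[key] > have}


# Figures from the recommendation above (worker at its 4 GB minimum).
cluster = [
    NodeSpec("cp", cpus=2, ram_gb=8, disk_gb=20),
    NodeSpec("worker", cpus=2, ram_gb=4, disk_gb=20),
]
# Host figures as stated in this thread; CPU count is not given, so it is omitted.
host = {"ram_gb": 8, "disk_gb": 100}

print(totals(cluster))
print(shortfalls(cluster, host))
```

Under these assumptions the disks fit, but the VMs alone ask for 12 GB of RAM on an 8 GB host, so memory rather than disk is the hard limit locally.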

What you will need for sure to quickly get to Lab 11 are:
- Lab 3.1 Install Kubernetes
- Lab 3.2 Grow the cluster
- Lab 3.3 Finish cluster setup
- Lab 4.1 Basic node maintenance - Upgrade the cluster (optional). If you keep working on k8s v1.28.1 all the way to lab 11 you should be fine. If you figured out how to install v1.29.1 in Labs 3.1 and 3.2, instead of v1.28.1 -- all the better
- Lab 9.1 Deploy a new service - you need the accounting namespace, the nginx-one deployment with the corrected containerPort: 80 in the nginx-one.yaml definition manifest, the worker node labeled
- Lab 9.2 Configure a nodeport - you need the service-lab nodeport service
- Lab 10 - only the commands from steps 1, 2, 3 to install Helm

Overall, I think these would be the steps that prepare the cluster for Lab 11. I apologize in advance if I am missing anything else that may be needed.
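Once those labs are redone, a small script can confirm that the named objects exist before starting Lab 11. The sketch below is an assumption-laden convenience, not part of the course material: it searches all namespaces for `nginx-one` and `service-lab` because the thread does not state where they live, and it does not check the worker node label or the `containerPort` value. The helper names are hypothetical.

```python
import subprocess


def _find_by_name(kind, name):
    """kubectl command that lists objects of `kind` called `name` in any namespace."""
    return ["kubectl", "get", kind, "--all-namespaces",
            "--field-selector", f"metadata.name={name}", "-o", "name"]


# (label, command) pairs; a check passes if the command succeeds with output.
CHECKS = [
    ("accounting namespace",
     ["kubectl", "get", "namespace", "accounting", "-o", "name"]),
    ("nginx-one deployment", _find_by_name("deployment", "nginx-one")),
    ("service-lab service", _find_by_name("service", "service-lab")),
    ("Helm client", ["helm", "version", "--short"]),
]


def run_check(cmd, runner=subprocess.run):
    """Return True if cmd exits with 0 and prints something to stdout."""
    try:
        result = runner(cmd, capture_output=True, text=True, timeout=30)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False  # binary not installed, or cluster not answering
    return result.returncode == 0 and bool(result.stdout.strip())


if __name__ == "__main__":
    for label, cmd in CHECKS:
        status = "ok" if run_check(cmd) else "MISSING"
        print(f"{status:8} {label}")
```

Run it on the control plane node, where kubectl is already configured for the cluster.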

To ensure the GCE, VPC, firewall configs are correct please follow the GCE setup demo video from the introductory chapter.

Good luck!

Regards,
-Chris

mfriedemann · April 2024

Hi Chris ( @chrispokorni ),

Thank you so much for that list!
I tried to resize my local virtual disk and partition, but that proved more difficult than I thought ("gparted" is not available on the terminal, and "parted" needs the drive not to be in use while the VM is running... well), and the 8GB total RAM of my machine is still the bottleneck.

I think I will try to redo everything necessary on GCE over the next few days, and this will help a lot!
I had a few bumps while working through the labs, but those were mostly specific to running everything locally (such as NodePort not working from the get-go, or ifconfig never giving the right IP, though I already knew the correct one from logging on via ssh (PuTTY), and so on...).
Setting everything up might take a while, but I did not have any problems with updating as far as I remember. Fortunately I am already used to redoing Lab 9, as not everything was working well with linkerd and uninstalling wasn't possible. It was good to have some snapshots of my nodes from around Lab 7 available.
But I am glad to know that I don't have to redo everything.
Thanks again!

Best regards,
Marvin (mfriedemann)
