Lab 16.2 HA multiple master nodes cluster when using cri-o steps??

josephkwong · November 2021

Hi Chris,

I've successfully installed and configured a cluster with 1 master and 1 worker node, following the steps listed in Lab 3.1 and using the cri-o container runtime.

However, while going through the Ex 16.1 HA steps, I noticed the 'Very Important' note stating that using the cri-o container runtime requires creating a NEW CLUSTER and passing DIFFERENT OPTIONS to kubeadm init. I am now trying to create a new cluster with 3 new master nodes, with the previously created HAProxy acting as the load balancer for the 3 master nodes, as per this lab scenario.

I'd therefore like your comments on whether the following steps are correct for setting up a new HA cluster when using cri-o. If not, please provide the corrected steps and commands.

For the first master (cp), the second master (cp2), the third master (cp3), and all worker nodes:
Follow the LAB_16.2 Detailed Steps - Install Software
step 1. commands ->
step 2. commands ->
step 3. (b) IF you chose cri-o for the cp and worker ->
(which is to follow LAB_3.1 step 5.(b) using CRI-O commands)
step 4. all commands
For the first master (cp):
Follow the LAB_3.1
steps 6. - 11.
step 13.
root@cp:~# vim /etc/hosts
-ha-proxy k8scp
-cp2
-cp3
-wk
...
step 14. - IF USING CRI-O
root@cp:~# find /home -name kubeadm-crio.yaml
root@cp:~# nano /kubeadm-crio.yaml
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  name: k8scp
kind: ClusterConfiguration
kubernetesVersion: 1.21.1
root@cp:~# cp .
step 15. initialize the cp.
root@cp:~# kubeadm init --config=kubeadm-config.yaml --upload-certs \
| tee kubeadm-init.out
[init] Using Kubernetes version: v1.21.1
[preflight] Running pre-flight checks
```
<output_omitted>
```
step 16. - 17.
For the second master (cp2) and the third master (cp3):
Follow the LAB_16.2 Detailed Steps - Join Control Plane Nodes
step 1.
root@cp2/3:~# vim /etc/hosts
-ha-proxy k8scp
-cp2
-cp3
-wk
step 2. or 3.-to-5.
step 6.
root@cp2/3 $ sudo kubeadm join k8scp:6443 \
--token xxxx.. --discovery-token-ca-cert-hash sha256:xxxx.. \
--control-plane --certificate-key xxxx..
step 7.

Thanks again and have a blessed day,
Joseph

chrispokorni · November 2021

Hi @josephkwong,

Both chapters 3 and 16 provide detailed steps to set up the first control-plane node, the worker, then the additional two control-plane nodes and the haproxy. The cri-o installation steps can be found in chapter 3 as well, and referenced whenever necessary for the worker, second and third control-plane nodes. As cri-o installation is much more complex than docker installation, extra care must be taken to ensure all install and config steps are properly executed.

Regards,
-Chris

josephkwong · November 2021

Thanks, Chris. Here is the outcome:

The first master node setup is okay and running. The output is attached.
Joining the second master node 'cp2' to the control plane FAILED. The output is attached.

Please share your comments.

Have a great day,
Joseph

chrispokorni · November 2021

Hi @josephkwong,

What pod network is used by Calico and by the kubeadm-crio.yaml config file?
What IP range is used by the hypervisor when provisioning VMs?

If the podSubnet overlaps the IP range of the VMs, cluster internal traffic will be impacted, causing communication issues between nodes, control-plane pods and client workload pods.

Regards,
-Chris

josephkwong · November 2021

Hi Chris,

The private IP range used by the hypervisor is 192.168.3.1 to 192.168.3.253. The first master 'cp' node IP is 192.168.3.67.

Welcome to Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-89-generic x86_64)

Documentation: https://help.ubuntu.com
Management: https://landscape.canonical.com
Support: https://ubuntu.com/advantage

System information as of Sat Nov 13 08:03:52 UTC 2021

System load: 0.24 Processes: 252
Usage of /: 17.5% of 30.88GB Users logged in: 0
Memory usage: 16% IPv4 address for ens160: 192.168.3.67
Swap usage: 0% IPv4 address for tunl0: 192.168.74.128

jlab@cp:~$ ip a | grep inet
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
inet 192.168.3.67/24 brd 192.168.3.255 scope global ens160
inet6 fe80::250:56ff:fe02:39b/64 scope link
inet 192.168.74.128/32 scope global tunl0
inet6 fe80::ecee:eeff:feee:eeee/64 scope link
inet6 fe80::ecee:eeff:feee:eeee/64 scope link
inet6 fe80::ecee:eeff:feee:eeee/64 scope link

Calico:
jlab@cp:/$ kubectl calico get ippools -o wide
NAME                  CIDR             NAT    IPIPMODE   VXLANMODE   DISABLED   DISABLEBGPEXPORT   SELECTOR
default-ipv4-ippool   192.168.0.0/16   true   Always     Never       false      false              all()

jlab@cp:~$ kubectl get all -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/calico-kube-controllers-5d995d45d6-nspnl 1/1 Running 0 39h 192.168.74.129 k8scp
pod/calico-node-gcvds 1/1 Running 0 39h 192.168.3.67 k8scp
pod/coredns-558bd4d5db-bs8qb 1/1 Running 0 40h 192.168.74.131 k8scp
pod/coredns-558bd4d5db-z75dj 1/1 Running 0 40h 192.168.74.130 k8scp
pod/etcd-k8scp 1/1 Running 0 40h 192.168.3.67 k8scp
pod/kube-apiserver-k8scp 1/1 Running 0 40h 192.168.3.67 k8scp
pod/kube-controller-manager-k8scp 1/1 Running 0 40h 192.168.3.67 k8scp
pod/kube-proxy-m8g76 1/1 Running 0 40h 192.168.3.67 k8scp
pod/kube-scheduler-k8scp 1/1 Running 0 40h 192.168.3.67 k8scp

Please see the attached config and output files:
1. yaml - calico, kubeadm-crio
2. output file - kubeadm-init.out
3. config file - calico-kubeconfig, 100-crio-bridge.conf
4. journalctl file - crio-journalctl.log
5. dns alias host file - hosts

Thanks again and have a blessed weekend.
Joseph

chrispokorni · November 2021

Hi @josephkwong,

As I suspected, your network subnets overlap, causing major issues with your cluster.
First I would recommend re-building your cluster with distinct networks for nodes and pods - try 10.200.0.0/16 as VM/node subnet managed by the hypervisor's DHCP server.
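
To illustrate the overlap, here is a minimal Python sketch using the standard-library ipaddress module. It assumes the node subnet is 192.168.3.0/24 (taken from the `ip a` output earlier in this thread) and the pod range is the 192.168.0.0/16 Calico pool. The helper name subnets_overlap is made up for this example.

```python
import ipaddress


def subnets_overlap(node_cidr: str, pod_cidr: str) -> bool:
    """Return True when the node/VM subnet and the pod subnet share addresses."""
    node_net = ipaddress.ip_network(node_cidr)
    pod_net = ipaddress.ip_network(pod_cidr)
    return node_net.overlaps(pod_net)


# Values reported earlier in this thread
current_nodes = "192.168.3.0/24"   # cp node address is 192.168.3.67/24
calico_pool = "192.168.0.0/16"     # CIDR of default-ipv4-ippool
proposed_nodes = "10.200.0.0/16"   # node subnet suggested above

print(subnets_overlap(current_nodes, calico_pool))   # True: the ranges collide
print(subnets_overlap(proposed_nodes, calico_pool))  # False: the ranges are distinct
```

The same check can be run with any candidate node subnet before rebuilding the VMs.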

Second, I noticed the kubeadm join command uses an IP address as the advertised control-plane endpoint, which makes the HA configuration impossible to achieve. To fix it, edit the kubeadm-crio.yaml file as follows:

1 - for the InitConfiguration resource update the nodeRegistration.name to be cp or the HOSTNAME of your control-plane node.
2 - for the ClusterConfiguration resource add a new property controlPlaneEndpoint: "k8scp:6443" and double-check the desired value of kubernetesVersion:. After the init completion, the suggested join command should display kubeadm join k8scp:6443 --token abcdef...

As you initialize the cluster and join the first worker node, edit the /etc/hosts files as follows:
a) on the control plane node add its Private IP with k8scp alias.
Example: 10.200.0.5 k8scp
b) on the worker node add the Private IP of the control plane node with k8scp alias, same entry as (a).
Example: 10.200.0.5 k8scp

Later, for HA configuration these aliases will be replaced with the ha-proxy Private IP, as instructed in the lab guide.
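
As a quick sanity check, a small Python sketch like the one below can confirm that both nodes map the k8scp alias to the same IP. The function name resolve_alias and the sample hosts content are illustrative assumptions. In practice you would read each node's own /etc/hosts file.

```python
def resolve_alias(hosts_text: str, alias: str):
    """Return the IP mapped to `alias` in /etc/hosts-style text, or None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into "IP name [name ...]"
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and alias in entry[1:]:
            return entry[0]
    return None


# Hypothetical file contents, using the example entry from (a) and (b)
cp_hosts = "127.0.0.1 localhost\n10.200.0.5 k8scp\n"
worker_hosts = "127.0.0.1 localhost\n10.200.0.5 k8scp\n"

cp_ip = resolve_alias(cp_hosts, "k8scp")
worker_ip = resolve_alias(worker_hosts, "k8scp")
print(cp_ip == worker_ip)  # both nodes should agree on the alias
```

When the alias is later pointed at the ha-proxy instance, the same comparison should still hold across all nodes.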

apiVersion: kubeadm.k8s.io/v1beta2
...
kind: InitConfiguration
localAPIEndpoint:
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  name: k8scp              # <-- EDIT this line per instructions above
  taints: null
---
...
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
controlPlaneEndpoint: "k8scp:6443"        # <-- ADD this line
...
kind: ClusterConfiguration
kubernetesVersion: 1.22.1           # <-- EDIT to 1.21.1 if planning a cluster UPGRADE after init + join
networking:
...
<leave unchanged the rest of the file>
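
If it helps, the following Python sketch checks that the config advertises an alias rather than an IP address. It is a rough text check using only the standard library, not a full YAML parser. The function names are hypothetical, and it assumes a hostname or IPv4 endpoint with an optional port.

```python
import ipaddress
import re


def control_plane_endpoint(config_text: str):
    """Extract the controlPlaneEndpoint value from kubeadm config text, or None."""
    match = re.search(
        r'^\s*controlPlaneEndpoint:\s*"?([^"\s#]+)"?',
        config_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def host_is_ip(endpoint: str) -> bool:
    """Return True when the host part of 'host[:port]' is a literal IP address."""
    host = endpoint.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


# Sample fragment mirroring the ClusterConfiguration lines shown above
sample = 'kind: ClusterConfiguration\ncontrolPlaneEndpoint: "k8scp:6443"\n'
endpoint = control_plane_endpoint(sample)
print(endpoint, host_is_ip(endpoint))
```

An endpoint such as 192.168.3.67:6443 would be flagged by host_is_ip, which is the situation described above as blocking the HA setup.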

Regards,
-Chris

maybel · July 2023

Hi @chrispokorni, my first question on this exercise is: Will the three new nodes that I have to create use the same network (lfclass) I used for my cp and worker node?

chrispokorni · July 2023

Hi @maybel,

Yes, the three new instances (2 control plane nodes and 1 haproxy instance) are on the same network as the first two instances (1 control plane node and 1 worker node).

Regards,
-Chris
