
Lab 16.2: steps for an HA cluster with multiple master nodes when using cri-o?

Hi Chris,

I've successfully installed and configured 1 master and 1 worker node with the cri-o container runtime, following the steps listed in Lab 3.1.

However, while going through the HA steps in Ex 16.1, I noticed the 'Very Important' note: when using the cri-o container runtime, you are required to create a NEW CLUSTER and pass DIFFERENT OPTIONS to kubeadm init. I am now trying to create a new cluster with 3 new master nodes, with the previously created HAProxy instance acting as the load balancer for the 3 master nodes, as per this lab scenario.

Could you comment on whether the following steps are correct for setting up a new HA cluster when using cri-o? If not, please provide the corrected steps and commands.

  1. For the first master (cp), the second master (cp2), the third master (cp3), and all worker nodes:
    Follow the LAB_16.2 Detailed Steps - Install Software
    step 1. commands ->
    step 2. commands ->
    step 3. (b) IF you chose cri-o for the cp and worker ->
    (which is to follow LAB_3.1 step 5.(b) using CRI-O commands)
    step 4. all commands

  2. For the first master(cp)
    Follow the LAB_3.1
    steps 6. - 11.
    step 13.
    root@cp:~# vim /etc/hosts
    -ha-proxy k8scp
    -cp2
    -cp3
    -wk
    ...
    step 14. - IF USING CRI-O
    root@cp:~# find /home -name kubeadm-crio.yaml
    root@cp:~# nano /kubeadm-crio.yaml
    nodeRegistration:
      criSocket: unix:///var/run/crio/crio.sock
      name: k8scp
    kind: ClusterConfiguration
    kubernetesVersion: 1.21.1
    root@cp:~# cp .
    step 15. Initialize the cp.
    root@cp:~# kubeadm init --config=kubeadm-config.yaml --upload-certs \
    | tee kubeadm-init.out
    [init] Using Kubernetes version: v1.21.1
    [preflight] Running pre-flight checks

    <output_omitted>
    

    step 16. - 17.

  3. For the second master(cp2), the third master(cp3)
    Follow the LAB_16.2 Detailed Steps - Join Control Plane Nodes
    step 1.
    root@cp2/3:~# vim /etc/hosts
    -ha-proxy k8scp
    -cp2
    -cp3
    -wk
    step 2. or 3.-to-5.
    step 6.
    root@cp2/3 $ sudo kubeadm join k8scp:6443 \
    --token xxxx.. --discovery-token-ca-cert-hash sha256:xxxx.. \
    --control-plane --certificate-key xxxx..
    step 7.

Thanks again and have a blessed day,
Joseph

Answers

  • Hi @josephkwong,

    Both chapters 3 and 16 provide detailed steps to set up the first control-plane node, the worker, then the additional two control-plane nodes and the haproxy. The cri-o installation steps can be found in chapter 3 as well, and referenced whenever necessary for the worker, second and third control-plane nodes. As cri-o installation is much more complex than docker installation, extra care must be taken to ensure all install and config steps are properly executed.

    Regards,
    -Chris

  • Thanks Chris. Here is the outcome:

    1. The first master node setup is okay and running. The output is attached.

    2. The second master node 'cp2' FAILED at the Join Control Plane Nodes step. The output is attached.

    Please share your comments.

    Have a great day,
    Joseph

  • Hi @josephkwong,

    What pod network is used by Calico and the kubeadm-crio.yaml config file?
    What IP range is used by the hypervisor when provisioning VMs?

    If the podSubnet overlaps the IP range of the VMs, cluster internal traffic will be impacted, causing communication issues between nodes, control-plane pods and client workload pods.

    Regards,
    -Chris

  • Hi Chris,

    The private IP range used by the hypervisor is 192.168.3.1 to 192.168.3.253. The first master 'cp' node IP is 192.168.3.67.

    Welcome to Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-89-generic x86_64)

    jlab@cp:~$ ip a | grep inet
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
    inet 192.168.3.67/24 brd 192.168.3.255 scope global ens160
    inet6 fe80::250:56ff:fe02:39b/64 scope link
    inet 192.168.74.128/32 scope global tunl0
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link

    Calico:
    jlab@cp:/$ kubectl calico get ippools -o wide
    NAME                  CIDR             NAT    IPIPMODE   VXLANMODE   DISABLED   DISABLEBGPEXPORT   SELECTOR
    default-ipv4-ippool   192.168.0.0/16   true   Always     Never       false      false              all()

    jlab@cp:~$ kubectl get all -n kube-system -o wide
    NAME                                           READY   STATUS    RESTARTS   AGE   IP               NODE    NOMINATED NODE   READINESS GATES
    pod/calico-kube-controllers-5d995d45d6-nspnl   1/1     Running   0          39h   192.168.74.129   k8scp
    pod/calico-node-gcvds                          1/1     Running   0          39h   192.168.3.67     k8scp
    pod/coredns-558bd4d5db-bs8qb                   1/1     Running   0          40h   192.168.74.131   k8scp
    pod/coredns-558bd4d5db-z75dj                   1/1     Running   0          40h   192.168.74.130   k8scp
    pod/etcd-k8scp                                 1/1     Running   0          40h   192.168.3.67     k8scp
    pod/kube-apiserver-k8scp                       1/1     Running   0          40h   192.168.3.67     k8scp
    pod/kube-controller-manager-k8scp              1/1     Running   0          40h   192.168.3.67     k8scp
    pod/kube-proxy-m8g76                           1/1     Running   0          40h   192.168.3.67     k8scp
    pod/kube-scheduler-k8scp                       1/1     Running   0          40h   192.168.3.67     k8scp

    Please find attached the relevant config and output files.
    1. yaml - calico, kubeadm-crio
    2. output file - kubeadm-init.out
    3. config file - calico-kubeconfig, 100-crio-bridge.conf
    4. journalctl file - crio-journalctl.log
    5. dns alias host file - hosts

    Thanks again and have a blessed weekend.
    Joseph

  • Hi @josephkwong,

    As I suspected, your network subnets overlap: the Calico pod pool (192.168.0.0/16) contains the node network (192.168.3.0/24), which causes major issues with your cluster.
    First, I would recommend rebuilding your cluster with distinct networks for nodes and pods: try 10.200.0.0/16 as the VM/node subnet managed by the hypervisor's DHCP server.
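
To make the overlap concrete, here is a minimal Python sketch (not part of the lab guide) that compares the networks reported in this thread using the standard `ipaddress` module. The helper name `overlaps` is an illustrative assumption, not a lab or kubeadm tool.

```python
import ipaddress


def overlaps(pod_cidr: str, node_cidr: str) -> bool:
    """Return True if the pod network and the node network share any address."""
    # strict=False lets us pass a host address with a prefix, e.g. 192.168.3.67/24
    pod_net = ipaddress.ip_network(pod_cidr, strict=False)
    node_net = ipaddress.ip_network(node_cidr, strict=False)
    return pod_net.overlaps(node_net)


# Values taken from the outputs posted in this thread
calico_pool = "192.168.0.0/16"      # default-ipv4-ippool CIDR
current_nodes = "192.168.3.67/24"   # cp node address on ens160
suggested_nodes = "10.200.0.0/16"   # node subnet suggested in the reply

print(overlaps(calico_pool, current_nodes))    # pool vs. current node network
print(overlaps(calico_pool, suggested_nodes))  # pool vs. suggested node network
```

The first check reports an overlap and the second does not, which is why moving the nodes to a separate range (or, alternatively, changing the pod pool) avoids the conflict.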

    Second, I noticed the kubeadm join command uses an IP address as the advertised control-plane endpoint. That makes the HA config impossible to achieve. To fix it, edit the kubeadm-crio.yaml file as follows:

    1 - for the InitConfiguration resource update the nodeRegistration.name to be cp or the HOSTNAME of your control-plane node.
    2 - for the ClusterConfiguration resource add a new property controlPlaneEndpoint: "k8scp:6443" and double-check the desired value of kubernetesVersion:. After the init completion, the suggested join command should display kubeadm join k8scp:6443 --token abcdef...
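
As a quick sanity check on the suggested join command, the following Python sketch tests whether an endpoint of the form `host:port` uses a literal IP address or an alias such as k8scp. The function name `endpoint_host_is_ip` is hypothetical, and the sketch assumes the endpoint always includes a port.

```python
import ipaddress


def endpoint_host_is_ip(endpoint: str) -> bool:
    """Return True if the host part of 'host:port' is a literal IP address."""
    # Split off the port, then drop the brackets used around IPv6 literals.
    host = endpoint.rsplit(":", 1)[0].strip("[]")
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        # Not parseable as an IP, so it is treated as a hostname or alias.
        return False


print(endpoint_host_is_ip("192.168.3.67:6443"))  # endpoint pinned to one node's IP
print(endpoint_host_is_ip("k8scp:6443"))         # endpoint using the k8scp alias
```

An alias-based endpoint can later be pointed at the load balancer through /etc/hosts, whereas an endpoint pinned to one node's IP cannot be redirected that way.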

    As you initialize the cluster and join the first worker node, edit the /etc/hosts files as such:
    a) on the control plane node add its Private IP with k8scp alias.
    Example: 10.200.0.5 k8scp
    b) on the worker node add the Private IP of the control plane node with k8scp alias, same entry as (a).
    Example: 10.200.0.5 k8scp

    Later, for HA configuration these aliases will be replaced with the ha-proxy Private IP, as instructed in the lab guide.
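
The alias swap described above can be sketched in Python as a pure text transformation on the contents of a hosts file. This is only an illustration: the function name `point_alias` and the address 10.200.0.9 (standing in for the ha-proxy Private IP) are assumptions, and the sketch drops the whole line that previously carried the alias, including any other names on that line.

```python
def point_alias(hosts_text: str, alias: str, ip: str) -> str:
    """Return hosts-file text in which `alias` maps only to `ip`."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments, then split into address and names.
        entry = line.split("#", 1)[0].split()
        # Simplification: drop any existing line that already assigns the alias.
        if len(entry) >= 2 and alias in entry[1:]:
            continue
        kept.append(line)
    kept.append(f"{ip} {alias}")
    return "\n".join(kept) + "\n"


# 10.200.0.5 is the example control plane IP used in this reply;
# 10.200.0.9 is a hypothetical ha-proxy address.
before = "127.0.0.1 localhost\n10.200.0.5 k8scp\n"
print(point_alias(before, "k8scp", "10.200.0.9"), end="")
```

The same change would be applied on every node so that they all resolve k8scp to the same address.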

    apiVersion: kubeadm.k8s.io/v1beta2
    ...
    kind: InitConfiguration
    localAPIEndpoint:
      bindPort: 6443
    nodeRegistration:
      criSocket: unix:///var/run/crio/crio.sock
      name: k8scp              # <-- EDIT this line per instructions above
      taints: null
    ---
    ...
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    controlPlaneEndpoint: "k8scp:6443"        # <-- ADD this line
    ...
    kind: ClusterConfiguration
    kubernetesVersion: 1.22.1           # <-- EDIT to 1.21.1 if planning a cluster UPGRADE after init + join
    networking:
    ...
    <leave unchanged the rest of the file>
    

    Regards,
    -Chris

  • maybel

    Hi @chrispokorni, my first question on this exercise is: Will the three new nodes that I have to create use the same network (lfclass) I used for my cp and worker node?

  • chrispokorni

    Hi @maybel,

    Yes, the three new instances (2 control plane nodes and 1 haproxy instance) are on the same network as the first two instances (1 control plane node and 1 worker node).

    Regards,
    -Chris
