Installing cilium disrupts network on the Control Plane VM

Hi,

This is part of my ongoing issue, which I posted here: https://forum.linuxfoundation.org/discussion/comment/38465

I decided to destroy my current VMs (which were on Ubuntu 22.04) and start up new ones with 20.04.

This meant re-installing the K8s cluster as per Lab 2, exercise 2.2 onwards. I also downloaded the latest training files.

I noticed that in the k8scp.sh file the network plugin install has changed from Calico to Cilium.

After I install Cilium, network access to the VM is broken: the Worker Node is not able to join the cluster or connect to the Control Plane, and I can't telnet to the Control Plane either.

Even SSH stops working. I get kicked out as soon as Cilium is installed. I can still log in using the vagrant ssh command, but before the install I could SSH from the terminal to 10.0.0.10, and from the Worker Node as well.

Has anyone else experienced this issue or is it an issue with the latest Lab download? I am using "LFD259_V2023-09-05_SOLUTIONS"

I have attached the status of the Pods and of Cilium:

Comments

  • chrispokorni

    Hi @abelpatel,

    How many network interfaces are defined per VM and what types of networks are attached by each (bridged, nat, host)? Is all ingress traffic allowed to each VM?

    Regards,
    -Chris

  • abelpatel

    @chrispokorni - thanks for your reply.

    Both VMs are running on VirtualBox 7.0.

    The Control Plane VM is as follows:

    Adapter 1 is NAT
    Adapter 2 is currently Bridged, with Promiscuous Mode set to "Allow All". Originally the adapter was set to "Host-Only", but I switched to Bridged to see if that would solve my problem.

    Screenshot of network interfaces from the Control Plane VM:

    The Worker Node is as follows:

    Adapter 1 is NAT
    Adapter 2 is "Host-Only", with Promiscuous Mode set to "Allow All".

    Screenshot of the network interfaces from Worker Node01:

  • chrispokorni

    Hi @abelpatel,

    I recommend a single bridged adapter per VM, with promiscuous mode set to "allow all", so that all ingress traffic is allowed and the bootstrapping process binds to the correct interface.

    Regards,
    -Chris

  • abelpatel

    @chrispokorni - finally got to the bottom of this issue.

    For the sake of completeness on this thread: my understanding is that if you install Cilium with its defaults, it uses 10.0.0.0/8 for Pod IPs, which reserves the entire Class A private range. That is why I was getting kicked off my own VM, since it is on 10.0.0.10.

    The blog post here describes installing Cilium with Helm and using parameters to specify the CIDR range you want: https://blog.devgenius.io/cilium-installation-tips-17a870fdc4f2

    helm repo add cilium https://helm.cilium.io/

    helm install cilium cilium/cilium --version 1.13.2 -n kube-system \
      --set ipam.operator.clusterPoolIPv4PodCIDR=10.42.0.0/16 \
      --set ipv4NativeRoutingCIDR=10.42.0.0/16 \
      --set ipv4.enabled=true \
      --set loadBalancer.mode=dsr \
      --set kubeProxyReplacement=strict \
      --set tunnel=disabled \
      --set autoDirectNodeRoutes=true
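
The overlap described in this post can be checked with a little shell arithmetic. The following is a minimal POSIX sh sketch, not something taken from the course materials: the helper names `ip_to_int` and `ip_in_cidr` are made up for illustration, and the addresses are the ones mentioned in this thread (the node at 10.0.0.10, the 10.0.0.0/8 default reported above, and the 10.42.0.0/16 range from the Helm command).

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of any lab script).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit status 0) if IPv4 address $1 lies inside CIDR block $2.
# Note: this simple sketch uses global variables.
ip_in_cidr() {
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "${2%/*}")   # network part, before the slash
  bits=${2#*/}                     # prefix length, after the slash
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  fi
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}

# The node address from this thread against the two Pod ranges discussed.
if ip_in_cidr 10.0.0.10 10.0.0.0/8; then
  echo "10.0.0.10 overlaps 10.0.0.0/8"
fi
if ! ip_in_cidr 10.0.0.10 10.42.0.0/16; then
  echo "10.0.0.10 is outside 10.42.0.0/16"
fi
```

The same check can be run with any node address and candidate Pod CIDR before installing the network plugin.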

  • tjuanico

    Hi,
    I've got the same problem with my virtual machines on Azure (private range 10.0.0.0/24). I'm frustrated. Luckily, there are scripts for using either Calico or Cilium. I installed with k8scp-calico.sh and it seems OK.

    LFD259/SOLUTIONS/s_02/k8sWorker.sh
    LFD259/SOLUTIONS/s_02/k8scp-calico.sh
    LFD259/SOLUTIONS/s_02/k8scp-cilium.sh
    LFD259/SOLUTIONS/s_02/k8scp.sh

  • andybeeching

    @abelpatel Thanks for the info re: Cilium reserving 10.0.0.0/8 - it unblocked me when setting up VMs for cp and worker in GCE (needed to specify a different subnet, e.g. 10.20.0.0/8).

    @chrispokorni Might it be worth having the setup scripts (e.g. k8scp.sh) detect when someone is using a conflicting subnet (i.e. 10.0.0.0/8) and inform them? For me that would have saved quite a few hours of hair pulling :)
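
As a rough idea of what such a check might look like, here is a minimal POSIX sh sketch. It is purely hypothetical and not part of k8scp.sh or any other lab script: the function name `warn_if_node_in_10_slash_8` is invented for illustration, and it only covers the specific 10.0.0.0/8 case discussed in this thread by matching the first octet.

```shell
#!/bin/sh
# Hypothetical preflight check; not part of the course's setup scripts.
# Takes one or more IPv4 addresses and warns if any starts with "10.",
# i.e. falls inside 10.0.0.0/8 (the Cilium default reported in this thread).
warn_if_node_in_10_slash_8() {
  for addr in "$@"; do
    case "$addr" in
      10.*)
        echo "WARNING: node address $addr is inside 10.0.0.0/8;" \
             "choose a different Pod CIDR or node subnet."
        return 1
        ;;
    esac
  done
  return 0
}

# Demo with example addresses: the first call is silent, the second warns.
warn_if_node_in_10_slash_8 192.168.56.10
warn_if_node_in_10_slash_8 10.0.0.10 || echo "conflict detected"
```

On a Linux node the addresses could be supplied with something like `warn_if_node_in_10_slash_8 $(hostname -I)`, assuming a `hostname` that supports `-I`.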

  • chrispokorni

    Hi @andybeeching,

    Agreed, or the lab overview could clearly describe the importance of distinct subnets for the Nodes and Pods networks :wink:

    Regards,
    -Chris
