
Lab 3.* Troubles with VirtualBox and Calico

I'm trying to complete the first Kubernetes labs on VirtualBox infrastructure and have run into a few problems.

First of all, I had trouble with the IP addresses of the master and worker. My VirtualBox configuration gives each node two IP addresses: one for Internet access and a second for communication between nodes.

After a lot of searching I finally completed the initial setup with the following configuration files.

Master (k8smaster is 192.168.56.106 in /etc/hosts):

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
kubernetesVersion: 1.19.1
controlPlaneEndpoint: "k8smaster:6443"
networking:
  podSubnet: 192.168.77.0/24
localAPIEndpoint:
  advertiseAddress: 192.168.56.106
nodeRegistration:
  kubeletExtraArgs:
    node-ip: 192.168.56.106

The Calico manifest is applied with CALICO_IPV4POOL_CIDR set to 192.168.77.0/24.

Worker (ubuntu-vbox-k8s-master is 192.168.56.106 in /etc/hosts):

apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
discovery:
  bootstrapToken:
    token: "ojxk8t.udzdmffxeejsra6j"
    apiServerEndpoint: "ubuntu-vbox-k8s-master:6443"
    caCertHashes:
      - "sha256:633666e8932e3a9e34bcb3cc3fc8ef08b4e1e6ee1f32c11c3e13a448ae1bee3e"
nodeRegistration:
  kubeletExtraArgs:
    node-ip: "192.168.56.107"

The node setup seems to be OK:

[02:06][email protected][~]$ kubectl get nodes -o wide
NAME                     STATUS   ROLES    AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
ubuntu-vbox-k8s-master   Ready    master   24h   v1.19.1   192.168.56.106   <none>        Ubuntu 18.04.6 LTS   4.15.0-156-generic   docker://20.10.7
ubuntu-vbox-k8s-worker   Ready    <none>   24h   v1.19.1   192.168.56.107   <none>        Ubuntu 18.04.6 LTS   4.15.0-156-generic   docker://20.10.7

But Calico pods are not ready:

[02:06][email protected][~]$ kubectl get pods --all-namespaces 
NAMESPACE     NAME                                             READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-855445d444-zmdrn         1/1     Running   0          24h
kube-system   calico-node-lvkt6                                0/1     Running   0          24h
kube-system   calico-node-vxtfl                                0/1     Running   22         24h
kube-system   coredns-f9fd979d6-n9jjh                          1/1     Running   0          24h
kube-system   coredns-f9fd979d6-v7nkp                          1/1     Running   0          24h
kube-system   etcd-ubuntu-vbox-k8s-master                      1/1     Running   0          24h
kube-system   kube-apiserver-ubuntu-vbox-k8s-master            1/1     Running   0          24h
kube-system   kube-controller-manager-ubuntu-vbox-k8s-master   1/1     Running   0          24h
kube-system   kube-proxy-rk66j                                 1/1     Running   0          24h
kube-system   kube-proxy-x85dx                                 1/1     Running   0          24h
kube-system   kube-scheduler-ubuntu-vbox-k8s-master            1/1     Running   0          24h

And there are A LOT of messages like this in the output of kubectl logs -n kube-system calico-node-lvkt6:

bird: BGP: Unexpected connect from unknown address 192.168.56.106 (port 43225)

Could anybody help with this?

Comments

  • Gim6626
    Gim6626 Posts: 27
    edited September 23

    Update.

    I could not even get details for the Calico pods, although they appear in the list:

    [05:23][email protected][~]$ kubectl describe pod calico-node-vxtfl
    Error from server (NotFound): pods "calico-node-vxtfl" not found
    [05:39][email protected]box-k8s-master[~]$ kubectl describe pod calico-node-lvkt6
    Error from server (NotFound): pods "calico-node-lvkt6" not found
    [05:37][email protected][~]$ kubectl get pods --all-namespaces | grep -E 'calico-node-(vxtfl|lvkt6)'
    kube-system   calico-node-lvkt6                                0/1     Running             0          27h
    kube-system   calico-node-vxtfl                                0/1     Running             22         27h
    
  • Gim6626
    Gim6626 Posts: 27

    Update #2. It seems I missed the namespace argument in my previous comment.

    The pod descriptions show the following errors.

    [06:03][email protected][~]$ kubectl describe pod -n kube-system calico-node-vxtfl
    ...
    Events:
      Type     Reason     Age                   From                             Message
      ----     ------     ----                  ----                             -------
      Warning  Unhealthy  39m (x3555 over 24h)  kubelet, ubuntu-vbox-k8s-worker  (combined from similar events): Readiness probe failed: 2021-09-23 05:56:48.777 [INFO][21110] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 192.168.56.106
      Warning  Unhealthy  36m  kubelet, ubuntu-vbox-k8s-worker  Readiness probe failed: 2021-09-23 05:59:21.886 [INFO][21516] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 192.168.56.106
      Warning  FailedMount     36m                   kubelet, ubuntu-vbox-k8s-worker  MountVolume.SetUp failed for volume "calico-node-token-hj66k" : failed to sync secret cache: timed out waiting for the condition
    ...
      Warning  Unhealthy       34m (x5 over 35m)     kubelet, ubuntu-vbox-k8s-worker  Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp 127.0.0.1:9099: connect: connection refused
      Warning  Unhealthy       34m (x6 over 35m)     kubelet, ubuntu-vbox-k8s-worker  Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/bird/bird.ctl: connect: no such file or directory
      Normal   Killing         31m                   kubelet, ubuntu-vbox-k8s-worker  Container calico-node failed liveness probe, will be restarted
      Normal   Pulled          31m                   kubelet, ubuntu-vbox-k8s-worker  Container image "docker.io/calico/node:v3.20.1" already present on machine
      Normal   Created         31m                   kubelet, ubuntu-vbox-k8s-worker  Created container calico-node
      Normal   Started         31m                   kubelet, ubuntu-vbox-k8s-worker  Started container calico-node
      Warning  Unhealthy       30m (x10 over 32m)    kubelet, ubuntu-vbox-k8s-worker  Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp 127.0.0.1:9099: connect: connection refused
      Warning  BackOff         7m4s (x63 over 26m)   kubelet, ubuntu-vbox-k8s-worker  Back-off restarting failed container
      Warning  Unhealthy       2m12s (x78 over 32m)  kubelet, ubuntu-vbox-k8s-worker  Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/bird/bird.ctl: connect: no such file or directory
    

    and

    [06:36][email protected][~]$ kubectl describe pod -n kube-system calico-node-lvkt6
    ...
    Events:
      Type     Reason     Age                   From                             Message
      ----     ------     ----                  ----                             -------
      Warning  Unhealthy  43m (x3532 over 24h)  kubelet, ubuntu-vbox-k8s-master  (combined from similar events): Readiness probe failed: 2021-09-23 05:53:31.616 [INFO][1275] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.0.2.15
      Warning  FailedMount  38m  kubelet, ubuntu-vbox-k8s-master  MountVolume.SetUp failed for volume "calico-node-token-hj66k" : failed to sync secret cache: timed out waiting for the condition
      Warning  Unhealthy    38m  kubelet, ubuntu-vbox-k8s-master  Readiness probe failed: 2021-09-23 05:58:46.404 [INFO][2264] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.0.2.15
    ...
      Warning  Unhealthy  37m  kubelet, ubuntu-vbox-k8s-master  Readiness probe failed: 2021-09-23 05:59:16.412 [INFO][2364] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.0.2.15
    ...
      Warning  Unhealthy       36m  kubelet, ubuntu-vbox-k8s-master  Readiness probe failed: 2021-09-23 06:00:11.420 [INFO][187] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.0.2.15
    ...
      Warning  Unhealthy  32m  kubelet, ubuntu-vbox-k8s-master  Readiness probe failed: 2021-09-23 06:04:28.986 [INFO][1100] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.0.2.15
      Warning  Unhealthy  3m54s (x172 over 32m)  kubelet, ubuntu-vbox-k8s-master  (combined from similar events): Readiness probe failed: 2021-09-23 06:33:08.977 [INFO][6603] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.0.2.15
    

    10.0.2.15 is the NAT IP, and it is definitely wrong for Calico to use it, but I could not find where to fix that.
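
    A minimal sketch for checking which interface carries which address on a node, assuming the `ip` tool from iproute2 is installed. As far as the Calico documentation describes it, the default autodetection method is first-found, which on a VM with a NAT adapter can land on the NAT interface. The helper name `iface_for_ip` is made up for illustration.

    ```shell
    # iface_for_ip: print the interface that carries the given IPv4 address.
    # Reads the output of `ip -o -4 addr show` on stdin, where field 2 is the
    # interface name and field 4 is the address in CIDR form.
    iface_for_ip() {
      awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) print $2 }'
    }

    # Usage on a node (address taken from the events above):
    #   ip -o -4 addr show | iface_for_ip 10.0.2.15
    ```

    Running it once with the NAT address and once with the 192.168.56.x address shows which interface name to point Calico at.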

  • chrispokorni
    chrispokorni Posts: 1,218

    Hi @Gim6626,

    Having multiple network interfaces on each VirtualBox VM causes more problems than it solves. If you have a chance to check the official VirtualBox documentation, you will find that a single bridged adapter per VM is sufficient for all networking needs. With promiscuous mode enabled to allow all inbound/ingress traffic, and with your current IP addressing schema for nodes and pods, you should be able to move forward with this exercise.

    I would also recommend taking a look at the latest course release, updated to Kubernetes 1.22.

    Regards,
    -Chris

  • Gim6626
    Gim6626 Posts: 27

    Hi @chrispokorni!

    Thank you for the help! I remember a similar suggestion from another thread of mine here, and it sounds good, but there is one problem. I work on a laptop and not always from the same network (even at home I may be connected to one or another). Sometimes I don't control the network I am connected to, and sometimes I work without an Internet connection at all. As far as I understand it, bridged mode depends heavily on the "parent" network I am connected to. That's why I preferred fully isolated VirtualBox networks.

    Regarding the latest course release: thank you again, I'll take a look.

  • Gim6626
    Gim6626 Posts: 27

    Made some progress.

    With this config:

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    kubernetesVersion: 1.21.1
    controlPlaneEndpoint: "k8smaster:6443"
    networking:
      podSubnet: 192.168.77.0/24
    ---
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: InitConfiguration
    localAPIEndpoint:
      advertiseAddress: 192.168.56.108
    nodeRegistration:
      kubeletExtraArgs:
        node-ip: 192.168.56.108
    

    I managed to create the master with the correct settings and join the worker to it without any more hacking.

    I still have trouble with Calico, but it seems to be something new:

    [11:09][email protected][~]$ kubectl get pods --all-namespaces
    NAMESPACE     NAME                                             READY   STATUS             RESTARTS   AGE
    kube-system   calico-kube-controllers-74b8fbdb46-h2f2m         1/1     Running            2          13m
    kube-system   calico-node-9xqnr                                0/1     CrashLoopBackOff   4          2m3s
    kube-system   calico-node-g9rbt                                1/1     Running            2          13m
    kube-system   coredns-558bd4d5db-4znr7                         1/1     Running            2          14m
    kube-system   coredns-558bd4d5db-rqzfw                         1/1     Running            2          14m
    kube-system   etcd-ubuntu-vbox-k8s-master                      1/1     Running            2          14m
    kube-system   kube-apiserver-ubuntu-vbox-k8s-master            1/1     Running            2          14m
    kube-system   kube-controller-manager-ubuntu-vbox-k8s-master   1/1     Running            2          14m
    kube-system   kube-proxy-9zm2r                                 1/1     Running            2          14m
    kube-system   kube-proxy-w64xv                                 1/1     Running            1          12m
    kube-system   kube-scheduler-ubuntu-vbox-k8s-master            1/1     Running            2          14m
    

    But there is nothing interesting in either the logs or the description of the problem pod calico-node-9xqnr:

    [11:09][email protected][~]$ kubectl logs -n kube-system -p calico-node-9xqnr
    Error from server (NotFound): the server could not find the requested resource ( pods/log calico-node-9xqnr)
    [11:09][email protected][~]$ kubectl describe pod -n kube-system calico-node-9xqnr
    ...
    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Normal   Scheduled  48s                default-scheduler  Successfully assigned kube-system/calico-node-9xqnr to ubuntu-vbox-k8s-worker
      Normal   Created    47s                kubelet            Created container upgrade-ipam
      Normal   Started    47s                kubelet            Started container upgrade-ipam
      Normal   Pulled     47s                kubelet            Container image "docker.io/calico/cni:v3.20.1" already present on machine
      Normal   Pulled     46s                kubelet            Container image "docker.io/calico/cni:v3.20.1" already present on machine
      Normal   Created    46s                kubelet            Created container install-cni
      Normal   Created    45s                kubelet            Created container flexvol-driver
      Normal   Started    45s                kubelet            Started container install-cni
      Normal   Pulled     45s                kubelet            Container image "docker.io/calico/pod2daemon-flexvol:v3.20.1" already present on machine
      Normal   Started    44s                kubelet            Started container flexvol-driver
      Normal   Pulled     24s (x3 over 44s)  kubelet            Container image "docker.io/calico/node:v3.20.1" already present on machine
      Normal   Created    24s (x3 over 43s)  kubelet            Created container calico-node
      Normal   Started    24s (x3 over 43s)  kubelet            Started container calico-node
      Warning  Unhealthy  24s                kubelet            Readiness probe failed:
      Warning  BackOff    18s (x6 over 42s)  kubelet            Back-off restarting failed container
    
  • Gim6626
    Gim6626 Posts: 27

    Update.

    The problem pod shows this in its describe output:

    Node:                 ubuntu-vbox-k8s-worker/10.0.3.15
    

    So it is still the wrong IP.
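
    A small sketch for spotting nodes that registered with an address outside the host-only network, assuming the column layout of `kubectl get nodes -o wide` shown earlier in this thread (INTERNAL-IP in the sixth column). The helper name `nodes_outside_prefix` is hypothetical.

    ```shell
    # nodes_outside_prefix: list nodes whose INTERNAL-IP does not start with
    # the given prefix. Reads `kubectl get nodes -o wide` output on stdin,
    # skips the header row, and prints "<node> <internal-ip>" per mismatch.
    nodes_outside_prefix() {
      awk -v p="$1" 'NR > 1 && index($6, p) != 1 { print $1, $6 }'
    }

    # Usage:
    #   kubectl get nodes -o wide | nodes_outside_prefix 192.168.56.
    ```

    An empty result means every node registered with a 192.168.56.x address; any line printed names a node to re-check.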

    After more searching I found how to fix the Calico node IP with calicoctl, but I got a conflict.

    The current worker config is:

    [12:35][email protected][~]$ ./calicoctl get node ubuntu-vbox-k8s-worker -o yaml
    apiVersion: projectcalico.org/v3
    kind: Node
    metadata:
      annotations:
        projectcalico.org/kube-labels: '{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"ubuntu-vbox-k8s-worker","kubernetes.io/os":"linux"}'
      creationTimestamp: "2021-09-26T10:59:00Z"
      labels:
        beta.kubernetes.io/arch: amd64
        beta.kubernetes.io/os: linux
        kubernetes.io/arch: amd64
        kubernetes.io/hostname: ubuntu-vbox-k8s-worker
        kubernetes.io/os: linux
      name: ubuntu-vbox-k8s-worker
      resourceVersion: "8546"
      uid: 6f5efd41-e06c-4f9d-9b3a-248af88a385e
    spec:
      addresses:
      - address: 10.0.3.15
        type: InternalIP
      orchRefs:
      - nodeName: ubuntu-vbox-k8s-worker
        orchestrator: k8s
    status: {}
    

    I try to apply:

    [12:37][email protected][~]$ cat calico-worker.yaml 
    apiVersion: projectcalico.org/v3
    kind: Node
    metadata:
      annotations:
        projectcalico.org/kube-labels: '{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"ubuntu-vbox-k8s-worker","kubernetes.io/os":"linux"}'
      creationTimestamp: "2021-09-26T10:59:00Z"
      labels:
        beta.kubernetes.io/arch: amd64
        beta.kubernetes.io/os: linux
        kubernetes.io/arch: amd64
        kubernetes.io/hostname: ubuntu-vbox-k8s-worker
        kubernetes.io/os: linux
      name: ubuntu-vbox-k8s-worker
      resourceVersion: "5303"
      uid: 6f5efd41-e06c-4f9d-9b3a-248af88a385e
    spec:
      addresses:
      - address: 192.168.56.109/24
        type: CalicoNodeIP
      - address: 192.168.56.109
        type: InternalIP
      orchRefs:
      - nodeName: ubuntu-vbox-k8s-worker
        orchestrator: k8s
    status: {}
    [12:38][email protected][~]$ ./calicoctl apply -f calico-worker.yaml 
    Failed to apply 'Node' resource: [update conflict: Node(ubuntu-vbox-k8s-worker)]
    

    This is strange, because I fixed the Calico master node in the same way without a conflict.
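
    One possible explanation, offered as a guess rather than a confirmed diagnosis: the file being applied carries resourceVersion "5303" while the live object is at "8546", so the update may be rejected as stale. The sketch below drops that field from a fresh export before editing. It assumes calicoctl applies a manifest without a resourceVersion unconditionally, and the helper name `strip_resource_version` is made up.

    ```shell
    # strip_resource_version: remove the resourceVersion line from a YAML
    # manifest so the manifest is not pinned to one revision of the object.
    # Reads YAML on stdin and writes the filtered YAML to stdout.
    strip_resource_version() {
      sed -E '/^[[:space:]]*resourceVersion:/d'
    }

    # Usage (./calicoctl as in this comment):
    #   ./calicoctl get node ubuntu-vbox-k8s-worker -o yaml \
    #     | strip_resource_version > calico-worker.yaml
    #   # edit spec.addresses in calico-worker.yaml, then:
    #   ./calicoctl apply -f calico-worker.yaml
    ```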

  • chrispokorni
    chrispokorni Posts: 1,218

    Hi @Gim6626,

    Since each VM has 2 network interfaces, what are their IP addresses?

    Regards,
    -Chris

  • pkon
    pkon Posts: 1

    Hi @Gim6626

    Not sure if you already fixed this issue, but maybe this will help:
    https://forum.linuxfoundation.org/discussion/857721/lab-3-3-3-4-calico-node-is-not-ready

    In my case I just initialized the cluster, following @emiliano.sutil's advice, with:
    kubeadm init --apiserver-advertise-address=10.10.10.1 --apiserver-cert-extra-sans=10.10.10.1 --pod-network-cidr=192.168.0.0/16 --control-plane-endpoint=k8scp:6443 --upload-certs

    I also tried creating a YAML file, but it didn't work as I had hoped:

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    kubernetesVersion: 1.22.1
    controlPlaneEndpoint: "k8scp:6443"
    networking:
      podSubnet: 192.168.0.0/16
    apiserver:
      extraArgs:
        advertise-address: 10.10.10.1
        cert-extra-sans: 10.10.10.1

    Still, my main IP seemed to be the VirtualBox NAT one, not 10.10.10.1 (the VMs' internal network). I will come back to it later, as this problem has already taken too much time.

  • Gim6626
    Gim6626 Posts: 27

    @chrispokorni said:
    Hi @Gim6626,

    Since each VM has 2 network interfaces, what are their IP addresses?

    Regards,
    -Chris

    Hi!

    Sorry for the late reply, I was on vacation.
    Master - 192.168.56.108
    Worker - 192.168.56.109

  • Gim6626
    Gim6626 Posts: 27

    The NAT address is the same for both machines: 10.0.3.15

  • Gim6626
    Gim6626 Posts: 27

    Solved with help from the Calico Slack channel, using this command:

    kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=enp0s3

    It comes from https://docs.projectcalico.org/networking/ip-autodetection
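
    For anyone repeating this, a hedged sketch that wraps the same command in a function and then waits for the DaemonSet to roll out, so it is clear when the calico-node pods have restarted with the new setting. The function name `set_calico_iface` is made up, and the interface name differs between VMs, so check which interface holds the node-to-node address first.

    ```shell
    # set_calico_iface: pin Calico's address autodetection to one interface,
    # then wait until the calico-node DaemonSet has finished rolling out.
    # $1 is the interface name that carries the node-to-node address.
    set_calico_iface() {
      kubectl set env daemonset/calico-node -n kube-system \
        "IP_AUTODETECTION_METHOD=interface=$1" &&
      kubectl rollout status daemonset/calico-node -n kube-system
    }

    # Usage (interface name from this thread; yours may differ):
    #   set_calico_iface enp0s3
    ```

    Afterwards, kubectl get pods -n kube-system should show the calico-node pods as 1/1 if the interface was the right one.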
