Lab 3.4 / Lab 9.1: Control plane can't reach the worker's pods

Hello, I've been following all the labs successfully, but I have two issues, in Lab 3.4 (Deploy A Simple Application) and in Lab 9.1 (Services).
Basically, from the control plane (10.0.1.5) I cannot reach the pod network (192.168.0.0/16) on the worker node (10.0.1.4).
The cluster is in Ready state and the pods are all Running.
This is the output from Lab 3.1:

azureadm@k8scp:~$ kubectl get svc,ep -o wide
NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE     SELECTOR
service/kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP   19h     <none>
service/nginx        ClusterIP   10.100.75.41   <none>        80/TCP    7h38m   app=nginx

NAME                   ENDPOINTS            AGE
endpoints/kubernetes   10.0.1.5:6443        19h
endpoints/nginx        192.168.103.210:80   7h38m
azureadm@k8scp:~$

Curling 192.168.103.210:80 from the control plane always ends in a timeout.
Curling from the worker node succeeds.
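
For reference, these are roughly the commands I'm running (the --max-time is only there so the failure shows quickly):

# from the control plane (10.0.1.5) - this times out
curl --max-time 5 http://192.168.103.210:80

# from the worker node (10.0.1.4) - this succeeds
curl --max-time 5 http://192.168.103.210:80
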
Thanks

Comments

  • I have the same problem

  • Hi @fabio.sasso,

    Are you by any chance running your cluster on Azure cloud infrastructure?

    Regards,
    -Chris

  • Hi @sirgabriel,

    Please provide some descriptive details about your infrastructure and cluster, so we can understand the "problem".

    Regards,
    -Chris

  • @chrispokorni said:
    Hi @fabio.sasso,

    Are you by any chance running your cluster on Azure cloud infrastructure?

    Regards,
    -Chris

    Hello Chris,
    exactly, I'm on Azure. From the NSG side everything seems OK.
    Please let me know what I can check.

    Thanks a lot!

  • Hi @fabio.sasso,

    Azure is known to require additional, very specific configuration options for the network plugins to work with Azure's own networking implementation. That is not needed on the recommended cloud infrastructure providers - AWS and GCP - nor on local solutions with hypervisors such as VirtualBox, KVM, or VMware.

    While we do not support the lab exercises on the Azure infrastructure, there are a few forum posts from learners who attempted to run them on Azure, with lessons learned and networking configuration tips.
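
    If you do decide to stay on Azure, a rough (and unsupported) starting point is to double-check the NSG rules between the nodes and IP forwarding on the VM NICs; something along these lines with the az CLI. Treat this as a sketch: the resource group, NSG, and NIC names are placeholders, and 10.0.1.0/24 is only assumed to be the node subnet based on the node IPs mentioned in this thread.

      # inspect the inbound rules on the NSG attached to the node subnet/NICs
      az network nsg rule list --resource-group <rg> --nsg-name <nsg> -o table

      # allow all traffic between the nodes (assuming 10.0.1.0/24 is the node subnet)
      az network nsg rule create --resource-group <rg> --nsg-name <nsg> \
        --name AllowNodeSubnet --priority 100 --direction Inbound --access Allow \
        --protocol '*' --source-address-prefixes 10.0.1.0/24 \
        --destination-address-prefixes '*' --destination-port-ranges '*'

      # pod-to-pod traffic routed between nodes also needs IP forwarding enabled on each NIC
      az network nic update --resource-group <rg> --name <nic-name> --ip-forwarding true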

    Regards,
    -Chris

  • @chrispokorni As it happens, I am running my cluster on AWS, but I am finding that, as @fabio.sasso reports, I cannot get the curl commands of Lab 9.1 step 12 to work from the control plane node of my cluster; they work from the worker node just fine. The nodes are built using Ubuntu 20.04.

  • I have exactly the same problem; however, my lab environment comprises three VirtualBox VMs and Cilium on an Ubuntu 22.04 host.

    Regards

  • Hi @jmfwss,

    To correctly configure the networking for your VMs on VirtualBox, please ensure you enable promiscuous mode to allow all inbound traffic, and use the recommended guest OS, Ubuntu 20.04 LTS.
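
    If you prefer the command line over the GUI setting (Settings > Network > Adapter > Advanced > Promiscuous Mode), something like the following should work on a powered-off VM; the VM name and adapter number here are just examples for your own setup:

      # set the bridged adapter (here: adapter 1) to accept all traffic
      VBoxManage modifyvm "worker1" --nicpromisc1 allow-all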

    Regards,
    -Chris

  • I have promiscuous mode enabled and my guest OSs are all Ubuntu 20.04 LTS. However, I cannot reach the pods running on the cluster network. Currently, I have pods running on three networks:

    NAMESPACE         NAME                                               READY   STATUS    RESTARTS         AGE    IP              NODE      NOMINATED NODE   READINESS GATES
    accounting        nginx-one-5bdc6ddf4b-cvr46                         1/1     Running   1 (3h26m ago)    23h    192.168.1.223   worker1   <none>           <none>
    accounting        nginx-one-5bdc6ddf4b-ft2vd                         1/1     Running   1 (3h26m ago)    23h    192.168.1.122   worker1   <none>           <none>
    default           nfs-subdir-external-provisioner-86bcbb46d7-xjknw   1/1     Running   2 (3h26m ago)    2d1h   192.168.1.238   worker1   <none>           <none>
    kube-system       cilium-operator-788c7d7585-8mdzg                   1/1     Running   8 (3h26m ago)    125d   10.10.10.21     worker1   <none>           <none>
    kube-system       cilium-operator-788c7d7585-p86rv                   1/1     Running   9 (3h26m ago)    125d   10.0.2.15       cp        <none>           <none>
    kube-system       cilium-qk8pn                                       1/1     Running   9 (3h26m ago)    140d   10.0.2.15       cp        <none>           <none>
    kube-system       cilium-sdsfv                                       1/1     Running   10 (3h25m ago)   137d   10.10.10.22     worker2   <none>           <none>
    kube-system       cilium-wxdml                                       1/1     Running   8 (3h26m ago)    140d   10.10.10.21     worker1   <none>           <none>
    kube-system       coredns-5d78c9869d-7mxs4                           1/1     Running   8 (3h26m ago)    125d   192.168.1.40    worker1   <none>           <none>
    kube-system       coredns-5d78c9869d-zr7rj                           1/1     Running   9 (3h26m ago)    125d   192.168.0.80    cp        <none>           <none>
    kube-system       etcd-cp                                            1/1     Running   10 (3h26m ago)   125d   10.0.2.15       cp        <none>           <none>
    kube-system       kube-apiserver-cp                                  1/1     Running   10 (3h26m ago)   125d   10.0.2.15       cp        <none>           <none>
    kube-system       kube-controller-manager-cp                         1/1     Running   10 (3h26m ago)   125d   10.0.2.15       cp        <none>           <none>
    kube-system       kube-proxy-76jpk                                   1/1     Running   9 (3h26m ago)    125d   10.0.2.15       cp        <none>           <none>
    kube-system       kube-proxy-h9fll                                   1/1     Running   8 (3h26m ago)    125d   10.10.10.21     worker1   <none>           <none>
    kube-system       kube-proxy-pqqm7                                   1/1     Running   10 (3h25m ago)   125d   10.10.10.22     worker2   <none>           <none>
    kube-system       kube-scheduler-cp                                  1/1     Running   10 (3h26m ago)   125d   10.0.2.15       cp        <none>           <none>
    low-usage-limit   limited-hog-66d5cd76bc-q6rmz                       1/1     Running   10 (3h25m ago)   116d   192.168.2.226   worker2   <none>           <none>

    10.0.2.15 is the default address VirtualBox assigns to each VM, 10.10.10.* is the node network range I assigned to the VMs in the Vagrantfile, and 192.168.*.* is the cluster CIDR, as specified both in spec.containers.command[] (--cluster-cidr=192.168.0.0/16) of /etc/kubernetes/manifests/kube-controller-manager.yaml and in data.cluster-pool-ipv4-cidr of the Cilium ConfigMap.
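
    In case it helps anyone compare, these are roughly the commands I used to check those two settings (cilium-config is the ConfigMap name in my installation):

      # cluster CIDR handed to the controller manager
      sudo grep cluster-cidr /etc/kubernetes/manifests/kube-controller-manager.yaml

      # pod CIDR used by Cilium's cluster-pool IPAM
      kubectl -n kube-system get configmap cilium-config -o yaml | grep cluster-pool-ipv4-cidr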

    Regards
    Jens

  • Hi @jmfwss,

    There are two host networks that are inconsistently picked by the Kubernetes components and plugins. That is why the routing is not behaving as expected.
    Eliminate one of the node networks, keeping a single bridged network interface per VirtualBox VM with a DHCP server configured on any private subnet (because Cilium supports multiple overlapping network subnets, as opposed to other network plugins). For clarity, however, a node network of 10.200.0.0/16 or smaller may help to understand how Kubernetes and its plugins' IP addresses are utilized.
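
    Once you are down to a single node network, a quick sanity check would be something along these lines, to confirm that every node and every host-networked pod reports an address from that one network:

      # each node's INTERNAL-IP should come from the single remaining node network
      kubectl get nodes -o wide

      # host-networked pods (etcd, kube-apiserver, kube-proxy, cilium) should report the same addresses
      kubectl -n kube-system get pods -o wide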

    Regards,
    -Chris

  • Hi Chris,
    thanks for your help. After digging into the network configuration, some googling, and reanalyzing my setup, I found the cause of my problem: a missing double quote in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, where I set the environment variable KUBELET_EXTRA_ARGS to --node-ip=10.10.10.11. m(
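
    For anyone who runs into the same thing, the corrected line in my /etc/systemd/system/kubelet.service.d/10-kubeadm.conf looks roughly like this (followed by sudo systemctl daemon-reload and sudo systemctl restart kubelet to pick it up):

      # the corrected Environment line, with both double quotes in place
      Environment="KUBELET_EXTRA_ARGS=--node-ip=10.10.10.11"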

    Regards,
    Jens
