Welcome to the Linux Foundation Forum!

[clusterip, endpoint] ping not working

Hello,
I just retried an installation from scratch on AWS and it seems OK: the control plane and worker nodes see each other.
I then exposed a simple deployment as in lab 3.4, item 17, which lands on the worker node. But neither curl nor ping can reach the ClusterIP or the endpoint.
Any idea what went wrong? The security group is wide open and ufw is disabled on both nodes.
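For reference, the failing checks look roughly like this (the service name `nginx` and port 80 are assumptions based on later posts in this thread, not the lab text itself):

```shell
# Expose the deployment as a ClusterIP service, lab 3.4 style
# (the name "nginx" and port 80 are assumptions)
kubectl expose deployment nginx --port=80 --type=ClusterIP

# Look up the service ClusterIP and the endpoint (pod) IPs
kubectl get svc nginx
kubectl get ep nginx

# From either node, both of these should respond if pod networking
# and the Service proxy are healthy; substitute the IPs from above
curl http://<cluster-ip>:80
ping <endpoint-ip>
```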

Best Answers

  • leopastorsdg
    leopastorsdg Posts: 14
    Answer ✓

Maybe check the network plugin setup?

  • chrispokorni
    chrispokorni Posts: 1,656
    Answer ✓

    Hi @thomas.bucaioni,

For AWS, Chapter 1 includes a video guide outlining the VPC and SG configuration options that ensure the cluster nodes can fully access each other, which in turn allows cross pinging and curling of Service ClusterIP addresses and endpoint pod IP addresses.
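For the record, a rule that opens all traffic between nodes sharing a security group can be added with the AWS CLI roughly as follows (the group ID is a placeholder). Note that Calico's default IPIP encapsulation is IP protocol 4, which is neither TCP nor UDP, so a rule opening only TCP and UDP ports will still block it; an "All traffic" rule avoids that.

```shell
# Allow all traffic (all protocols, all ports) between instances
# in the same security group; the group ID below is a placeholder
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol -1 \
  --source-group sg-0123456789abcdef0
```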

    Regards,
    -Chris

Answers

  • Maybe 192.168.0.0/16 is a bit too large for AWS... Locally it works with a /24, so I'll retry there with the same setting.
    Thanks

  • anro
    anro Posts: 3
    edited May 2022

    Hi,
    Same issue here. @thomas.bucaioni were you able to find the solution?
    My setup:
    k-master eth0: 10.1.1.4/24
    k-worker eth0: 10.1.1.5/24

    calico net: 172.20.0.0/24

    $ kubectl get nodes -o wide
    NAME       STATUS   ROLES                  AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
    k-master   Ready    control-plane,master   6d20h   v1.22.1   10.1.1.4      <none>        Ubuntu 20.04.4 LTS   5.13.0-1025-azure   docker://20.10.12
    k-worker   Ready    <none>                 6d20h   v1.22.1   10.1.1.5      <none>        Ubuntu 20.04.4 LTS   5.13.0-1025-azure   docker://20.10.12
    
    $ kubectl get svc -o wide --all-namespaces
    NAMESPACE     NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE     SELECTOR
    default       kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP                  6d20h   <none>
    default       nginx        ClusterIP   10.105.250.80   <none>        80/TCP                   4d20h   app=nginx
    kube-system   kube-dns     ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP   6d20h   k8s-app=kube-dns
    
    $ kubectl get ep --all-namespaces
    NAMESPACE     NAME         ENDPOINTS                                                     AGE
    default       kubernetes   10.1.1.4:6443                                                 6d20h
    default       nginx        172.20.0.141:80,172.20.0.144:80,172.20.0.145:80               4d20h
    kube-system   kube-dns     172.20.0.142:53,172.20.0.143:53,172.20.0.142:53 + 3 more...   6d20h
    
    $ kubectl get pods --all-namespaces -o wide
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE     IP             NODE       NOMINATED NODE   READINESS GATES
    default       nginx-7848d4b86f-4km4g                     1/1     Running   2 (39m ago)   4d19h   172.20.0.145   k-worker   <none>           <none>
    default       nginx-7848d4b86f-8sm9r                     1/1     Running   2 (39m ago)   4d20h   172.20.0.144   k-worker   <none>           <none>
    default       nginx-7848d4b86f-cn9f4                     1/1     Running   2 (39m ago)   4d19h   172.20.0.141   k-worker   <none>           <none>
    kube-system   calico-kube-controllers-685b65ddf9-c5qtz   1/1     Running   2 (37m ago)   6d20h   172.20.0.5     k-master   <none>           <none>
    kube-system   calico-node-c8p87                          1/1     Running   2 (37m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   calico-node-z89pr                          1/1     Running   2 (39m ago)   6d20h   10.1.1.5       k-worker   <none>           <none>
    kube-system   coredns-78fcd69978-qmlgh                   1/1     Running   2 (39m ago)   6d20h   172.20.0.143   k-worker   <none>           <none>
    kube-system   coredns-78fcd69978-w5b2b                   1/1     Running   2 (37m ago)   6d20h   172.20.0.142   k-worker   <none>           <none>
    kube-system   etcd-k-master                              1/1     Running   2 (39m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-apiserver-k-master                    1/1     Running   2 (39m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-controller-manager-k-master           1/1     Running   3 (37m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-proxy-5mhkn                           1/1     Running   2 (37m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-proxy-jlh6w                           1/1     Running   2 (39m ago)   6d20h   10.1.1.5       k-worker   <none>           <none>
    kube-system   kube-scheduler-k-master                    1/1     Running   3 (39m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    

curl to nginx does not work from the master node, but it works from the worker where nginx is deployed.

    $ ip r g 172.20.0.141
    172.20.0.141 via 10.1.1.5 dev tunl0 src 172.20.0.0 uid 1000
        cache expires 509sec mtu 1480
    

tcpdump on the master shows packets on tunl0:

root@k-master:~# tcpdump -i tunl0
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
    11:26:59.253059 IP k-master.54410 > 172.20.0.141.http: Flags [S], seq 3382042766, win 64800, options [mss 1440,sackOK,TS val 1329149067 ecr 0,nop,wscale 7], length 0
    11:27:00.284389 IP k-master.54410 > 172.20.0.141.http: Flags [S], seq 3382042766, win 64800, options [mss 1440,sackOK,TS val 1329150099 ecr 0,nop,wscale 7], length 0
    11:27:02.300347 IP k-master.54410 > 172.20.0.141.http: Flags [S], seq 3382042766, win 64800, options [mss 1440,sackOK,TS val 1329152115 ecr 0,nop,wscale 7], length 0
    

while there is nothing on the worker.

    Any ideas what to check?

  • anro
    anro Posts: 3

OK, I think I've found the problem:

    Does Azure support Calico networking?

    Calico in VXLAN mode is supported on Azure. However, IPIP packets are blocked by the Azure network fabric.

Is there a way to convert the lab setup to use VXLAN?

  • anro
    anro Posts: 3

I was able to change Calico to VXLAN:

Get the current ipPool config:

    calicoctl get ipPool default-ipv4-ippool -o yaml | tee default-pool.yaml
    

Change these settings in default-pool.yaml:

      ipipMode: Never
      vxlanMode: Always
    

Apply the changes:

    calicoctl replace -f default-pool.yaml
    

curl to the nginx endpoint works now. However, it is not clear whether this will lead to more issues in the following labs.
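The three steps above can probably be collapsed into a single command with calicoctl's patch subcommand (assuming a calicoctl v3.x binary configured to reach the cluster datastore):

```shell
# Switch the default IP pool from IPIP to VXLAN encapsulation
# in one step, instead of the export/edit/replace round trip
calicoctl patch ippool default-ipv4-ippool --patch \
  '{"spec": {"ipipMode": "Never", "vxlanMode": "Always"}}'
```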

  • chrispokorni
    chrispokorni Posts: 1,656

    Hi @anro,

As you discovered, on Azure the Calico CNI network plugin requires additional configuration steps which are not needed on AWS, GCP, or even some local hypervisors.

    Regards,
    -Chris

  • Hi @chrispokorni
Indeed, if I remember correctly, I opened TCP and UDP traffic but not "All traffic" as specified in the video. That must be it.

    Hi @anro
Clearly, these internal networks between pods and nodes are the pain point...
