Welcome to the Linux Foundation Forum!

[clusterip, endpoint] ping not working

Hello,
I just retried an installation from scratch on AWS and it seems OK: the control plane and worker nodes see each other.
I then exposed a simple deployment as in lab 3.4, item 17, which lands on the worker node. But neither curl nor ping can reach the ClusterIP or the endpoint.
Any idea what went wrong? The security group is wide open and ufw is disabled on both nodes.
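For reference, the failing checks look roughly like this (the service name `nginx` and port 80 are assumptions based on later posts in this thread, not the lab text itself):

```shell
# Expose the deployment as a ClusterIP service, lab 3.4 style
# (the name "nginx" and port 80 are assumptions)
kubectl expose deployment nginx --port=80 --type=ClusterIP

# Look up the service ClusterIP and the endpoint (pod) IPs
kubectl get svc nginx
kubectl get ep nginx

# From either node, both of these should respond if pod networking
# and the Service proxy are healthy; substitute the IPs from above
curl http://<cluster-ip>:80
ping <endpoint-ip>
```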

Best Answers

  • leopastorsdg
    leopastorsdg Posts: 14
    Answer ✓

Maybe check the network plugin setup?

  • chrispokorni
    chrispokorni Posts: 1,656
    Answer ✓

    Hi @thomas.bucaioni,

For AWS, Chapter 1 includes a video guide outlining the VPC and SG configuration options that ensure the cluster nodes can fully access each other, which in turn allows cross pinging and curling of Service ClusterIP addresses and endpoint pod IP addresses.
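For the record, a rule that opens all traffic between nodes sharing a security group can be added with the AWS CLI roughly as follows (the group ID is a placeholder). Note that Calico's default IPIP encapsulation is IP protocol 4, which is neither TCP nor UDP, so a rule opening only TCP and UDP ports will still block it; an "All traffic" rule avoids that.

```shell
# Allow all traffic (all protocols, all ports) between instances
# in the same security group; the group ID below is a placeholder
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol -1 \
  --source-group sg-0123456789abcdef0
```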

    Regards,
    -Chris

Answers

  • Maybe 192.168.0.0/16 is a bit too large for AWS... Locally it works with a /24, so I'll retry there with the same setting.
    Thanks

  • anro
    anro Posts: 3
    edited May 2022

    Hi,
    Same issue here. @thomas.bucaioni were you able to find the solution?
    My setup:
    k-master eth0: 10.1.1.4/24
    k-worker eth0: 10.1.1.5/24

    calico net: 172.20.0.0/24

    $ kubectl get nodes -o wide
    NAME       STATUS   ROLES                  AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
    k-master   Ready    control-plane,master   6d20h   v1.22.1   10.1.1.4      <none>        Ubuntu 20.04.4 LTS   5.13.0-1025-azure   docker://20.10.12
    k-worker   Ready    <none>                 6d20h   v1.22.1   10.1.1.5      <none>        Ubuntu 20.04.4 LTS   5.13.0-1025-azure   docker://20.10.12
    
    $ kubectl get svc -o wide --all-namespaces
    NAMESPACE     NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE     SELECTOR
    default       kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP                  6d20h   <none>
    default       nginx        ClusterIP   10.105.250.80   <none>        80/TCP                   4d20h   app=nginx
    kube-system   kube-dns     ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP   6d20h   k8s-app=kube-dns
    
    $ kubectl get ep --all-namespaces
    NAMESPACE     NAME         ENDPOINTS                                                     AGE
    default       kubernetes   10.1.1.4:6443                                                 6d20h
    default       nginx        172.20.0.141:80,172.20.0.144:80,172.20.0.145:80               4d20h
    kube-system   kube-dns     172.20.0.142:53,172.20.0.143:53,172.20.0.142:53 + 3 more...   6d20h
    
    $ kubectl get pods --all-namespaces -o wide
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE     IP             NODE       NOMINATED NODE   READINESS GATES
    default       nginx-7848d4b86f-4km4g                     1/1     Running   2 (39m ago)   4d19h   172.20.0.145   k-worker   <none>           <none>
    default       nginx-7848d4b86f-8sm9r                     1/1     Running   2 (39m ago)   4d20h   172.20.0.144   k-worker   <none>           <none>
    default       nginx-7848d4b86f-cn9f4                     1/1     Running   2 (39m ago)   4d19h   172.20.0.141   k-worker   <none>           <none>
    kube-system   calico-kube-controllers-685b65ddf9-c5qtz   1/1     Running   2 (37m ago)   6d20h   172.20.0.5     k-master   <none>           <none>
    kube-system   calico-node-c8p87                          1/1     Running   2 (37m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   calico-node-z89pr                          1/1     Running   2 (39m ago)   6d20h   10.1.1.5       k-worker   <none>           <none>
    kube-system   coredns-78fcd69978-qmlgh                   1/1     Running   2 (39m ago)   6d20h   172.20.0.143   k-worker   <none>           <none>
    kube-system   coredns-78fcd69978-w5b2b                   1/1     Running   2 (37m ago)   6d20h   172.20.0.142   k-worker   <none>           <none>
    kube-system   etcd-k-master                              1/1     Running   2 (39m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-apiserver-k-master                    1/1     Running   2 (39m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-controller-manager-k-master           1/1     Running   3 (37m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-proxy-5mhkn                           1/1     Running   2 (37m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    kube-system   kube-proxy-jlh6w                           1/1     Running   2 (39m ago)   6d20h   10.1.1.5       k-worker   <none>           <none>
    kube-system   kube-scheduler-k-master                    1/1     Running   3 (39m ago)   6d20h   10.1.1.4       k-master   <none>           <none>
    

curl to nginx does not work from the master node, but it works from the worker where nginx is deployed.

    $ ip r g 172.20.0.141
    172.20.0.141 via 10.1.1.5 dev tunl0 src 172.20.0.0 uid 1000
        cache expires 509sec mtu 1480
    

tcpdump on the master shows packets on tunl0:

root@k-master:~# tcpdump -i tunl0
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
    11:26:59.253059 IP k-master.54410 > 172.20.0.141.http: Flags [S], seq 3382042766, win 64800, options [mss 1440,sackOK,TS val 1329149067 ecr 0,nop,wscale 7], length 0
    11:27:00.284389 IP k-master.54410 > 172.20.0.141.http: Flags [S], seq 3382042766, win 64800, options [mss 1440,sackOK,TS val 1329150099 ecr 0,nop,wscale 7], length 0
    11:27:02.300347 IP k-master.54410 > 172.20.0.141.http: Flags [S], seq 3382042766, win 64800, options [mss 1440,sackOK,TS val 1329152115 ecr 0,nop,wscale 7], length 0
    

while there is nothing on the worker.

    Any ideas what to check?

  • anro
    anro Posts: 3

OK, I think I've found the problem:

    Does Azure support Calico networking?

    Calico in VXLAN mode is supported on Azure. However, IPIP packets are blocked by the Azure network fabric.

Is there a way to convert the lab setup to use VXLAN?

  • anro
    anro Posts: 3

I was able to change Calico to VXLAN:

Get the current ipPool config:

    calicoctl get ipPool default-ipv4-ippool -o yaml | tee default-pool.yaml
    

Change these settings in default-pool.yaml:

      ipipMode: Never
      vxlanMode: Always
    

Apply the changes:

    calicoctl replace -f default-pool.yaml
    

curl to the nginx endpoint works now. However, it is not clear whether this will lead to more issues in the following labs.
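The three steps above can probably be collapsed into a single command with calicoctl's patch subcommand (assuming a calicoctl v3.x binary configured to reach the cluster datastore):

```shell
# Switch the default IP pool from IPIP to VXLAN encapsulation
# in one step, instead of the export/edit/replace round trip
calicoctl patch ippool default-ipv4-ippool --patch \
  '{"spec": {"ipipMode": "Never", "vxlanMode": "Always"}}'
```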

  • chrispokorni
    chrispokorni Posts: 1,656

    Hi @anro,

As you discovered, on Azure the Calico CNI network plugin requires additional configuration steps which are not needed on AWS, GCP, or even some local hypervisors.

    Regards,
    -Chris

  • Hi @chrispokorni
Indeed, if I remember correctly, I opened TCP and UDP traffic but not "All traffic" as specified in the video. That must be it.

    Hi @anro
Clearly, these internal networks between pods and nodes are the pain point...
