[Lab 3.4] tcpdump stays empty

thomas.bucaioni
thomas.bucaioni Posts: 158
edited April 10 in LFS258 Class Forum

Hello,
Here is the setup:

$ kubectl get endpoints nginx 
NAME    ENDPOINTS                                           AGE
nginx   192.168.19.4:80,192.168.86.67:80,192.168.86.69:80   25m
$ kubectl get service nginx 
NAME    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
nginx   ClusterIP   10.100.200.161   <none>        80/TCP    27m
$ curl 10.100.200.161:80

The curl command shows the nginx welcome page only some of the time, and tcpdump stays blank. Is anything wrong? Both the worker and the cp node are running tcpdump on the tunnel interface and see nothing:

sudo tcpdump -i tunl0

Comments

  • chrispokorni
    chrispokorni Posts: 1,458

    Hi @thomas.bucaioni,

    The concerning behavior of the nginx service is that it only "sometimes" shows the nginx welcome page. This symptom typically indicates that the nodes are not networked together to Kubernetes' liking. Assuming the nodes are on the same network, this may be a firewall issue, where the firewall blocks required protocols or ports. As described in the setup videos, the firewall should be open to all traffic: all sources, all protocols, and all destination ports. Anything less may cause these types of issues. Are all the control plane pods running? What is the output of

    kubectl get pods -A

    Regards,
    -Chris

  • Hi @chrispokorni
    It could well be my firewall, since it's custom. The output of kubectl get pods -A is:

    $ kubectl get pods -A
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS     AGE
    default       nginx-74d589986c-4ncr8                     1/1     Running   1 (6d ago)   6d1h
    default       nginx-74d589986c-bwfkk                     1/1     Running   1 (6d ago)   6d2h
    default       nginx-74d589986c-l5p5p                     1/1     Running   1 (6d ago)   6d2h
    kube-system   calico-kube-controllers-56fcbf9d6b-bnvxg   1/1     Running   1 (6d ago)   6d4h
    kube-system   calico-node-gfrl4                          0/1     Running   1 (6d ago)   6d4h
    kube-system   calico-node-rn8pb                          0/1     Running   1            6d4h
    kube-system   coredns-64897985d-9wzkz                    1/1     Running   1 (6d ago)   6d6h
    kube-system   coredns-64897985d-ff8r8                    1/1     Running   1 (6d ago)   6d6h
    kube-system   etcd-dl-dt-03                              1/1     Running   6 (6d ago)   6d6h
    kube-system   kube-apiserver-dl-dt-03                    1/1     Running   7 (6d ago)   6d6h
    kube-system   kube-controller-manager-dl-dt-03           1/1     Running   6 (6d ago)   6d6h
    kube-system   kube-proxy-dc2dn                           1/1     Running   2 (6d ago)   6d6h
    kube-system   kube-proxy-tkhfr                           1/1     Running   2 (6d ago)   6d6h
    kube-system   kube-scheduler-dl-dt-03                    1/1     Running   6 (6d ago)   6d6h
    

    Otherwise, here is my firewall:

    $ cat bin/firewall.sh
    #!/bin/sh
    #
    # firewall.sh
    
    # WAN and LAN interfaces
    IFACE_LAN=enp2s0
    IFACE_WAN=enp0s29f7u7
    IFACE_LAN_IP=172.168.1.0/24
    
    # Accept all
    iptables -t filter -P INPUT ACCEPT
    iptables -t filter -P FORWARD ACCEPT
    iptables -t filter -P OUTPUT ACCEPT
    iptables -t nat -P INPUT ACCEPT
    iptables -t nat -P PREROUTING ACCEPT
    iptables -t nat -P POSTROUTING ACCEPT
    iptables -t nat -P OUTPUT ACCEPT
    iptables -t mangle -P INPUT ACCEPT
    iptables -t mangle -P PREROUTING ACCEPT
    iptables -t mangle -P FORWARD ACCEPT
    iptables -t mangle -P POSTROUTING ACCEPT
    iptables -t mangle -P OUTPUT ACCEPT
    
    # Reset the counters
    iptables -t filter -Z
    iptables -t nat -Z
    iptables -t mangle -Z
    
    # Delete all active rules and personalized chains
    iptables -t filter -F
    iptables -t filter -X
    iptables -t nat -F
    iptables -t nat -X
    iptables -t mangle -F
    iptables -t mangle -X
    
    # Default policy
    iptables -P INPUT DROP
    iptables -P FORWARD ACCEPT
    iptables -P OUTPUT ACCEPT
    
    # Trust ourselves
    iptables -A INPUT -i lo -j ACCEPT
    #iptables -A INPUT -i lo --dport 6443 -j ACCEPT
    #iptables -A INPUT -i lo --sport 6443 -j ACCEPT
    
    # Ping
    iptables -A INPUT -p icmp --icmp-type echo-request -j ACCEPT
    iptables -A INPUT -p icmp --icmp-type time-exceeded -j ACCEPT
    iptables -A INPUT -p icmp --icmp-type destination-unreachable -j ACCEPT
    
    # Established connections
    iptables -A INPUT -m state --state ESTABLISHED -j ACCEPT
    
    # SSH
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 22 -j ACCEPT
    #iptables -A INPUT -p tcp -i $IFACE_WAN --dport 22 -j ACCEPT
    
    #iptables -A INPUT -p tcp -i $IFACE_WAN --sport 3000 -j ACCEPT
    #iptables -A INPUT -p tcp -i $IFACE_WAN --dport 3000 -j ACCEPT
    #iptables -A INPUT -p tcp -i $IFACE_LAN --dport 3000 -j ACCEPT
    #iptables -A INPUT -p tcp -i $IFACE_LAN --sport 3000 -j ACCEPT
    
    #iptables -A INPUT -p udp -i $IFACE_WAN --sport 3000 -j ACCEPT
    #iptables -A INPUT -p udp -i $IFACE_WAN --dport 3000 -j ACCEPT
    #iptables -A INPUT -p udp -i $IFACE_LAN --dport 3000 -j ACCEPT
    #iptables -A INPUT -p udp -i $IFACE_LAN --sport 3000 -j ACCEPT
    
    # Kubernetes
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 6443 -j ACCEPT
    iptables -A INPUT -p udp -i $IFACE_LAN --dport 6443 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --sport 6443 -j ACCEPT
    iptables -A INPUT -p udp -i $IFACE_LAN --sport 6443 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 6449 -j ACCEPT
    iptables -A INPUT -p udp -i $IFACE_LAN --dport 6449 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --sport 6449 -j ACCEPT
    iptables -A INPUT -p udp -i $IFACE_LAN --sport 6449 -j ACCEPT
    
    # Dnsmasq
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 53 -j ACCEPT
    iptables -A INPUT -p udp -i $IFACE_LAN --dport 53 -j ACCEPT
    iptables -A INPUT -p udp -i $IFACE_LAN --dport 67:68 -j ACCEPT
    
    # TCP
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 80 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_WAN --dport 80 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 443 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_WAN --dport 443 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --sport 80 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_WAN --sport 80 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --sport 443 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_WAN --sport 443 -j ACCEPT
    
    # Packet forwarding activation
    iptables -t nat -A POSTROUTING -o $IFACE_WAN -s $IFACE_LAN_IP -j MASQUERADE
    sysctl -q -w net.ipv4.ip_forward=1
    
    # NFS
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 2049 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --sport 2049 -j ACCEPT
    
    # Samba
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 445 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --sport 445 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --dport 139 -j ACCEPT
    iptables -A INPUT -p tcp -i $IFACE_LAN --sport 139 -j ACCEPT
    
    # NTP
    iptables -A INPUT -p udp -i $IFACE_LAN --dport 123 -j ACCEPT
    
    # Log refused packets
    iptables -A INPUT -m limit --limit 2/min -j LOG --log-prefix "IPv4 packet rejected ++ "
    iptables -A INPUT -j DROP
    
    # Save the configuration
    service iptables save
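
A note on the script above: its INPUT chain ends in DROP and, for Kubernetes, only opens ports 6443 and 6449. Traffic that Calico and the kubelet normally exchange between nodes (BGP on TCP 179, IP-in-IP as IP protocol 4, the kubelet API on TCP 10250) would therefore be dropped, which may explain the calico-node pods sitting at 0/1. Below is a minimal sketch, assuming Calico's default IP-in-IP setup. The function name `k8s_node_rules` and the node range passed to it are hypothetical, and the list is a starting point rather than a complete rule set (the course setup opens all traffic instead). It only prints the rules so they can be reviewed before being applied:

```shell
# Print (do not apply) iptables rules that let cluster nodes reach each other.
# $1: LAN interface (enp2s0 in the script above)
# $2: source range of the other nodes (assumed value, adjust to your LAN)
k8s_node_rules() {
  iface=$1
  nodes=$2
  # IP-in-IP (IP protocol 4): the encapsulation carried by Calico's tunl0 device
  echo "iptables -I INPUT -i $iface -s $nodes -p 4 -j ACCEPT"
  # TCP ports: 179 = BGP (Calico), 6443 = API server, 10250 = kubelet API
  for port in 179 6443 10250; do
    echo "iptables -I INPUT -i $iface -s $nodes -p tcp --dport $port -j ACCEPT"
  done
}
```

For example, `k8s_node_rules enp2s0 192.168.1.0/24` prints four rules for the LAN interface used in the script. The /24 range is an assumption based on the 192.168.1.194 address that shows up in the join command later in the thread. The rules use `-I` so that they land ahead of the final DROP rule when run after the script.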
    
  • Even after flushing the firewall, curl doesn't reach all the nodes:

    systemctl stop iptables
    systemctl disable iptables
    systemctl status iptables
    iptables --flush
    service iptables save
    cat  /etc/sysconfig/iptables
    
  • The cp node is on a router that is connected to a box on one interface and to the workers on the other interface.
    But the box seems to use 192.168.x.x addresses, which could interfere with Calico?
    If I change the Calico configuration to 182.168.x.x, maybe it will work.

  • Even after changing the pod subnet in the kubeadm configuration to:

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    kubernetesVersion: 1.22.1
    controlPlaneEndpoint: "k8scp:6443"
    networking:
      podSubnet: 182.168.0.0/16
    

    the join command proposed is:

    kubeadm join 192.168.1.194:6443 --token v9ii23.bz2vgnyxttimr3tu --discovery-token-ca-cert-hash sha256:5c5c0dd3cd3e2a75a27f119cd637ee82fac7b9febb7671cf5272c16d465683ab
    
  • My router already has a name, dl-dt-03, so I guess during the install I need to replace every occurrence of k8scp with the router's hostname?

  • So, I've put the worker node on the router and the cp node on the former worker node. Now the service has no endpoints:

    $ kubectl get ep nginx 
    NAME    ENDPOINTS   AGE
    nginx   <none>      9m35s
    

    Apparently, some pods are frozen:

    $ kubectl get pods -A
    NAMESPACE     NAME                                      READY   STATUS              RESTARTS   AGE
    default       nginx-74d589986c-zpc74                    0/1     ContainerCreating   0          13m
    default       nginx-85b98978db-frgdh                    0/1     ContainerCreating   0          20m
    kube-system   calico-kube-controllers-7c845d499-9l9x9   1/1     Running             0          42m
    kube-system   calico-node-xgz2k                         1/1     Running             0          42m
    kube-system   calico-node-xwvl6                         0/1     Init:0/3            0          37m
    kube-system   coredns-64897985d-hjn7c                   1/1     Running             0          44m
    kube-system   coredns-64897985d-zgxqp                   1/1     Running             0          44m
    kube-system   etcd-hp-tw-01                             1/1     Running             0          45m
    kube-system   kube-apiserver-hp-tw-01                   1/1     Running             0          45m
    kube-system   kube-controller-manager-hp-tw-01          1/1     Running             0          45m
    kube-system   kube-proxy-25xmk                          0/1     ContainerCreating   0          37m
    kube-system   kube-proxy-4q728                          1/1     Running             0          44m
    kube-system   kube-scheduler-hp-tw-01                   1/1     Running             0          45m
    
  • thomas.bucaioni
    thomas.bucaioni Posts: 158
    edited April 16

    Anyway, for the training, maybe I can run everything from the cp node without a worker?

  • Finally, I created two instances on AWS, but the join command gets stuck:

    $ kubeadm join k8scp:6443 --token jlb7a6.azs6ad1ocv7nuh75 --discovery-token-ca-cert-hash sha256:0f00ba05e423ad5d51cb18343b9a97c0b0cd73b81ab5a948ee2208d1051085d5 --v=5
    I0417 14:30:10.929810   31767 join.go:405] [preflight] found NodeName empty; using OS hostname as NodeName
    I0417 14:30:10.930110   31767 initconfiguration.go:116] detected and using CRI socket: /var/run/dockershim.sock
    [preflight] Running pre-flight checks
    I0417 14:30:10.930376   31767 preflight.go:92] [preflight] Running general checks
    I0417 14:30:10.930539   31767 checks.go:245] validating the existence and emptiness of directory /etc/kubernetes/manifests
    I0417 14:30:10.930680   31767 checks.go:282] validating the existence of file /etc/kubernetes/kubelet.conf
    I0417 14:30:10.930745   31767 checks.go:282] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
    I0417 14:30:10.930831   31767 checks.go:106] validating the container runtime
    I0417 14:30:10.983222   31767 checks.go:132] validating if the "docker" service is enabled and active
    I0417 14:30:10.998636   31767 checks.go:331] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
    I0417 14:30:10.998725   31767 checks.go:331] validating the contents of file /proc/sys/net/ipv4/ip_forward
    I0417 14:30:10.998790   31767 checks.go:649] validating whether swap is enabled or not
    I0417 14:30:10.998856   31767 checks.go:372] validating the presence of executable conntrack
    I0417 14:30:10.998909   31767 checks.go:372] validating the presence of executable ip
    I0417 14:30:10.998964   31767 checks.go:372] validating the presence of executable iptables
    I0417 14:30:10.999005   31767 checks.go:372] validating the presence of executable mount
    I0417 14:30:10.999054   31767 checks.go:372] validating the presence of executable nsenter
    I0417 14:30:10.999102   31767 checks.go:372] validating the presence of executable ebtables
    I0417 14:30:10.999141   31767 checks.go:372] validating the presence of executable ethtool
    I0417 14:30:10.999199   31767 checks.go:372] validating the presence of executable socat
    I0417 14:30:10.999263   31767 checks.go:372] validating the presence of executable tc
    I0417 14:30:10.999311   31767 checks.go:372] validating the presence of executable touch
    I0417 14:30:10.999376   31767 checks.go:520] running all checks
    I0417 14:30:11.054823   31767 checks.go:403] checking whether the given node name is valid and reachable using net.LookupHost
    I0417 14:30:11.057728   31767 checks.go:618] validating kubelet version
    I0417 14:30:11.137518   31767 checks.go:132] validating if the "kubelet" service is enabled and active
    I0417 14:30:11.148454   31767 checks.go:205] validating availability of port 10250
    I0417 14:30:11.148620   31767 checks.go:282] validating the existence of file /etc/kubernetes/pki/ca.crt
    I0417 14:30:11.148651   31767 checks.go:432] validating if the connectivity type is via proxy or direct
    I0417 14:30:11.148703   31767 join.go:475] [preflight] Discovering cluster-info
    I0417 14:30:11.148756   31767 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "k8scp:6443"
    I0417 14:30:21.149488   31767 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://k8scp:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    I0417 14:30:37.057291   31767 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://k8scp:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    
    

    Any idea what goes wrong?
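
One thing worth ruling out when the join hangs at the discovery step is whether `k8scp` resolves on the worker at all. The name is meant to be an alias (typically an /etc/hosts entry), so a missing or wrong entry on the worker would produce this kind of timeout, as would a blocked port 6443. Here is a small sketch for the first check; the helper name `hosts_lookup` is made up for illustration:

```shell
# Print the IP address that a hosts file maps to the given name.
# $1: host name or alias to look for
# $2: hosts file to read (defaults to /etc/hosts)
# Exits non-zero when the name is not listed.
hosts_lookup() {
  name=$1
  file=${2:-/etc/hosts}
  awk -v name="$name" '
    { sub(/#.*/, "") }   # strip comments; awk re-splits the fields afterwards
    { for (i = 2; i <= NF; i++) if ($i == name) { print $1; found = 1; exit } }
    END { exit !found }
  ' "$file"
}
```

On the worker, `hosts_lookup k8scp` should print the control plane's private IP. If it does, the next suspect is the network path to port 6443 (on AWS, the security group).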

  • So after setting the security group to accept everything, the worker node managed to join. But the cp node says:

    # kubectl get nodes
    The connection to the server localhost:8080 was refused - did you specify the right host or port?
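
This message usually means kubectl found no kubeconfig and fell back to its default of localhost:8080. The `#` prompt suggests the command ran as root, while kubeadm's post-init steps copy /etc/kubernetes/admin.conf into the regular user's $HOME/.kube/config. Below is a hedged sketch of a helper that reports which kubeconfig is available. `find_kubeconfig` is a hypothetical name, and it treats $KUBECONFIG as a single path (kubectl itself also accepts a colon-separated list):

```shell
# Print the first readable kubeconfig among the usual locations.
# Order: $KUBECONFIG, then the per-user file, then kubeadm's admin.conf.
find_kubeconfig() {
  for f in "${KUBECONFIG:-}" "$HOME/.kube/config" /etc/kubernetes/admin.conf; do
    if [ -n "$f" ] && [ -r "$f" ]; then
      echo "$f"
      return 0
    fi
  done
  echo "no readable kubeconfig found" >&2
  return 1
}
```

On the cp node, `export KUBECONFIG=$(find_kubeconfig)` followed by `kubectl get nodes` should then talk to the API server instead of localhost:8080, provided one of the files exists.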
    
  • Calico was not running. Everything is fine now.

  • chrispokorni
    chrispokorni Posts: 1,458

    Hi @thomas.bucaioni,

    In summary, when Calico is not running, the cluster does not behave as expected either. Calico is responsible for managing the pod network, which also impacts some of the control plane pods.

    The IP subnets should be distinct in a cluster, meaning that the pod network (Calico's default 192.168.0.0/16), the node network, and also the services network (the cluster's default 10.96.0.0/12) should not overlap. In local environments it is typical to see the pod and node networks overlap, because many private networks use the 192.168.0.0/x default subnet. This causes issues because all these IP addresses are entered into iptables, where the cluster cannot tell whether an IP address represents a pod or a node.

    The k8scp entry is intended to be an alias only, not a hostname. It will help with chapter 16 on HA. You can build your cluster without the alias, but then ch 16 will require a fresh rebuild, assuming that the instructions from ch 3 are followed as presented.

    For AWS-EC2 and GCP-GCE infrastructures, you can find video guides for each environment's configuration, where VPC and firewall/security group considerations are discussed as well.

    Regards,
    -Chris
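
The subnet overlap described in this reply can be checked before running kubeadm init. Here is a minimal sketch in POSIX shell, assuming IPv4 only; the function names `ip_to_int` and `cidr_overlap` are made up for this example:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit 0) when two IPv4 CIDR blocks share at least one address.
# Two blocks overlap exactly when they agree on the shorter of the two prefixes.
cidr_overlap() {
  len1=${1#*/}
  len2=${2#*/}
  if [ "$len1" -lt "$len2" ]; then len=$len1; else len=$len2; fi
  mask=$(( (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF ))
  net1=$(( $(ip_to_int "${1%/*}") & $mask ))
  net2=$(( $(ip_to_int "${2%/*}") & $mask ))
  [ "$net1" -eq "$net2" ]
}
```

For instance, `cidr_overlap 192.168.0.0/16 192.168.1.0/24 && echo overlap` prints `overlap`, because Calico's default pod network contains a 192.168.1.x node network. The /24 mask is an assumption, since the thread only shows the node address 192.168.1.194. The same call with 10.96.0.0/12 as the second argument prints nothing.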

  • So, on AWS everything works: tcpdump, the load balancer, and even access from outside the cluster. Fixed.

  • Hi @chrispokorni,
    Just saw your answer. Indeed, the reason it didn't work on my bare-metal PCs must be the overlap of IP ranges. Thanks for confirming.
    The cluster is ready for chapter 16 then, no worries.
    Starting back at chapter 1, I watched the videos and gave it a try on AWS. Everything is clear, and it works perfectly now.
    Cheers,
    Thomas
