Lab 3.4 - Curl timeout between cp and worker

aalang · December 2021

I have two EC2 instances in AWS. The VPC CIDR is 10.200.0.0/16. Both instances are in the same subnet, and the security group allows all ICMP and all TCP traffic. I can ping between the instances, and the worker node successfully joined the control plane node.

Curling the endpoint IP or the cluster IP from the control plane results in a timeout. Curling either from the worker node (where the pod resides) returns HTML.
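
A timeout and a refused connection point to different problems, so it can help to look at curl's exit status rather than only at the missing output. The sketch below is a small helper (the function name is made up for illustration) that maps the exit codes documented in the curl man page to a short hint; the wording of the hints is an interpretation, not something curl itself reports.

```shell
# Map a curl exit status to a short diagnosis hint.
# Exit codes per the curl man page:
#   0 = success, 7 = failed to connect to host, 28 = operation timed out.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok: got a response" ;;
    7)  echo "connect failed: refused or no route to the target" ;;
    28) echo "timeout: packets are probably dropped on the path" ;;
    *)  echo "other: curl exit code $1" ;;
  esac
}

# Usage on a node (cluster IP taken from the service listing in this thread):
#   curl -s -o /dev/null --connect-timeout 5 http://10.103.194.201
#   explain_curl_exit $?
```

A timeout (28) from one node combined with a normal response from the other usually suggests that traffic is being silently dropped between the nodes, for example by a firewall or security group rule.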

So I assume there is a network communication issue in my setup. I have rebuilt multiple times to make sure I didn't miss something, with the same results. I also went through the class forum threads, but none of the issues and resolutions I found worked for me. I'm looking for troubleshooting ideas.

The apparmor package was uninstalled, and ufw status reports inactive.

Here is some configuration info:
ubuntu@cp:~$ kubectl get ep,svc
NAME                   ENDPOINTS           AGE
endpoints/kubernetes   10.200.1.30:6443    30m
endpoints/nginx        192.168.171.67:80   10m

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP   30m
service/nginx        ClusterIP   10.103.194.201   <none>        80/TCP    10m

ubuntu@cp:~$ ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:91:ee:d2:7d:00 brd ff:ff:ff:ff:ff:ff
    inet 10.200.1.30/24 brd 10.200.1.255 scope global dynamic ens5
       valid_lft 2286sec preferred_lft 2286sec
    inet6 fe80::91:eeff:fed2:7d00/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:e9:57:60:7f brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
4: cali6a7ba766db1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
       valid_lft forever preferred_lft forever
5: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 8981 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 192.168.242.64/32 scope global tunl0
       valid_lft forever preferred_lft forever
8: caliba6ef350904@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8981 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
       valid_lft forever preferred_lft forever
9: cali05be81d78ea@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8981 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
       valid_lft forever preferred_lft forever

root@worker:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:23:bd:39:4d:48 brd ff:ff:ff:ff:ff:ff
    inet 10.200.1.126/24 brd 10.200.1.255 scope global dynamic ens5
       valid_lft 2233sec preferred_lft 2233sec
    inet6 fe80::23:bdff:fe39:4d48/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:d1:36:28:8c brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
4: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 8981 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 192.168.171.64/32 scope global tunl0
       valid_lft forever preferred_lft forever
9: calif0c4653a6ca@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8981 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
       valid_lft forever preferred_lft forever
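
In listings this long it is easy to miss that tunl0 on both nodes has the link type ipip, which hints at how Pod traffic travels between the nodes. The following is a minimal sketch, assuming ip addr show output on stdin, that prints every interface whose link type is neither ether nor loopback. The function name tunnel_links is made up for this example.

```shell
# Print "<interface> <link-type>" for each tunnel-style link found in
# `ip addr show` output read from stdin. Ethernet and loopback links
# are skipped; leading indentation does not matter.
tunnel_links() {
  awk '
    # Interface header line, e.g. "5: tunl0@NONE: <NOARP,...>"
    /^[0-9]+: / { name = $2; sub(/:$/, "", name); sub(/@.*/, "", name) }
    # Link line, e.g. "link/ipip 0.0.0.0 brd 0.0.0.0"
    $1 ~ /^link\// && $1 != "link/ether" && $1 != "link/loopback" {
      kind = $1; sub(/^link\//, "", kind); print name, kind
    }
  '
}

# Usage on a node:
#   ip addr show | tunnel_links
```

For the two listings above, this would report only tunl0 with type ipip on each node.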

chrispokorni · December 2021

Hi @aalang,

Are you in the default VPC, or have you created a custom VPC for the labs? The SG should allow all protocols to all ports from all sources.

The introductory chapter includes a demo video for AWS; it may help with configuration tips.

Regards,
-Chris

aalang · December 2021

When I switched the security group rule from allowing all TCP to allowing all traffic, it worked.
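
For anyone who prefers the command line to the console, the same kind of change can be sketched with the AWS CLI. This is an assumption-laden example: the helper name and the group ID are made up, it only prints the command instead of running it, and it opens all protocols between members of the same security group, which is narrower than allowing all sources.

```shell
# Print (do not run) an AWS CLI command that lets members of one security
# group exchange traffic of every IP protocol with each other.
# The group ID is supplied by the caller; the ID in the usage line is made up.
allow_all_within_group_cmd() {
  printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol all --source-group %s\n' "$1" "$1"
}

# Usage (hypothetical group ID): review the printed command, then run it.
#   allow_all_within_group_cmd sg-0123456789abcdef0
```

Printing the command first keeps the sketch safe to try; check the flags against the AWS CLI reference for your CLI version before running the result.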

For a more advanced understanding of how Kubernetes communication works: I assume this means important UDP traffic was being blocked? What non-TCP communication was blocked that would prevent a curl request to the endpoint from connecting across nodes?

chrispokorni · December 2021

Hi @aalang,

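CoreDNS uses UDP, and Calico uses UDP when it runs in VXLAN mode. In IP-in-IP mode, which the tunl0 interface of type ipip in your output suggests, Calico encapsulates Pod traffic as IP protocol 4, which is neither TCP nor UDP. Other Kubernetes plugins may use non-TCP protocols as well.

Without that traffic, Calico is not able to successfully build the cluster-wide Pod-to-Pod network across all cluster Nodes. This is one of the reasons why the AWS setup video guide recommends allowing all protocols.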
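
Why a rule set of "all TCP" plus "all ICMP" can still break Pod traffic is easier to see with protocol numbers. The sketch below is a toy model, not how AWS evaluates rules: the function name and the rule notation are invented for illustration, and it assumes Calico is encapsulating with IP-in-IP (IANA protocol number 4), as the ipip link type on tunl0 earlier in the thread suggests.

```shell
# Succeed if a rule set admits the given IP protocol number.
# $1: space-separated rules, each a protocol name, a number, or "all"
# $2: IP protocol number to check
# IANA protocol numbers: 1 = ICMP, 4 = IP-in-IP, 6 = TCP, 17 = UDP.
rules_admit() (
  wanted="$2"
  for rule in $1; do
    case "$rule" in
      all|-1) exit 0 ;;          # every protocol is allowed
      icmp)   num=1 ;;
      tcp)    num=6 ;;
      udp)    num=17 ;;
      *)      num="$rule" ;;     # already a protocol number
    esac
    [ "$num" = "$wanted" ] && exit 0
  done
  exit 1
)

# Usage:
#   rules_admit "tcp icmp" 4 || echo "IP-in-IP would be dropped"
#   rules_admit "all" 4      && echo "IP-in-IP would pass"
```

Under this model the original rules admit protocols 1 and 6 only, so encapsulated Pod traffic (protocol 4) never reaches the other node even though ping and the kubeadm join both succeed.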

Regards,
-Chris

aalang · December 2021

@chrispokorni thank you for the extra information. It is very much appreciated.
