GCE worker node connection issue

amouskite Posts: 4
edited December 2019 in LFS258 Class Forum

Hi,

I have set up a 2-node cluster. The worker node is in GCE. I created the instance using a new (non-default) VPC which allows all traffic.
I used the default subnets/regions when defining the new VPC.
I can deploy my basic pod and my basic service, but for some reason I cannot reach the service from the other node, where the pod is not deployed.

k get svc -o wide
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE    SELECTOR
basicservice   ClusterIP   10.109.212.68   <none>        80/TCP    131m   type=webserver
kubernetes     ClusterIP   10.96.0.1       <none>        443/TCP   42h    <none>

I deactivated AppArmor and ufw.
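
For reference, on Ubuntu that is typically done with commands along these lines (the exact invocations here are an assumption, not copied from my session):

sudo systemctl stop apparmor   # stop the AppArmor service
sudo ufw disable               # turn off the ufw firewall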

Here are the iptables rules on the GCE node: iptables.txt

Any ideas why the cluster IP of the service is not reachable, and how these 10.109... IP ranges are defined/assigned?

Many thanks

Comments

  • So I went back to this issue and made a few more attempts to get rid of it.

    First, I installed Ubuntu 18.04, instead of Ubuntu 19.04, on my 2 nodes.

    • The master node is a fully managed VPS
    • The worker node is a Cloud GCE instance

    So on completely different networks.

    I also made sure that both nodes can talk to each other on all ports:

    • On the VPS node, using ufw:
      sudo ufw allow from 34.89.192.175

    • On the GCE instance, using firewall rules: an allowAll rule

    After that, I tried to redo the same steps: installing the master node, untainting it, installing the worker node...
    None of it helped, since I hit exactly the same issues in the same order:

    1. First, once the worker node joined the cluster, both Calico pods stopped being in a ready state.
      After analyzing the Calico pod logs, I found out that the worker node was using the internal IP address defined in the Google VPC (instead of the external one). It looks like the Calico pod on the master node could not reach the worker node (a quick way to check the registered node IPs is sketched after this list).
    2. So, I decided to add a NAT rule on the master so that all calls to the internal IP address of the worker node are redirected to its actual external IP:
      sudo iptables -t nat -I OUTPUT --dest 10.156.0.2/32 -j DNAT --to-dest 34.89.192.175
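
    Two quick checks at this point (commands assumed, not copied from my session): which addresses each node registered with, and whether the new nat rule is actually in place:

      kubectl get nodes -o wide                              # shows INTERNAL-IP / EXTERNAL-IP per node
      sudo iptables -t nat -L OUTPUT -n | grep 10.156.0.2    # confirms the DNAT rule is present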

    Listing the routes, I could see an extra line for 10.156.0.2 that was not there before:

    ~/k8s/lab_02$ route -n
    Kernel IP routing table
    Destination      Gateway          Genmask           Flags   Metric   Ref   Use   Iface
    0.0.0.0          207.180.230.1    0.0.0.0           UG      0        0     0     eth0
    172.17.0.0       0.0.0.0          255.255.0.0       U       0        0     0     docker0
    192.168.15.0     10.156.0.2       255.255.255.192   UG      0        0     0     tunl0
    192.168.181.128  0.0.0.0          255.255.255.255   UH      0        0     0     cali5759f80f63a
    192.168.181.128  0.0.0.0          255.255.255.192   U       0        0     0     *
    192.168.181.130  0.0.0.0          255.255.255.255   UH      0        0     0     calie464868f422
    192.168.181.131  0.0.0.0          255.255.255.255   UH      0        0     0     cali5ea36435324

    ~/k8s/lab_02$ kubectl get pods -n kube-system
    NAME                                       READY   STATUS    RESTARTS   AGE
    calico-kube-controllers-6bbf58546b-2ltv2   1/1     Running   0          5h16m
    calico-node-cmb2s                          1/1     Running   0          3h26m
    calico-node-nh5dd                          1/1     Running   0          3h26m
    coredns-5644d7b6d9-j2jjq                   1/1     Running   0          5h16m
    coredns-5644d7b6d9-xtb62                   1/1     Running   0          5h16m
    etcd-master                                1/1     Running   0          5h15m
    kube-apiserver-master                      1/1     Running   0          5h15m
    kube-controller-manager-master             1/1     Running   0          5h15m
    kube-proxy-bkj6m                           1/1     Running   12         4h28m
    kube-proxy-f46j7                           1/1     Running   0          5h16m
    kube-scheduler-master                      1/1     Running   0          5h15m

    3. After that, the Calico pods stopped complaining and were again in a ready state.
    4. With the Calico problem solved, I went forward with the course steps and then faced the same issue with the service.
      I had intentionally put the taint back to forbid scheduling on the master node (re-applying the taint is sketched right below).
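
    For reference, re-applying that taint looks roughly like this (the node name "master" is taken from the pod names above; the command is assumed, not copied from my session):

      kubectl taint nodes master node-role.kubernetes.io/master:NoSchedule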

    Curling the service from the master node (where the pod is not deployed) does not work.
    It works, however, when done from inside the pod on the worker node.

    I ran the pod first and then created the service.

    ~/k8s/lab_02$ kubectl get services
    NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    basic-service   ClusterIP   10.103.53.132   <none>        80/TCP    30m
    kubernetes      ClusterIP   10.96.0.1       <none>        443/TCP   4h43m

    I checked the iptables rules added by kube-proxy on each node:

    ~/k8s/lab_02$ sudo iptables-save | grep 10.103.53.132
    -A KUBE-SERVICES ! -s 192.168.0.0/16 -d 10.103.53.132/32 -p tcp -m comment --comment "default/basic-service: cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
    -A KUBE-SERVICES -d 10.103.53.132/32 -p tcp -m comment --comment "default/basic-service: cluster IP" -m tcp --dport 80 -j KUBE-SVC-BU6KMELWBNFQMJ6Y
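
    Following the target chain one step further should show where the ClusterIP gets DNAT-ed to the pod endpoint; roughly (the chain name is specific to my cluster):

    ~/k8s/lab_02$ sudo iptables-save -t nat | grep KUBE-SVC-BU6KMELWBNFQMJ6Y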

    During all this analysis, I found out that the Service IP range is given as a parameter to the API server on startup, and that Service IPs are virtual and are supposed to be translated at some point by kube-proxy.

    See /etc/kubernetes/manifests/kube-apiserver.yaml
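
    The relevant flag in that manifest looks roughly like this (10.96.0.0/12 is the kubeadm default; the exact value can differ per cluster):

    ~/k8s/lab_02$ sudo grep service-cluster-ip-range /etc/kubernetes/manifests/kube-apiserver.yaml
        - --service-cluster-ip-range=10.96.0.0/12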

    1. Now I suspect that those rules written by kube-proxy are not enough to redirect the traffic to the worker node/pod.
    2. I also tried to change the IP address advertised by the worker node: I changed the kubelet startup files to pass the external address, as discussed here https://github.com/kubernetes/kubeadm/issues/203 (a rough sketch of that change follows below).
      The result was not as expected, since the worker node ended up with no IP at all.
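
    Roughly, the kind of change discussed in that issue looks like the following; the file location and restart steps are assumptions about a kubeadm install on Ubuntu, not copied from my setup:

      # /etc/default/kubelet (kubeadm's drop-in for extra kubelet flags)
      KUBELET_EXTRA_ARGS=--node-ip=34.89.192.175

      # then reload and restart the kubelet
      sudo systemctl daemon-reload
      sudo systemctl restart kubelet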

    So basically, what I still have not done is check what happens exactly when I call the service IP. I cannot find any relevant information anywhere. Doing an mtr to trace the call shows only the DNS server of the hosting provider, which probably means the whole set of iptables rules did not work well.

    Any help would be much appreciated :smile:

  • To extend my analysis further, I used tshark to sniff the network.
    To my surprise, the master node could "translate" the service IP to the pod IP and send requests to the worker node.
    Checking the captures on both nodes, I arrived at the following conclusions:
    1. The master node can send requests to the worker node using the right pod IP.
    2. The worker node tries to send responses back to the master node, but they hang somewhere and never reach the tunnel interface on the master.
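
    A capture like the ones below can be produced with something along these lines (the exact invocation is an assumption, not copied from my session):

    sudo tshark -i any -f "tcp port 80"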

    Here are the outputs for a curl http://10.103.53.132 from the master node:

    Network packets on the master node

    ~/k8s/lab_02$ cat network.log
    1 0.000000000 192.168.181.129 → 192.168.15.3 TCP 60 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498264725 TSecr=0 WS=128
    2 1.004025910 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498265729 TSecr=0 WS=128
    3 3.020038791 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498267745 TSecr=0 WS=128
    4 7.180030621 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498271905 TSecr=0 WS=128
    5 15.372232344 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498280098 TSecr=0 WS=128
    6 31.500093081 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498296226 TSecr=0 WS=128
    7 64.524046610 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498329250 TSecr=0 WS=128

    Network packets on the worker node

    ~/k8s/lab_02$ cat network.log
    1 0.000000000 192.168.181.129 → 192.168.15.3 TCP 60 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498264725 TSecr=0 WS=128
    2 0.000155868 192.168.15.3 → 192.168.181.129 TCP 60 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110368944 TSecr=1498264725 WS=128
    3 1.004588087 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498265729 TSecr=0 WS=128
    4 1.004671123 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110369948 TSecr=1498264725 WS=128
    5 2.013643936 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110370957 TSecr=1498264725 WS=128
    6 3.020353932 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498267745 TSecr=0 WS=128
    7 3.020437281 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110371964 TSecr=1498264725 WS=128
    8 5.021649689 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110373965 TSecr=1498264725 WS=128
    9 7.178294371 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498271905 TSecr=0 WS=128
    10 7.178400688 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110376122 TSecr=1498264725 WS=128
    11 11.229621154 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110380173 TSecr=1498264725 WS=128
    12 15.372054389 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498280098 TSecr=0 WS=128
    13 15.372137631 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110384316 TSecr=1498264725 WS=128
    14 23.517611008 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110392461 TSecr=1498264725 WS=128
    15 31.500269555 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498296226 TSecr=0 WS=128
    16 31.500385133 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110400444 TSecr=1498264725 WS=128
    17 47.581694573 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110416525 TSecr=1498264725 WS=128
    18 64.525004821 192.168.181.129 → 192.168.15.3 TCP 60 [TCP Retransmission] 37469 → 80 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1498329250 TSecr=0 WS=128
    19 64.525090416 192.168.15.3 → 192.168.181.129 TCP 60 [TCP Retransmission] 80 → 37469 [SYN, ACK] Seq=0 Ack=1 Win=65236 Len=0 MSS=1400 SACK_PERM=1 TSval=3110433469 TSecr=1498264725 WS=128

  • Hi,

    The scope of this course is to teach you Kubernetes, with hands-on lab exercises designed around the topics discussed in the lecture sections. The installation and bootstrapping of the Kubernetes cluster have been simplified to eliminate instance networking issues which would otherwise impact the cluster's behavior - specifically, accessing Service ClusterIPs and Pod IPs from different nodes of the cluster. This allows you to focus on Kubernetes topics without spending too much time on overall infrastructure configuration.

    The following infrastructure configuration allows a Kubernetes cluster to run efficiently on GCP:

    • VPC - a custom VPC network created in your GCP account.
    • Firewall - a custom firewall rule for the custom VPC network created in the previous step, allowing all ingress traffic from all sources, all protocols, to all ports (a sketch of the equivalent gcloud commands follows this list).
    • GCE instances - create at least 2 GCE instances inside the custom VPC network to be able to follow along with the lab exercises. Instances sized with 2 vCPUs and 7.5 GB of memory work just fine; the recommended OS image is Ubuntu 18.04 LTS. Make sure to pick your custom VPC network when creating them.
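
    For illustration, a rough sketch of the equivalent gcloud commands (the network and rule names are placeholders; double-check the flags against the gcloud reference):

    gcloud compute networks create lfs258-vpc --subnet-mode=auto
    gcloud compute firewall-rules create lfs258-allow-all \
        --network=lfs258-vpc --allow=all --source-ranges=0.0.0.0/0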

    These simple steps should allow you to follow along with the lab exercises as they are described in the lab manual, with outputs consistent with the ones presented.

    Regards,
    -Chris

  • amouskite Posts: 4
    edited December 2019

    Hi,

    The time I spend on the course is at my own discretion. I do not want to blindly follow the instructions without understanding the hows. Besides, nothing in the course states that all instances should be in GCP; I could also run everything locally!
    Anyway, thanks for taking the time to reply :smile:
