New training version with calico does not work [LAB 3.3]
In Lab 3.3, at step 20, curl gets no response (after some time it times out). It worked with flannel. With Calico, connections to both the Cluster IP and the Endpoint IP seem to be blocked. In addition, ping from the master node to the worker node works, but pinging the Cluster IP or the Endpoint IP from the master node times out (with flannel it worked). All previous steps were done according to the lab instructions.
Any idea what could be wrong?
Comments
Hello,
Thank you for the feedback. Would it be possible to paste the last few commands and their output? You mentioned that it hangs, but perhaps something in that and the previous output will help debug the problem.
Regards,
Hello again,
In case you didn't see it, step 20 does have this entry:
Regards,
Thanks for the reply. I ran curl on the same node where the nginx pod is running, and there it works. I thought it should also work from other nodes (it worked that way with flannel), especially when using the ClusterIP. I suppose this is the correct behavior. Is it related to Calico network policy? I am asking because I am curious why it differs from flannel.
From node1:
From node2:
One way to narrow down the problem is to run Wireshark or tcpdump on the interfaces in use. If the curl request leaves the master node but does not show up on the worker node, you know that something in the middle is blocking the traffic. If the traffic is arriving on the worker node, this would indicate an issue with the service or perhaps kube-proxy on the target node.
As Fabio mentioned, if you are using GCE you need to ensure that no traffic is being blocked. Other tools like VirtualBox and AWS have slightly different ways of ensuring that all traffic is allowed.
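As a lightweight complement to a packet capture, the outcome of a plain TCP connection attempt can hint at where the traffic stops. The sketch below is only an illustration: the function name `probe` and the address in the usage comment are hypothetical, and the interpretation is a rule of thumb rather than a diagnosis. A timeout usually suggests packets are being dropped somewhere on the path (for example by a firewall), while a refusal suggests the packets arrived but nothing was listening.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused' or 'timeout'."""
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target answered with a reset: reachable, but no listener.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: packets likely dropped in transit.
        return "timeout"
    except OSError as exc:
        # Anything else (no route, name resolution failure, ...).
        return f"error: {exc}"


# Example (hypothetical ClusterIP and port, substitute your own):
# print(probe("10.100.1.5", 80))
```

Running it from each node against the Endpoint IP and the ClusterIP gives a quick picture of which paths work before reaching for tcpdump.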
Indeed, the problem was the GCE firewall. I saw that the traffic leaves node1 but does not arrive at node2. Ping and the HTTP ports were open, so some scenarios worked while others did not. For example, the complete demo from Lab 4.3 worked only partially: there were no images on the page until all traffic was allowed in the GCE firewall settings.
Thanks for helping with this issue.
Hi,
I had the same problem (yes, I think it IS a problem) and I solved it by allowing ALL traffic between the cluster nodes. If you are in GCE like me:
Let me know if you need more details on it.
Fabio
Thanks, I solved it by allowing all traffic too. I would like to know why this is necessary. Before that, I had allowed all TCP traffic, but it did not work!
Jorge.
Hi @jitapichab,
Kubernetes is an API driven architecture, and all internal communication takes place via API calls. This implies that each agent will expose a port where it expects requests from other agents.
A firewall rule to allow only TCP traffic will do just that - allow only the TCP traffic. Agents may use other protocols, so you would need to allow all protocols through, not only TCP.
This can be achieved in multiple ways. Tagging instances and setting up firewall rules based on tags is one way. Creating an isolated VPC network, dedicated to your Kubernetes nodes, with a firewall rule to allow all traffic (all protocols, to all ports, from all sources) is another way.
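To illustrate why a TCP-only rule is not enough, here is a minimal sketch of how an allow-list firewall matches traffic. It is a toy model, not GCE's actual rule engine: `AllowRule` and `permits` are hypothetical names, and it assumes (this thread does not confirm it) that the lab's Calico setup encapsulates pod traffic between nodes with IP-in-IP, which is IP protocol 4 and therefore neither TCP, UDP nor ICMP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    protocol: str                   # "tcp", "udp", "icmp", "ipip" or "all"
    ports: frozenset = frozenset()  # empty set means "any port"


def permits(rules, protocol, port=None):
    """Return True if any allow rule matches the given protocol and port."""
    for rule in rules:
        if rule.protocol == "all":
            return True
        if rule.protocol != protocol:
            continue
        if not rule.ports or port in rule.ports:
            return True
    return False  # default deny: nothing matched


# Rule sets resembling the ones described in this thread (assumed shapes).
ping_and_http = [AllowRule("icmp"), AllowRule("tcp", frozenset({80}))]
tcp_only = [AllowRule("tcp")]
allow_all = [AllowRule("all")]

print(permits(ping_and_http, "icmp"))   # node-to-node ping passes
print(permits(tcp_only, "tcp", 6443))   # any TCP port passes
print(permits(tcp_only, "ipip"))        # encapsulated pod traffic is dropped
print(permits(allow_all, "ipip"))       # passes once all protocols are allowed
```

Under that assumption, a curl between pods on different nodes travels inside IP-in-IP packets, so opening every TCP port still leaves it blocked, while node-to-node ping and plain HTTP to a node keep working.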
Regards.
-Chris