New training version with calico does not work [LAB 3.3]

In Lab 3.3, at step 20, the curl command gets no response (after some time it times out). It worked with flannel. With calico, connections to both the Cluster IP and the Endpoint IP seem to be blocked. In addition, ping from the master node to the worker node works, but pinging the Cluster IP or Endpoint IP from the master node times out (with flannel it worked). All previous steps were done according to the lab instructions.

Any idea what could be wrong?


    Thank you for the feedback. Would it be possible to paste the last few commands and their output? You mentioned that it hangs, but perhaps something in that and the previous output will help debug the problem.


    In case you didn't see it, step 20 does have this entry: 

    ...If the curl command times out the pod may be running on the other node. Run the same command on that node and it should work.



    Thanks for the reply. I checked curl on the same node where the nginx pod is running, and there it works. I thought it should also work from other pods (it worked that way with flannel), especially when using the ClusterIP. I suppose this is the correct behavior. Is it related to calico network policy? I am asking because I am curious why it differs from flannel.

    From node1:

    [email protected]:~$ kubectl get svc,ep
    NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    service/kubernetes ClusterIP <none> 443/TCP 8h
    service/nginx ClusterIP <none> 80/TCP 8h
    NAME ENDPOINTS AGE
    endpoints/kubernetes 8h
    endpoints/nginx 8h
    [email protected]:~$ curl
    curl: (7) Failed to connect to port 80: Connection timed out

    From node2:

    [email protected]:~$ curl
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
    body { width: 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and working. Further configuration is required.</p>
    <p>For online documentation and support please refer to <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at <a href="http://nginx.com/">nginx.com</a>.</p>
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>


    One way to narrow down the problem is to run Wireshark or tcpdump on the interfaces in use. If the curl request leaves the master node but does not show up on the worker node, you know that something in between is blocking the traffic. If the traffic does arrive on the worker node, that points to an issue with the service or perhaps with kube-proxy on the target node.
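A rough sketch of that capture, in case it helps anyone following along. The interface name and the peer IP below are placeholders I made up, not values from this thread, and the `capture_cmd` helper is purely illustrative. It only prints the tcpdump command so it can be reviewed before running it on each node. Note that if the network plugin encapsulates pod traffic between nodes, the packets may not show up as plain TCP on port 80, which is why this filters on the peer host rather than on the port.

```shell
#!/bin/sh
# Sketch only: interface name and peer IP are placeholders, not values from this thread.

# Print the capture command for traffic exchanged with one peer node.
# SSH (port 22) is excluded so the terminal session does not flood the output.
capture_cmd() {
    iface="$1"   # the node's primary interface (assumed name)
    peer="$2"    # internal IP of the other node (placeholder)
    echo "sudo tcpdump -ni $iface host $peer and not port 22"
}

# Review the printed command, then run it on both nodes while repeating the curl.
capture_cmd ens4 10.128.0.3
```

If packets show up in the capture on the sending node but never in the capture on the other node, something between the nodes is dropping them.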

    As Fabio mentioned, if you are using GCE you need to ensure that no traffic is being blocked. Other tools like VirtualBox and AWS have slightly different ways of ensuring that all traffic is allowed.

    Indeed, the problem was the GCE firewall. I saw that the traffic leaves node1 but does not arrive at node2. Ping and the HTTP ports were in fact open, so some scenarios worked while others did not; e.g. the complete demo example from lab 4.3 worked only partially, with no images on the page, until the traffic was allowed in the GCE firewall settings.

    Thanks for helping with this issue.

    I had the same problem (yes, I think it IS a problem) and I solved it by allowing ALL the traffic between the cluster nodes. If you are on GCE like me:

    1. Tag all the hosts of the cluster with the same network tag, say "k8s".
    2. Add a firewall rule to allow all the traffic from hosts tagged "k8s" to hosts tagged "k8s".
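For reference, the two steps above might look roughly like this with the gcloud CLI. This is a sketch under assumptions: the instance names, zone, network, and rule name are hypothetical, and the exact flag spellings (in particular `--allow all`) should be checked against the gcloud reference for your SDK version. The helpers only print the commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch only: instance names, zone, network, and rule name are hypothetical.
TAG="k8s"

# Step 1: print the command that adds the shared network tag to one instance.
tag_cmd() {
    echo "gcloud compute instances add-tags $1 --zone $2 --tags $TAG"
}

# Step 2: print the command that allows all traffic between hosts carrying the tag.
rule_cmd() {
    echo "gcloud compute firewall-rules create $1 --network $2 --allow all --source-tags $TAG --target-tags $TAG"
}

tag_cmd node1 us-central1-a
tag_cmd node2 us-central1-a
rule_cmd k8s-internal default
```

Because both the source and the target are limited to the same tag, the rule opens traffic only between the cluster nodes rather than to the whole network.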

    Let me know if you need more details on it.


