Lab 10.1

promagnoli · June 2020

When I run curl -H "Host: www.example.com" http://k8smaster/ I get a timeout. I can curl the nginx welcome page using the NodePort ClusterIP. I tried the master IP instead of the alias, but with no success.

When I create the ingress rule, I notice that the response differs from the one reported in the lab doc.

My Lab:
$ kubectl create -f ingress.rule.yaml
ingress.networking.k8s.io/ingress-test created

Lab Docs:
$ kubectl create -f ingress.rule.yaml
ingress.extensions "ingress-test" created

Here is a description of the ingress that is created:
$ kubectl describe ingresses ingress-test
Name: ingress-test
Namespace: default
Address:
Default backend: default-http-backend:80 ()
Rules:
Host Path Backends
---- ---- --------
www.example.com / secondapp:80 (192.168.131.30:80)

Annotations:
kubernetes.io/ingress.class: traefik
Events:

Since we create a NodePort service for secondapp, I would expect to see the NodePort ClusterIP in the backend, but it actually shows the deployment endpoint.

Any idea?
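
A note on the last question: kubectl describe ingress resolves each backend Service to the endpoints currently behind it, so a pod IP and port next to secondapp:80 is the expected display rather than a sign of misconfiguration. The Service type (NodePort or ClusterIP) does not change that column. A minimal sketch for listing the same addresses directly; the helper name svc_backends is made up for illustration:

```shell
# Hypothetical helper: print the pod IPs currently backing a Service.
# Reads the Endpoints object that shares the Service's name.
svc_backends() {
  kubectl get endpoints "$1" -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Usage (against a live cluster):
#   svc_backends secondapp
```

If this prints nothing, the Service selector is probably not matching any ready pod.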

chrispokorni · June 2020

Hi @promagnoli,

A similar issue was reported earlier in the forum. Take a look at the earlier post, as your issue may be related.

https://forum.linuxfoundation.org/discussion/comment/24095#Comment_24095

Regards,
-Chris

promagnoli · July 2020

Hi @chrispokorni,

It took me some time to follow up on your feedback, as I had no time to work on this.

I reviewed the post you mentioned; however, the issue in that post is related to the traefik version, which is not my case. Still, I used the thread as a guide for troubleshooting. This led me to find a warning message in the traefik pod logs: time="2020-07-17T10:30:54Z" level=warning msg="Endpoints not available for default/secondapp".

My current situation is as follows:
1. nginx pod is running
2. nginx pod is running on the worker node and I can get nginx welcome page from worker node localhost
3. I cannot get nginx page from endpoint issuing the command from master (see below), but I can get it from worker node
4. I cannot get nginx page from Cluster IP issuing the command from both master and worker node
5. traefik pods are running
6. nginx pod is correctly selected by the service secondapp
7. the service secondapp has an endpoint
8. the ingress is using the service as backend
9. the cluster role is in place
10. the role binding is in place

I don't know how to troubleshoot that warning message any further. Any idea?

In the following post I am going to share details of my env and the 10 checks mentioned above.

Thanks
Paolo

promagnoli · July 2020

Here is my env summary:

$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 443/TCP 119d
secondapp NodePort 10.109.105.48 80:30382/TCP 17d

$ kubectl get ingress ingress-test
NAME HOSTS ADDRESS PORTS AGE
ingress-test www.example.com 80 17d

$ kubectl get ep
NAME ENDPOINTS AGE
kubernetes 10.0.0.141:6443,10.0.0.18:6443 119d
secondapp 192.168.131.27:80 17d

1. nginx pod is running

$ kubectl get po
NAME READY STATUS RESTARTS AGE
secondapp-5cf87c9f48-f9krw 1/1 Running 1 17d

2. nginx pod is running on the worker node and I can get nginx welcome page from worker node localhost

$ curl -H "Host: www.example.com" http://localhost/
<!DOCTYPE html>

Welcome to nginx!

body { width: 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; }

Welcome to nginx!

If you see this page, the nginx web server is successfully installed and working. Further configuration is required.

For online documentation and support please refer to nginx.org.
Commercial support is available at nginx.com.

Thank you for using nginx.

3. I cannot get nginx page from endpoint issuing the command from master (see below), but I can get it from worker node

$ curl -H "Host: www.example.com" http://192.168.131.27/ -v

Trying 192.168.131.27...
TCP_NODELAY set
connect to 192.168.131.27 port 80 failed: Connection timed out
Failed to connect to 192.168.131.27 port 80: Connection timed out
Closing connection 0
curl: (7) Failed to connect to 192.168.131.27 port 80: Connection timed out

4. I cannot get nginx page from Cluster IP issuing the command from both master and worker node

$ curl -H "Host: www.example.com" http://10.109.105.48:30382/ -v

Trying 10.109.105.48...
TCP_NODELAY set
connect to 10.109.105.48 port 30382 failed: Connection timed out
Failed to connect to 10.109.105.48 port 30382: Connection timed out
Closing connection 0
curl: (7) Failed to connect to 10.109.105.48 port 30382: Connection timed out
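
One thing worth noting about check 4: the command pairs the ClusterIP (10.109.105.48) with the NodePort number (30382). A ClusterIP answers on the Service port (80 here), while the NodePort is opened on the node addresses, so this particular curl would not be expected to succeed even on a healthy cluster. A small sketch that prints the two URL forms for a NodePort Service; the helper name service_urls is hypothetical, and the node IP is the one from the pod description in this thread:

```shell
# Hypothetical helper: print the two addresses a NodePort Service answers on.
# Arguments: cluster_ip service_port node_ip node_port
service_urls() {
  cluster_ip="$1"; port="$2"; node_ip="$3"; node_port="$4"
  echo "http://${cluster_ip}:${port}/"    # ClusterIP + Service port
  echo "http://${node_ip}:${node_port}/"  # node address + NodePort
}

# Usage:
#   service_urls 10.109.105.48 80 10.0.0.18 30382
```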

5. traefik pods are running
$ kubectl get po -n kube-system | grep traefik
traefik-ingress-controller-6tffr 1/1 Running 1 17d
traefik-ingress-controller-8xrg5 1/1 Running 1 17d

I can see a warning message in the traefik pod logs (the same warnings appear on both pods, so I am showing the log for just one pod)

$ kubectl -n kube-system logs traefik-ingress-controller-8xrg5
time="2020-07-17T10:30:22Z" level=info msg="Traefik version v1.7.13 built on 2019-08-08_04:46:14PM"
time="2020-07-17T10:30:22Z" level=info msg="\nStats collection is disabled.\nHelp us improve Traefik by turning this feature on \nMore details on: https://docs.traefik.io/basics/#collected-data\n"
time="2020-07-17T10:30:22Z" level=info msg="Preparing server traefik &{Address::8080 TLS: Redirect: Auth: WhitelistSourceRange:[] WhiteList: Compress:false ProxyProtocol: ForwardedHeaders:0xc00031d120} with readTimeout=0s writeTimeout=0s idleTimeout=3m0s"
time="2020-07-17T10:30:22Z" level=info msg="Preparing server http &{Address::80 TLS: Redirect: Auth: WhitelistSourceRange:[] WhiteList: Compress:false ProxyProtocol: ForwardedHeaders:0xc00031d100} with readTimeout=0s writeTimeout=0s idleTimeout=3m0s"
time="2020-07-17T10:30:22Z" level=info msg="Starting provider configuration.ProviderAggregator {}"
time="2020-07-17T10:30:22Z" level=info msg="Starting server on :8080"
time="2020-07-17T10:30:22Z" level=info msg="Starting server on :80"
time="2020-07-17T10:30:22Z" level=info msg="Starting provider *kubernetes.Provider {\"Watch\":true,\"Filename\":\"\",\"Constraints\":[],\"Trace\":false,\"TemplateVersion\":0,\"DebugLogGeneratedTemplate\":false,\"Endpoint\":\"\",\"Token\":\"\",\"CertAuthFilePath\":\"\",\"DisablePassHostHeaders\":false,\"EnablePassTLSCert\":false,\"Namespaces\":null,\"LabelSelector\":\"\",\"IngressClass\":\"\",\"IngressEndpoint\":null}"
time="2020-07-17T10:30:22Z" level=info msg="ingress label selector is: \"\""
time="2020-07-17T10:30:22Z" level=info msg="Creating in-cluster Provider client"
E0717 10:30:52.998496 1 reflector.go:205] github.com/containous/traefik/vendor/k8s.io/client-go/informers/factory.go:86: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0717 10:30:52.998606 1 reflector.go:205] github.com/containous/traefik/vendor/k8s.io/client-go/informers/factory.go:86: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0717 10:30:53.026236 1 reflector.go:205] github.com/containous/traefik/vendor/k8s.io/client-go/informers/factory.go:86: Failed to list *v1beta1.Ingress: Get https://10.96.0.1:443/apis/extensions/v1beta1/ingresses?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
time="2020-07-17T10:30:54Z" level=warning msg="Endpoints not available for default/secondapp"
time="2020-07-17T10:30:54Z" level=info msg="Server configuration reloaded on :8080"
time="2020-07-17T10:30:54Z" level=info msg="Server configuration reloaded on :80"
time="2020-07-17T10:30:54Z" level=warning msg="Endpoints not available for default/secondapp"
time="2020-07-17T10:30:54Z" level=warning msg="Endpoints not available for default/secondapp"
time="2020-07-17T10:30:54Z" level=warning msg="Endpoints not available for default/secondapp"
time="2020-07-17T10:30:56Z" level=warning msg="Endpoints not available for default/secondapp"
time="2020-07-17T10:30:56Z" level=warning msg="Endpoints not available for default/secondapp"
time="2020-07-17T10:30:58Z" level=warning msg="Endpoints not available for default/secondapp"
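
The three reflector errors in this log may be the more telling lines: the Traefik pod timed out dialing the API server at its ClusterIP (10.96.0.1:443), so it could not list Services, Endpoints or Ingresses. That would plausibly explain the "Endpoints not available" warning even though kubectl get ep shows an endpoint, and it points at pod-to-service networking rather than at the ingress rule. A minimal sketch for counting such timeouts in a log stream; count_api_timeouts is a hypothetical helper name:

```shell
# Hypothetical helper: count log lines on stdin that report a TCP dial timeout.
# grep -c prints the number of matching lines (0 if none).
count_api_timeouts() {
  grep -c 'dial tcp .*: i/o timeout'
}

# Usage (against a live cluster):
#   kubectl -n kube-system logs traefik-ingress-controller-8xrg5 | count_api_timeouts
```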

6. nginx pod is correctly selected by the service secondapp

$ kubectl describe po secondapp-5cf87c9f48-f9krw
Name: secondapp-5cf87c9f48-f9krw
Namespace: default
Priority: 0
Node: ip-10-0-0-18/10.0.0.18
Start Time: Mon, 29 Jun 2020 13:55:18 +0000
Labels: app=secondapp <<<---
pod-template-hash=5cf87c9f48
Annotations: cni.projectcalico.org/podIP: 192.168.131.27/32
cni.projectcalico.org/podIPs: 192.168.131.27/32
Status: Running
IP: 192.168.131.27
IPs:
IP: 192.168.131.27
Controlled By: ReplicaSet/secondapp-5cf87c9f48
Containers:
nginx:
Container ID: docker://d7ccf81a9c7c2a906393a6e21a3c9f2f1f74a57b358a7965cf31c8ed66c90ef5
Image: nginx
Image ID: docker-pullable://nginx@sha256:a93c8a0b0974c967aebe868a186e5c205f4d3bcb5423a56559f2f9599074bbcd
Port: <none>
Host Port:
State: Running
Started: Fri, 17 Jul 2020 10:31:07 +0000
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 29 Jun 2020 13:55:19 +0000
Finished: Mon, 29 Jun 2020 16:30:47 +0000
Ready: True
Restart Count: 1
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-s2vxn (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-s2vxn:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-s2vxn
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:

promagnoli · July 2020

Hi @serewicz,

I am on AWS EC2 nodes, not KVM. The K8s node IPs are 10.0.0.141 and 10.0.0.18, so they do not overlap with the pod IP (192.168.131.35).

I definitely agree with you that I am facing a network issue; however, the firewall is down on both K8s nodes and I am using a very broad security group.

No idea.

Regards

promagnoli · July 2020

Let me add that I can curl from the master node to the worker node IP, so I have the feeling that the issue could be related to calico, as the trouble starts when using the pod network. In the calico-node pod on the first node I can see the following warning messages:

229-2020-07-17 10:30:26.477 [INFO][55] feature_detect.go 234: Looked up iptables command backendMode="legacy" candidates=[]string{"iptables-legacy-save", "iptables-save"} command="iptables-legacy-save" ipVersion=0x4 saveOrRestore="save"
230-2020-07-17 10:30:26.477 [INFO][55] int_dataplane.go 390: Checking if we need to clean up the VXLAN device

231:2020-07-17 10:30:26.477 [WARNING][55] int_dataplane.go 392: Failed to query VXLAN device error=Link not found

330-2020-07-17 10:30:26.638 [INFO][55] hostip_mgr.go 84: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"tunl0", Addrs:set.mapSet{}}
331-2020-07-17 10:30:26.638 [INFO][55] ipsets.go 119: Queueing IP set for creation family="inet" setID="this-host" setType="hash:ip"
332:2020-07-17 10:30:26.639 [WARNING][55] ipip_mgr.go 112: Failed to add IPIP tunnel device error=exit status 1
333:2020-07-17 10:30:26.639 [WARNING][55] ipip_mgr.go 89: Failed configure IPIP tunnel device, retrying... error=exit status 1

In the other pod-node I can see the following errors:

203-2020-07-17 10:30:25.421 [INFO][45] client.go 352: Calico Syncer has indicated it is in sync
204-2020-07-17 10:30:25.433 [INFO][45] resource.go 220: Target config /etc/calico/confd/config/bird6.cfg out of sync
205:2020-07-17 10:30:25.436 [ERROR][45] resource.go 288: Error from checkcmd "bird6 -p -c /etc/calico/confd/config/.bird6.cfg072659676": "bird: /etc/calico/confd/config/.bird6.cfg072659676:2:1 Unable to open included file /etc/calico/confd/config/bird6_aggr.cfg: No such file or directory\n

and the following warning messages:

225-2020-07-17 10:30:25.470 [INFO][46] feature_detect.go 234: Looked up iptables command backendMode="legacy" candidates=[]string{"iptables-legacy-save", "iptables-save"} command="iptables-legacy-save" ipVersion=0x4 saveOrRestore="save"
226-2020-07-17 10:30:25.470 [INFO][46] int_dataplane.go 390: Checking if we need to clean up the VXLAN device

227:2020-07-17 10:30:25.470 [WARNING][46] int_dataplane.go 392: Failed to query VXLAN device error=Link not found

313-2020-07-17 10:30:25.618 [INFO][46] async_calc_graph.go 135: AsyncCalcGraph running
314-2020-07-17 10:30:25.631 [INFO][46] daemon.go 631: No driver process to monitor
315:2020-07-17 10:30:25.634 [WARNING][46] ipip_mgr.go 112: Failed to add IPIP tunnel device error=exit status 1
316:2020-07-17 10:30:25.634 [WARNING][46] ipip_mgr.go 89: Failed configure IPIP tunnel device, retrying... error=exit status 1

Could this be connected to the fact that my calico.yaml has the IPv4 pool section commented out?

# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within --cluster-cidr.
# - name: CALICO_IPV4POOL_CIDR
# value: "192.168.0.0/16"
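
Regarding the pool question, one quick sanity check is whether the pod and node addresses fall inside the pool range. The sketch below assumes the pool is 192.168.0.0/16, the value shown in the commented-out section (the pod IPs in this thread are consistent with it); the helper names ip_to_int and ip_in_cidr are made up for illustration:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if the IP in $1 lies inside the CIDR in $2 (e.g. 192.168.0.0/16).
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Usage:
#   ip_in_cidr 192.168.131.27 192.168.0.0/16 && echo "pod IP is in the pool"
#   ip_in_cidr 10.0.0.141 192.168.0.0/16 || echo "node IP is outside the pool"
```

With the addresses posted here, the pod IP is inside the assumed pool and the node IPs are outside it, which is the non-overlapping layout one would want.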

promagnoli · July 2020

7. the service secondapp has an endpoint

$ kubectl describe svc secondapp
Name: secondapp
Namespace: default
Labels: app=secondapp <<<---
Annotations: <none>
Selector: app=secondapp <<<---
Type: NodePort
IP: 10.109.105.48
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: 30382/TCP
Endpoints: 192.168.131.27:80
Session Affinity: None
External Traffic Policy: Cluster
Events:

8. the ingress is using the service secondapp as backend

$ kubectl describe ingress ingress-test
Name: ingress-test
Namespace: default
Address:
Default backend: default-http-backend:80 ()
Rules:
Host Path Backends
---- ---- --------
www.example.com
/ secondapp:80 (192.168.131.27:80)
Annotations:
kubernetes.io/ingress.class: traefik
Events:

9. the cluster role is in place

$ kubectl describe clusterroles.rbac.authorization.k8s.io traefik-ingress-controller
Name: traefik-ingress-controller
Labels:
Annotations:
PolicyRule:
Resources Non-Resource URLs Resource Names Verbs
--------- ----------------- -------------- -----
endpoints [] [] [get list watch]
secrets [] [] [get list watch]
services [] [] [get list watch]
ingresses.extensions [] [] [get list watch]

10. the role binding is in place
$ kubectl describe clusterrolebindings.rbac.authorization.k8s.io traefik-ingress-controller
Name: traefik-ingress-controller
Labels:
Annotations:
Role:
Kind: ClusterRole
Name: traefik-ingress-controller
Subjects:
Kind Name Namespace
---- ---- ---------
ServiceAccount traefik-ingress-controller kube-system

chrispokorni · July 2020

Hi @promagnoli,

The issue you are seeing is related to node networking, which has to be fixed on your AWS EC2 infrastructure. Kubernetes is sensitive to misconfigured node networking, but it has no control over that configuration. AWS VPCs may have default rules to block some traffic. The SG rule suggested by @serewicz, "All traffic All All 0.0.0.0/0" should fix your issue. SG rules that allow only specific traffic may not be enough for all the protocols that are used by Kubernetes and all plugins it uses.

Regards,
-Chris

promagnoli · July 2020

@serewicz my posts were queued because I had to split my original large post into several, and it seems this platform doesn't allow multiple posts in a row.

By the way, changing the SG inbound rule to "All traffic All All 0.0.0.0/0" fixed the issue. Thanks to both @serewicz and @chrispokorni for your help and patience. It is really appreciated.

ganeshahv · August 2020

Hi @serewicz, I am facing exactly the same problem described in this discussion. I am using EC2 instances and have confirmed that the SG rules allow all traffic from 0.0.0.0/0, so I should not be seeing networking issues.

The details of the setup follow:

On the master node

ubuntu@ip-172-31-32-211:~$ kubectl describe ingress ingress-test
Name:             ingress-test
Namespace:        default
Address:
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host          Path  Backends
  ----          ----  --------
  www.gani.com
                /   secondapp:80 (192.168.89.15:80)
Annotations:    kubernetes.io/ingress.class: traefik
Events:         <none>
ubuntu@ip-172-31-32-211:~$ kubectl describe svc secondapp
Name:                     secondapp
Namespace:                default
Labels:                   app=secondapp
Annotations:              <none>
Selector:                 app=secondapp
Type:                     NodePort
IP:                       10.105.176.35
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  31595/TCP
Endpoints:                192.168.89.15:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
ubuntu@ip-172-31-32-211:~$ kubectl get ep
NAME         ENDPOINTS            AGE
kubernetes   172.31.32.211:6443   15d
nginx        <none>               9d
secondapp    192.168.89.15:80     132m

ubuntu@ip-172-31-32-211:~$ kubectl logs traefik-ingress-controller-xv5pl -n kube-system
time="2020-08-28T16:45:13Z" level=info msg="Traefik version v1.7.13 built on 2019-08-08_04:46:14PM"
time="2020-08-28T16:45:13Z" level=info msg="\nStats collection is disabled.\nHelp us improve Traefik by turning this feature on :)\nMore details on: https://docs.traefik.io/basics/#collected-data\n"
time="2020-08-28T16:45:13Z" level=info msg="Preparing server traefik &{Address::8080 TLS:<nil> Redirect:<nil> Auth:<nil> WhitelistSourceRange:[] WhiteList:<nil> Compress:false ProxyProtocol:<nil> ForwardedHeaders:0xc00089f000} with readTimeout=0s writeTimeout=0s idleTimeout=3m0s"
time="2020-08-28T16:45:13Z" level=info msg="Preparing server http &{Address::80 TLS:<nil> Redirect:<nil> Auth:<nil> WhitelistSourceRange:[] WhiteList:<nil> Compress:false ProxyProtocol:<nil> ForwardedHeaders:0xc00089efe0} with readTimeout=0s writeTimeout=0s idleTimeout=3m0s"
time="2020-08-28T16:45:13Z" level=info msg="Starting provider configuration.ProviderAggregator {}"
time="2020-08-28T16:45:13Z" level=info msg="Starting server on :8080"
time="2020-08-28T16:45:13Z" level=info msg="Starting server on :80"
time="2020-08-28T16:45:13Z" level=info msg="Starting provider *kubernetes.Provider {\"Watch\":true,\"Filename\":\"\",\"Constraints\":[],\"Trace\":false,\"TemplateVersion\":0,\"DebugLogGeneratedTemplate\":false,\"Endpoint\":\"\",\"Token\":\"\",\"CertAuthFilePath\":\"\",\"DisablePassHostHeaders\":false,\"EnablePassTLSCert\":false,\"Namespaces\":null,\"LabelSelector\":\"\",\"IngressClass\":\"\",\"IngressEndpoint\":null}"
time="2020-08-28T16:45:13Z" level=info msg="ingress label selector is: \"\""
time="2020-08-28T16:45:13Z" level=info msg="Creating in-cluster Provider client"
time="2020-08-28T16:45:13Z" level=info msg="Server configuration reloaded on :80"
time="2020-08-28T16:45:13Z" level=info msg="Server configuration reloaded on :8080"
time="2020-08-28T16:55:15Z" level=warning msg="A new release has been found: 2.2.8. Please consider updating."
time="2020-08-28T17:01:09Z" level=info msg="Server configuration reloaded on :80"
time="2020-08-28T17:01:09Z" level=info msg="Server configuration reloaded on :8080"
time="2020-08-28T17:25:08Z" level=info msg="Server configuration reloaded on :80"
time="2020-08-28T17:25:08Z" level=info msg="Server configuration reloaded on :8080"

It works when I use the ClusterIP

```
ubuntu@ip-172-31-32-211:~$ curl -H "Host: www.gani.com" http://10.105.176.35:80
<!DOCTYPE html>

Welcome to nginx!

body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}

Welcome to nginx!

If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.

For online documentation and support please refer to
nginx.org.

Commercial support is available at
nginx.com.

Thank you for using nginx.

ubuntu@ip-172-31-32-211:~$ curl -H "Host: www.gani.com" http://k8smaster
curl: (7) Failed to connect to k8smaster port 80: Connection refused
ubuntu@ip-172-31-32-211:~$ curl -H "Host: www.gani.com" http://172.31.32.211
curl: (7) Failed to connect to 172.31.32.211 port 80: Connection refused
ubuntu@ip-172-31-32-211:~$

ubuntu@ip-172-31-38-235:~$ curl -H "Host: www.gani.com" http://172.31.32.211/
curl: (7) Failed to connect to 172.31.32.211 port 80: Connection refused
ubuntu@ip-172-31-38-235:~$ curl -H "Host: www.gani.com" http://172.31.38.235/
<!DOCTYPE html>

Welcome to nginx!

body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}

Welcome to nginx!

If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.

For online documentation and support please refer to
nginx.org.

Commercial support is available at
nginx.com.

Thank you for using nginx.

ubuntu@ip-172-31-38-235:~$
```

Please help.

ganeshahv · August 2020

As a follow-up to the earlier post, I see that the curl gets a 200 OK when I use the worker node's public IP, not the master node's.

ubuntu@ip-172-31-32-211:~$ curl  -H "Host: www.giri.com" http://<master_node_public_IP> 
curl: (7) Failed to connect to 13.235.214.225 port 80: Connection refused
ubuntu@ip-172-31-32-211:~$ curl  -H "Host: www.giri.com" http://<worker_node_public_IP>
<!DOCTYPE html>
<html>
<head>
<title>Third Page</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
ubuntu@ip-172-31-32-211:~$

chrispokorni · August 2020

Hi @ganeshahv,

The Kubernetes cluster behavior you are describing is consistent with networking issues between your cluster nodes, which is managed at the cloud infrastructure level. Unfortunately, Kubernetes has no control over its infrastructure.

The following are normal/expected Kubernetes cluster behaviors:
1. The ingress should be accessible through both nodes' public and private IP addresses.
2. NodePort type Services should be accessible through both nodes' public and private IP addresses, regardless of the Pods' location in the cluster.
3. The ClusterIP type Service should be accessible from both nodes, regardless of the Pods' location in the cluster.
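
These expectations can be checked with a small loop run from each node in turn. A minimal sketch, assuming curl is installed; probe_ingress is a hypothetical helper, and the Host header and addresses in the usage line are the ones posted earlier in this thread:

```shell
# Hypothetical helper: send one request per address with the given Host header
# and report whether the connection and request succeeded.
# Arguments: host_header ip [ip ...]
probe_ingress() {
  host="$1"; shift
  for ip in "$@"; do
    if curl -s -o /dev/null --max-time 5 -H "Host: $host" "http://$ip/"; then
      echo "$ip ok"
    else
      echo "$ip FAILED"
    fi
  done
}

# Usage:
#   probe_ingress www.gani.com 172.31.32.211 172.31.38.235
```

An address that fails from every node suggests nothing is listening there, while one that fails only from the other node suggests traffic is being blocked between nodes.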

On AWS I would recommend starting with a new VPC with a new all-open/allow-all SG, then the two EC2 instances provisioned in that VPC.

Regards,
-Chris

ganeshahv · August 2020

Hi @chrispokorni,

Thank you for the response.

I think I have narrowed the issue down to this, but I am unable to proceed.

Master

ubuntu@ip-172-31-32-211:~$ curl http://127.0.0.1
curl: (7) Failed to connect to 127.0.0.1 port 80: Connection refused

Worker

ubuntu@ip-172-31-38-235:~$ curl http://127.0.0.1
404 page not found

ganeshahv · August 2020

Hi @chrispokorni, @serewicz, I was able to resolve the issue.

The master node was tainted and hence I could not launch the ingress controller pod on it.

ubuntu@ip-172-31-32-211:~$ kubectl describe nodes | grep -i taint
Taints:             node-role.kubernetes.io/master:NoSchedule

I resolved it by removing the taint, and things started working fine.

ubuntu@ip-172-31-32-211:~$ curl  -H "Host: www.shourya.com" http://k8smaster
<!DOCTYPE html>
<Output omitted>

Thanks to @dzhigalin for his inputs.
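
For anyone landing here with the same symptom, a sketch for spotting the taint. It assumes the Traefik controller from the lab runs as a DaemonSet whose pods do not tolerate the master taint, which would match the single controller pod in the logs above; noschedule_nodes is a hypothetical helper that only reads the first taint listed per node:

```shell
# Hypothetical helper: read `kubectl describe nodes` output on stdin and print
# each node whose first listed taint has the NoSchedule effect.
noschedule_nodes() {
  awk '$1 == "Name:"   { node = $2 }
       $1 == "Taints:" && $2 ~ /:NoSchedule$/ { print node, $2 }'
}

# Usage (against a live cluster):
#   kubectl describe nodes | noschedule_nodes
```

The poster did not show the exact command used to remove the taint; one common form is kubectl taint nodes --all node-role.kubernetes.io/master- (the trailing dash removes the taint).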
