Lab 10.1 Not Working as Described

Once I install traefik I can see the dashboard from other nodes and remote systems but not the master. I am also unable to get

curl -H "Host: thirdpage.org" http://k8smaster/

to work from anywhere: the master, a worker, or a machine completely outside the cluster.

Comments

  • My ingress pods are working. I have already run into the overlap issue and rebuilt the cluster using /24s instead of /16s so the previous posts don't seem to apply here. I have double checked that my firewall is not running. Still getting

    curl: (7) Failed to connect to k8smaster port 80: Connection refused

  • I started looking at the master node. When I run netstat -nat, neither port 80 nor 8080 is open on any of the interfaces, so whatever is supposed to proxy that curl is not running on my master.

  • Hi @recentcoin,

    Between steps 11-15 did you see any discrepancies?

    There are two levels of connectivity that need to be properly configured: ingress to service, and service to pod. I would start troubleshooting from the pod: is it running, and can you curl the pod directly on its IP address? Then continue with the service: do you have a service IP, and is the endpoint populated with the pod IP? Can you curl the service IP and receive a response from its endpoint? Finally, the ingress rule needs to match the service information: the serviceName and servicePort of the ingress must match the service's name and its port value.

    Regards,
    -Chris
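
The order of checks described above (pod, then service, then ingress) can be sketched in Python. This is only a sketch under stated assumptions, not part of the lab guide: `http_ok` and `first_failing_layer` are hypothetical helper names, and the URLs reuse addresses that appear elsewhere in this thread (pod endpoint 192.168.1.114, service IP 10.108.117.87, master 192.168.0.150).

```python
import urllib.request
from typing import Callable, Iterable, Optional, Tuple


def http_ok(url: str, host_header: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status for the given Host header."""
    req = urllib.request.Request(url, headers={"Host": host_header})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def first_failing_layer(
    layers: Iterable[Tuple[str, str]],
    host_header: str,
    probe: Callable[[str, str], bool] = http_ok,
) -> Optional[str]:
    """Probe each (name, url) in order; return the first layer that fails, else None."""
    for name, url in layers:
        if not probe(url, host_header):
            return name
    return None


# Addresses taken from output posted in this thread; adjust to your own cluster.
LAYERS = [
    ("pod", "http://192.168.1.114:80/"),
    ("service", "http://10.108.117.87:80/"),
    ("ingress", "http://192.168.0.150/"),
]
```

Calling `first_failing_layer(LAYERS, "www.example.com")` from a cluster node would name the first layer that does not answer, which is where the troubleshooting should focus. The `probe` parameter exists so the logic can be exercised without a live cluster.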

  • I am on step 10, where I should be able to curl, and it's not working, so I don't want to go past that.

    root@k8smaster:~# netstat -nat|grep 80
    tcp 0 0 192.168.0.150:2380 0.0.0.0:* LISTEN
    tcp 0 0 127.0.0.1:2379 127.0.0.1:43580 ESTABLISHED
    tcp 0 0 127.0.0.1:40806 127.0.0.1:9099 TIME_WAIT
    tcp 0 0 127.0.0.1:43280 127.0.0.1:2379 ESTABLISHED
    tcp 0 0 127.0.0.1:2379 127.0.0.1:43280 ESTABLISHED
    tcp 0 0 127.0.0.1:43380 127.0.0.1:2379 ESTABLISHED
    tcp 0 0 127.0.0.1:2379 127.0.0.1:43380 ESTABLISHED
    tcp 0 0 127.0.0.1:43580 127.0.0.1:2379 ESTABLISHED
    tcp6 0 0 192.168.0.150:6443 192.168.0.151:8180 ESTABLISHED
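
Note that `grep 80` matches any line containing "80", such as port 2380 or 43580, so the output above needs careful reading. A minimal Python sketch that keeps only sockets in the LISTEN state and compares exact port numbers is shown here; `listening_ports` is a hypothetical helper, and the sample lines are a subset of the output above.

```python
from typing import Iterable, Set


def listening_ports(netstat_lines: Iterable[str]) -> Set[int]:
    """Collect local TCP ports in LISTEN state from `netstat -nat` style output."""
    ports = set()
    for line in netstat_lines:
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local address, foreign address, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN":
            continue
        # rsplit on the last colon also handles IPv6 forms such as :::80
        ports.add(int(fields[3].rsplit(":", 1)[1]))
    return ports


SAMPLE = """\
tcp 0 0 192.168.0.150:2380 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:2379 127.0.0.1:43580 ESTABLISHED
tcp 0 0 127.0.0.1:40806 127.0.0.1:9099 TIME_WAIT
tcp6 0 0 192.168.0.150:6443 192.168.0.151:8180 ESTABLISHED
""".splitlines()

wanted = {80, 8080}
print(wanted & listening_ports(SAMPLE))  # empty set: neither port has a listener
```

On the sample lines the only listener is port 2380, which agrees with the observation that nothing on the master is bound to port 80 or 8080.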

  • root@k8smaster:~# more /etc/hosts
    127.0.0.1 localhost
    192.168.0.150 k8smaster
    192.168.0.151 ubuntu2
    192.168.0.152 ubuntu3

    # The following lines are desirable for IPv6 capable hosts

    ::1 localhost ip6-localhost ip6-loopback
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters

  • root@k8smaster:~# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
        link/ether 52:54:00:ea:4e:cc brd ff:ff:ff:ff:ff:ff
        inet 192.168.0.150/24 brd 192.168.0.255 scope global enp1s0
           valid_lft forever preferred_lft forever
        inet6 fec0::5054:ff:feea:4ecc/64 scope site dynamic mngtmpaddr noprefixroute
           valid_lft 2591343sec preferred_lft 604143sec
        inet6 fe80::5054:ff:feea:4ecc/64 scope link
           valid_lft forever preferred_lft forever
    3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
        link/ether 02:42:bd:4e:90:02 brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
           valid_lft forever preferred_lft forever
    6: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ipip 0.0.0.0 brd 0.0.0.0
        inet 192.168.1.128/32 brd 192.168.1.128 scope global tunl0
           valid_lft forever preferred_lft forever

  • If I curl the worker nodes, I get a 404 error, which means I am at least hitting the nginx web server.

  • recentcoin
    Posts: 21
    edited September 2020

    root@k8smaster:~# curl -H "Host: www.example.com" http://192.168.0.151
    404 page not found

    root@k8smaster:~# curl -H "Host: www.example.com" http://192.168.0.152
    404 page not found

    I removed and reapplied the ingress rules and now it's hitting the page, but I still get connection refused when I try to hit the master.

  • jimit@ubuntu2:~$ curl -H "Host: www.example.com" http://192.168.0.150
    curl: (7) Failed to connect to 192.168.0.150 port 80: Connection refused
    jimit@ubuntu2:~$ curl -H "Host: www.example.com" http://192.168.0.151
    <!DOCTYPE html>


    Welcome to nginx!

    body { width: 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; }


    Welcome to nginx!

    If you see this page, the nginx web server is successfully installed and working. Further configuration is required.

    For online documentation and support please refer to nginx.org.
    Commercial support is available at nginx.com.

    Thank you for using nginx.



    jimit@ubuntu2:~$ curl -H "Host: www.example.com" http://k8smaster
    curl: (7) Failed to connect to k8smaster port 80: Connection refused

  • From the first posting above it seemed that only step 16 did not work.

    In this case, where example.com does not work either, I would investigate the same two connectivity layers between the secondapp pod, the service, and the ingress rule for example.com. Specifically, check that the Pod's label matches the Service's selector, the Pod's containerPort matches the Service's targetPort, the Ingress serviceName matches the Service's name, and the Ingress servicePort matches the Service port.

    What is your Pod network? And what is your hosts/nodes network?

    Regards,
    -Chris
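
The four pairs of fields listed above can be compared mechanically. The following Python sketch is illustrative only: `find_mismatches` is a hypothetical helper, the Service and Ingress values are copied from output posted in this thread, and the Pod's label and containerPort are assumptions because the pod manifest was not posted.

```python
from typing import Dict, List


def find_mismatches(pod: Dict, service: Dict, ingress: Dict) -> List[str]:
    """Compare the four pairs of fields that must agree; return a list of problems."""
    problems = []
    # Every key/value in the Service selector must appear among the Pod labels.
    for key, value in service["selector"].items():
        if pod["labels"].get(key) != value:
            problems.append(f"selector {key}={value} not found in pod labels")
    if service["targetPort"] != pod["containerPort"]:
        problems.append("service targetPort != pod containerPort")
    if ingress["serviceName"] != service["name"]:
        problems.append("ingress serviceName != service name")
    if ingress["servicePort"] != service["port"]:
        problems.append("ingress servicePort != service port")
    return problems


# Assumed pod values (not shown in the thread).
POD = {"labels": {"app": "secondapp"}, "containerPort": 80}
# Values from `kubectl describe service` and ingress.rule.yaml in this thread.
SERVICE = {"name": "secondapp", "selector": {"app": "secondapp"},
           "port": 80, "targetPort": 80}
INGRESS = {"serviceName": "secondapp", "servicePort": 80}

print(find_mismatches(POD, SERVICE, INGRESS))
```

An empty list means the names, labels, and ports line up, in which case the fault lies elsewhere, for example in the ingress controller itself.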

  • chrispokorni
    Posts: 2,346
    edited September 2020

    Do you have k8smaster alias set in the /etc/hosts of ubuntu2 node as well?

    When curling the IP address instead of k8smaster alias, do you see a different response?

  • Pod network is 192.168.1.0/24
    Master and workers are on 192.168.0.0/24
    Master 192.168.0.150
    Ubuntu2 worker 192.168.0.151
    Ubuntu3 worker 192.168.0.152

    I am using the YAML files that came with the class. I can post them.

  • root@k8smaster:~# kubectl get pods
    NAME                        READY   STATUS    RESTARTS   AGE
    secondapp-959796d85-pht4p   1/1     Running   0          40m

    NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    kubernetes    ClusterIP   10.96.0.1       <none>        443/TCP          6d23h
    nginx         ClusterIP   10.108.88.13    <none>        80/TCP           4h50m
    secondapp     NodePort    10.108.117.87   <none>        80:30841/TCP     39m
    service-lab   NodePort    10.106.94.144   <none>        8080:32215/TCP   44h

  • root@k8smaster:~# more ingress.rule.yaml
    apiVersion: networking.k8s.io/v1beta1
    kind: Ingress
    metadata:
      name: ingress-test
      annotations:
        kubernetes.io/ingress.class: traefik
    spec:
      rules:
      - host: www.example.com
        http:
          paths:
          - backend:
              serviceName: secondapp
              servicePort: 80
            path: /

    root@k8smaster:~# kubectl get ingress
    Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
    NAME           CLASS    HOSTS             ADDRESS   PORTS   AGE
    ingress-test   <none>   www.example.com             80      16m

    root@k8smaster:~# kubectl get services
    NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    kubernetes    ClusterIP   10.96.0.1       <none>        443/TCP          6d23h
    nginx         ClusterIP   10.108.88.13    <none>        80/TCP           4h54m
    secondapp     NodePort    10.108.117.87   <none>        80:30841/TCP     43m
    service-lab   NodePort    10.106.94.144   <none>        8080:32215/TCP   44h

    root@k8smaster:~# more ingress.rbac.yaml
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: traefik-ingress-controller
    rules:
      - apiGroups:
          - ""
        resources:
          - services
          - endpoints
          - secrets
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - extensions
        resources:
          - ingresses
        verbs:
          - get
          - list
          - watch
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: traefik-ingress-controller
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: traefik-ingress-controller
    subjects:
      - kind: ServiceAccount
        name: traefik-ingress-controller
        namespace: kube-system

    root@k8smaster:~# more traefik-ds.yaml
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: traefik-ingress-controller
      namespace: kube-system
    ---
    kind: DaemonSet
    apiVersion: apps/v1
    metadata:
      name: traefik-ingress-controller
      namespace: kube-system
      labels:
        k8s-app: traefik-ingress-lb
    spec:
      selector:
        matchLabels:
          name: traefik-ingress-lb
      template:
        metadata:
          labels:
            k8s-app: traefik-ingress-lb
            name: traefik-ingress-lb
        spec:
          serviceAccountName: traefik-ingress-controller
          terminationGracePeriodSeconds: 60
          hostNetwork: True
          containers:
          - image: traefik:1.7.13
            name: traefik-ingress-lb
            ports:
            - name: http
              containerPort: 80
              hostPort: 80
            - name: admin
              containerPort: 8080
              hostPort: 8080
            args:
            - --api
            - --kubernetes
            - --logLevel=INFO
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: traefik-ingress-service
      namespace: kube-system
    spec:
      selector:
        k8s-app: traefik-ingress-lb
      ports:
        - protocol: TCP
          port: 80
          name: web
        - protocol: TCP
          port: 8080
          name: admin

  • root@k8smaster:~# kubectl describe service
    Name:              kubernetes
    Namespace:         default
    Labels:            component=apiserver
                       provider=kubernetes
    Annotations:       <none>
    Selector:          <none>
    Type:              ClusterIP
    IP:                10.96.0.1
    Port:              https  443/TCP
    TargetPort:        6443/TCP
    Endpoints:         192.168.0.150:6443
    Session Affinity:  None
    Events:            <none>

    Name:              nginx
    Namespace:         default
    Labels:            app=nginx
    Annotations:       <none>
    Selector:          app=nginx
    Type:              ClusterIP
    IP:                10.108.88.13
    Port:              <unset>  80/TCP
    TargetPort:        80/TCP
    Endpoints:         <none>
    Session Affinity:  None
    Events:            <none>

    Name:                     secondapp
    Namespace:                default
    Labels:                   app=secondapp
    Annotations:              <none>
    Selector:                 app=secondapp
    Type:                     NodePort
    IP:                       10.108.117.87
    Port:                     <unset>  80/TCP
    TargetPort:               80/TCP
    NodePort:                 <unset>  30841/TCP
    Endpoints:                192.168.1.114:80
    Session Affinity:         None
    External Traffic Policy:  Cluster
    Events:                   <none>

    Name:                     service-lab
    Namespace:                default
    Labels:                   system=secondary
    Annotations:              <none>
    Selector:                 system=secondary
    Type:                     NodePort
    IP:                       10.106.94.144
    Port:                     <unset>  8080/TCP
    TargetPort:               8080/TCP
    NodePort:                 <unset>  32215/TCP
    Endpoints:                <none>
    Session Affinity:         None
    External Traffic Policy:  Cluster
    Events:                   <none>

  • root@k8smaster:~# curl -H "Host: www.example.com" http://192.168.0.150:30841
    <!DOCTYPE html>


    Welcome to nginx!

    body { width: 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; }


    Welcome to nginx!

    If you see this page, the nginx web server is successfully installed and working. Further configuration is required.

    For online documentation and support please refer to nginx.org.
    Commercial support is available at nginx.com.

    Thank you for using nginx.


  • Oddly, that works when I curl the master on that port, but neither 80 nor 8080 is open on the master, so the curl to k8smaster fails.

  • You are running kubectl as root. That is not how the lab guide is set up. kubectl should be run by a regular (non-root) user.

    The k8smaster was intended to be an alias, not necessarily the hostname of your master node. The alias was a preparation for the HA lab exercises in Ch 16.

  • I tried it not as root too and that doesn't work either. I am about ready to just ask for a refund. This has been nothing but a giant headache.

  • coop
    Posts: 916

    @recentcoin: You just posted 11 messages in 20 minutes. I appreciate how Chris has been able to help you in real time, but this is not intended to be a messaging venue. There are those of us who monitor all traffic on the forums and are being somewhat overloaded. We usually recommend people save up their comments and try to proceed further on their own before jumping into the forum.

    While moderators are often able to respond immediately, and we all try, sometimes we simply cannot because of other things we are doing. So the normal response time is often hours, not minutes, and we don't want participants to think they will always get such prompt answers and help.

    I say this without trying to tell you not to ask for help. We are quite happy to help; it is the moderators' job (as well as other students'). But please lighten up on the "post" button if that is at all possible.

    Thanks

  • serewicz
    Posts: 1,000

    Hello,

    It is difficult to tell what you are trying to do when you post so much information without context. Let us go slowly and try to troubleshoot the issue.

    What, exactly, is the step that is not working?

    What are the network settings of the primary interface on your master node?

    What are your pod/Calico IPv4 pool network settings?

    Does your master node have only one interface?

    What are you using to run the labs, GCE, AWS, spare laptops, VirtualBox?

    Have you made sure there are no firewalls between your nodes?

    Let us start with these questions and then we can work together to try to troubleshoot the issue.

    Regards,

  • All I know is that it's not working as described in the document, and it does not matter whether I run it as root or not. I can post the same commands run as an unprivileged user. It still does not work. I tried it as root to see if using root would resolve some permissions issue. It does not. The problem persists. There is not enough explanation in any of the course materials to be materially helpful in troubleshooting when something is wrong, so I am stuck with a forum.

  • So I take it that you don't know why it's not working either.

  • Hi @recentcoin,

    The steps in the lab guide work on an environment set up in the same manner as the lab's environment. Once the config steps are modified to fit a particular hardware scenario (3 nodes instead of 2, a dedicated master node instead of a shared master, etc.), and the cluster is configured differently than the one used for the lab guide, different behavior is expected from some resources.

    When working with a complex open source technology such as Kubernetes, there is not always a simple fix to a reported problem. Troubleshooting tends to be more complex, and understanding the cluster's setup is one of the first things needed in order to help with the troubleshooting process. Different environments introduce different unknowns which could possibly impact the behavior of your Kubernetes objects. So before we know for sure what the issue is, there is a troubleshooting process that leads to that moment.

    Your patience and collaboration are appreciated while helping to troubleshoot the issue.

    Regards,
    -Chris
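
One way a dedicated master can behave differently from a shared one, offered only as a possibility that this thread never confirms: a DaemonSet pod is placed only on nodes whose taints it tolerates, and kubeadm control-plane nodes are commonly tainted NoSchedule. The Python sketch below is a simplified model of that matching rule, not the scheduler's actual code. The taint key `node-role.kubernetes.io/master` is an assumption about this cluster, and `tolerates` and `schedulable` are hypothetical helper names.

```python
from typing import Dict, List


def tolerates(toleration: Dict, taint: Dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches any effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key combined with Exists matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def schedulable(tolerations: List[Dict], taints: List[Dict]) -> bool:
    """A pod can be placed only if every NoSchedule/NoExecute taint is tolerated."""
    blocking = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


# Assumed taint on a dedicated master; check with `kubectl describe node`.
MASTER_TAINTS = [{"key": "node-role.kubernetes.io/master", "effect": "NoSchedule"}]
# The traefik-ds.yaml posted in this thread declares no tolerations.
TRAEFIK_TOLERATIONS: List[Dict] = []

print(schedulable(TRAEFIK_TOLERATIONS, MASTER_TAINTS))
```

If the master carries such a taint, a DaemonSet without a matching toleration would run only on the workers, and with hostNetwork enabled nothing would bind port 80 or 8080 on the master. Running `kubectl get pods -n kube-system -o wide` shows which nodes actually host the Traefik pods.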

  • @recentcoin you might have solved this already, but this might be helpful for other people running into 404s or other issues. I had the same issue, and when I looked at the traefik-ingress-controller logs (e.g. kc logs -n kube-system traefik-ingress-controller-ccrxj) I saw a bunch of these messages:

    E1007 06:04:08.508945       1 reflector.go:205] github.com/containous/traefik/vendor/k8s.io/client-go/informers/factory.go:86: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:traefik-ingress-controller" cannot list resource "endpoints" in API group "" at the cluster scope
    

    Turned out I had a typo in the ingress controller's cluster role and cluster role binding.
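
A log line like the one above names the exact permission that is missing, so it can be compared against the rules of the ClusterRole. The Python sketch below is illustrative: `parse_forbidden` and `role_allows` are hypothetical helpers, the rules are copied from the ingress.rbac.yaml posted earlier in this thread, and wildcard entries such as "*" are deliberately not handled.

```python
import re
from typing import Dict, List, Optional

FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


def parse_forbidden(line: str) -> Optional[Dict[str, str]]:
    """Extract user, verb, resource and API group from a 'forbidden' log line."""
    match = FORBIDDEN.search(line)
    return match.groupdict() if match else None


def role_allows(rules: List[Dict], verb: str, resource: str, group: str) -> bool:
    """Return True if any rule grants the verb on the resource in the API group."""
    # Simplification: exact string matches only, no "*" wildcard support.
    return any(
        group in rule.get("apiGroups", [])
        and resource in rule.get("resources", [])
        and verb in rule.get("verbs", [])
        for rule in rules
    )


LOG_LINE = (
    'Failed to list *v1.Endpoints: endpoints is forbidden: User '
    '"system:serviceaccount:kube-system:traefik-ingress-controller" cannot list '
    'resource "endpoints" in API group "" at the cluster scope'
)

# Rules as posted in ingress.rbac.yaml in this thread.
RULES = [
    {"apiGroups": [""], "resources": ["services", "endpoints", "secrets"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["extensions"], "resources": ["ingresses"],
     "verbs": ["get", "list", "watch"]},
]

denied = parse_forbidden(LOG_LINE)
print(denied["verb"], denied["resource"],
      role_allows(RULES, denied["verb"], denied["resource"], denied["group"]))
```

If the role as written does allow the denied action, the typo is more likely in a name: the ClusterRoleBinding's roleRef, or the subject's ServiceAccount name or namespace, must match the account named in the log line.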

  • chrispokorni
    Posts: 2,346

    Hi @beatrichartz,

    Such typos are common. Others may be found between the Pod labels and the Service selectors, Ingress service name or service port and the actual Service name or its port, Service target port and Pod container port, missing an expected annotation on the Ingress object...

    Regards,
    -Chris

  • Guty
    Posts: 1

    @beatrichartz You really saved my day! Thanx a lot for your comment. :D

  • cjmills
    Posts: 10

    @beatrichartz You are a legend ... I just spent at least an hour struggling with this and came across this simple yet effective solution, which pointed me in the right direction.
