
Lab 7.2 <error: endpoints "default-http-backend" not found>

chrischia
chrischia Posts: 12
edited January 2021 in LFD259 Class Forum

Hi,
I was working through exercise 7.2 and got stuck at Step 8:
curl -H "Host: www.example.com" http://10.128.0.7/

I checked the ingress by running:
kubectl describe ing ingress-test -n default

Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Name:             ingress-test
Namespace:        default
Address:          
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host             Path  Backends
  ----             ----  --------
  www.example.com  
                   /   secondapp:80 (192.168.235.153:80)
Annotations:       kubernetes.io/ingress.class: traefik
Events:            <none>

I also checked the services in all namespaces:
kubectl get svc --all-namespaces

NAMESPACE     NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes                ClusterIP      10.96.0.1        <none>        443/TCP                  19d
default       nginx                     ClusterIP      10.108.189.140   <none>        443/TCP                  17d
default       registry                  ClusterIP      10.96.74.244     <none>        5000/TCP                 17d
default       secondapp                 LoadBalancer   10.100.192.227   <pending>     80:32000/TCP             96m
kube-system   kube-dns                  ClusterIP      10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP   19d
kube-system   traefik-ingress-service   ClusterIP      10.110.163.198   <none>        80/TCP,8080/TCP          35m
multitenant   shopping                  NodePort       10.110.156.71    <none>        80:30381/TCP             4d4h

How do I create the default-http-backend service? Is it just a simple service definition on port 80?
FYI, I am using GCP.

Comments

  • chrischia
    chrischia Posts: 12
    edited January 2021

    OK, I solved the problem by installing a fresh Traefik and configuring the ingress controller again, following this guide:
    https://doc.traefik.io/traefik/v1.7/user-guide/kubernetes/

    The endpoints "default-http-backend" not found error is still there, but the command
    curl -H "Host: www.example.com" http://10.2.0.6/

    now returns 404 page not found.

  • serewicz
    serewicz Posts: 1,000

    Hello,

    Glad you were able to get it working. The 404 error is an indication the ingress controller is working but is not able to associate traffic with an existing service. The most common issue I find is a typo in the service or the pod labels. Should you revisit the lab, I would work outward: first use the pod IP to view the default web page, then the service IP, and finally the ingress controller.

    Regards,
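
The layered check described above (pod IP, then service IP, then ingress controller) can be sketched as a small shell helper. This is only a sketch under assumptions: probe and check_layers are hypothetical names that are not part of the lab, and plain HTTP on port 80 is assumed at every layer.

```shell
# Print "<label>: <HTTP status>" for one URL.
# curl reports 000 when it cannot connect or times out.
probe() {
  label=$1
  url=$2
  host=${3:-}
  if [ -n "$host" ]; then
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 -H "Host: $host" "$url") || true
  else
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || true
  fi
  echo "$label: ${code:-000}"
}

# Work outward: pod first, then service, then ingress (with the Host header).
check_layers() {
  pod_ip=$1
  svc_ip=$2
  ingress_ip=$3
  ingress_host=$4
  probe "pod" "http://$pod_ip/"
  probe "service" "http://$svc_ip/"
  probe "ingress" "http://$ingress_ip/" "$ingress_host"
}
```

With the addresses posted in this thread, the call would be check_layers 192.168.56.9 10.104.151.51 10.2.0.8 www.example.com. The first layer that does not answer 200 is the one to look at; 000 means no connection was made at all.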

  • NAME            READY   STATUS    RESTARTS   AGE     IP             NODE         NOMINATED NODE   READINESS GATES
    pod/secondapp   2/2     Running   3          3h48m   192.168.56.9   instance-2   <none>           <none>
    
    NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE   SELECTOR
    service/kubernetes   ClusterIP      10.96.0.1       <none>        443/TCP        47h   <none>
    service/secondapp    LoadBalancer   10.104.151.51   <pending>     80:32000/TCP   94m   example=second
    

    Yes, the 404 does come from the ingress controller, and I am happy that it works on GCE.
    The strange part is that since the ingress was activated, the individual pod and service IPs are no longer reachable.

    For example, both curl 10.104.151.51 and curl 192.168.56.9
    fail to return the nginx default HTML response.

    This might be why the ingress returns 404. However, I do not know why this happens after activating the Traefik ingress. Any advice?

  • Also, the terminal hangs when I run curl 192.168.56.9 and curl 10.104.151.51.

    I took a closer look to find out why, and the container gives a strange response:

    kubectl exec -it secondapp -c busy -- sh
    / $ nslookup secondapp
    ;; connection timed out; no servers could be reached
    
  • Any clue? I have been stuck for 3 days already and can't proceed with the lab exercise.

  • chrispokorni
    chrispokorni Posts: 2,346

    Hi @chrischia,

    Can you try running the same two curl commands from the instance-2 node?

    Regards,
    -Chris

  • chrischia
    chrischia Posts: 12
    edited January 2021

    @chrispokorni said:
    Hi @chrischia,

    Can you try running the same two curl commands from the instance-2 node?

    Regards,
    -Chris

    Hi Chris, they both work on instance-2 (the worker node). Both return the HTML content.

    Also, from my local machine:
    curl -H "Host: www.example.com" http://[instance2-ip] returns the HTML content too,
    but
    curl -H "Host: www.example.com" http://[instance1-ip] fails as usual.

  • chrispokorni
    chrispokorni Posts: 2,346

    Hi @chrischia,

    Thank you for checking. This behavior indicates that the issue is not related to the Kubernetes cluster or any of its components (Pods, Services, and Ingress), but to the way the networking was configured on the underlying infrastructure.

    Where did you provision your nodes? In the cloud? Local VMs (which hypervisor)?
    For your nodes, did you allow traffic to all ports, all protocols, from all sources - either through a custom VPC and firewall rule or through local hypervisor config options?

    Regards,
    -Chris

  • chrischia
    chrischia Posts: 12
    edited January 2021

    Thanks @chrispokorni for your prompt reply.
    I am using GCE and have been following the tutorial and guide suggested in LFD259.
    I was able to curl both the secondapp pod and service from instance-1,
    but the problem arose after I set up Traefik 1.7.13 on instance-1.

    I followed the traefik setup using the link below:
    https://doc.traefik.io/traefik/v1.7/

    I am wondering if I should use AWS instead...
    Also, for the sake of completing the course, can I continue Lab 7.2 step 8 using instance-2 (for the steps that use curl commands)?

  • serewicz
    serewicz Posts: 1,000

    Hello,

    Could you look at the output of kubectl get pods --all-namespaces and ensure that all pods on both systems are running. You should see an ingress controller pod running on both nodes.

    Also please check that both nodes are running properly and have enough resources. If you're using 2cpu/8G nodes you should be fine with what the exercises ask you to run.

    If the ingress controller is working on one node but not on another, it indicates the issue is not with the ingress controller, but with some other configuration. If the issue were with Kubernetes, an improperly set rule, or the ingress controller, it would not work anywhere.

    Regards,
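
One way to spot pods that report Running but are not actually ready is to filter the READY column of the kubectl get pods --all-namespaces output. The helper below is a sketch, not part of the lab: not_ready_pods is a hypothetical name, and it assumes the default column order (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print
# "<namespace>/<name> <ready> <status>" for every pod whose READY column
# shows fewer ready containers than expected (for example 0/1).
not_ready_pods() {
  awk 'NR > 1 {
         split($3, ready, "/")
         if (ready[1] != ready[2]) print $1 "/" $2 " " $3 " " $4
       }'
}
```

It would be used as kubectl get pods --all-namespaces | not_ready_pods. An empty result means every listed pod has all of its containers ready.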

  • Hi,
    yes, I now see that there are issues with the calico-node pods.

    NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
    default       secondapp                                  2/2     Running   21         21h
    kube-system   calico-kube-controllers-744cfdf676-mm5wq   1/1     Running   0          2d17h
    kube-system   calico-node-9b6zs                          0/1     Running   0          2d17h
    kube-system   calico-node-lpfg2                          0/1     Running   0          2d17h
    kube-system   coredns-f9fd979d6-2frqp                    1/1     Running   0          2d17h
    kube-system   coredns-f9fd979d6-vm5rt                    1/1     Running   0          2d17h
    kube-system   etcd-instance-1                            1/1     Running   0          2d17h
    kube-system   kube-apiserver-instance-1                  1/1     Running   0          2d17h
    kube-system   kube-controller-manager-instance-1         1/1     Running   0          2d17h
    kube-system   kube-proxy-6cqjp                           1/1     Running   0          2d17h
    kube-system   kube-proxy-7kv7v                           1/1     Running   0          2d17h
    kube-system   kube-scheduler-instance-1                  1/1     Running   0          2d17h
    kube-system   traefik-ingress-controller-w6sdr           1/1     Running   0          18h
    multitenant   mainapp-64f7bb4cc6-lghj7                   1/1     Running   0          18h
    

    When I run describe on an individual calico-node pod, I see one peculiar event:

    Events:
      Type     Reason     Age                   From                 Message
      ----     ------     ----                  ----                 -------
      Warning  Unhealthy  31s (x6713 over 18h)  kubelet, instance-1  (combined from similar events): Readiness probe failed: 2021-01-22 02:28:18.334 [INFO][31504] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.9
    
  • OK, I have resolved the calico-node issue with the help of the following:
    https://stackoverflow.com/questions/54465963/calico-node-is-not-ready-bird-is-not-ready-bgp-not-established

    Now direct curl commands to the pod and service IPs work!

  • Almost all issues are fixed, except for one weird behaviour which I have noticed.

    From [instance-1] (master node), ip a yields the following output for ens4 (the only ens interface):

    2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
        link/ether 42:01:0a:02:00:08 brd ff:ff:ff:ff:ff:ff
        inet 10.2.0.8/32 scope global dynamic ens4
           valid_lft 1930sec preferred_lft 1930sec
        inet6 fe80::4001:aff:fe02:8/64 scope link 
           valid_lft forever preferred_lft forever
    

    The ens4 output clearly shows that I need to use 10.2.0.8 for step 8 of Lab 7.2, but
    curl -H "Host: www.example.com" http://10.2.0.8/
    fails with 404.

    In the calico-node issue I was fighting this week, the BGP connection failed to 10.2.0.9.
    So I tried
    curl -H "Host: www.example.com" http://10.2.0.9/
    and this command works perfectly, returning the HTML response.
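
Rather than reading the address off the ip a output by eye on each node, it can be extracted with a short filter. This is a sketch only: iface_ipv4 is a hypothetical helper name, and it assumes the usual ip a layout in which each interface starts with "<index>: <name>:" and its addresses follow on indented inet lines.

```shell
# Read `ip a` output on stdin and print the IPv4 address(es), without the
# prefix length, of the interface named in $1.
iface_ipv4() {
  awk -v ifc="$1" '
    /^[0-9]+: / { cur = $2; sub(/:$/, "", cur); sub(/@.*/, "", cur) }
    cur == ifc && $1 == "inet" { sub(/\/.*/, "", $2); print $2 }'
}
```

It would be used as ip a | iface_ipv4 ens4 on each node, and the result can then be passed to the curl command from step 8.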

  • serewicz
    serewicz Posts: 1,000

    Hello,

    Correct. When the network configuration is not proper, as happens if IP ranges overlap, the traffic may be sent across the tunnel interface to the other node, which has the 10.2.0.9 IP address. It would be interesting to see whether a curl from the worker node works against .8 instead.

    Regards,
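
Whether two IP ranges overlap can be checked with a little integer arithmetic. The sketch below uses hypothetical helper names (ip_to_int, cidr_range, cidrs_overlap). The ranges in the usage note are assumptions for illustration: the thread only shows node addresses 10.2.0.8 and 10.2.0.9 and pod addresses in 192.168.x.x, not the actual subnet or pod CIDR.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" (as integers) for a CIDR such as 10.2.0.0/24.
cidr_range() {
  base=$(ip_to_int "${1%/*}")
  bits=${1#*/}
  size=$(( 1 << (32 - $bits) ))
  first=$(( $base / $size * $size ))   # clear any host bits
  echo "$first $(( $first + $size - 1 ))"
}

# Exit status 0 when the two CIDRs share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}
```

For example, cidrs_overlap 10.2.0.0/24 192.168.0.0/16 && echo overlap prints nothing, because those two assumed ranges are disjoint, while cidrs_overlap 192.168.0.0/16 192.168.56.0/24 succeeds because the second range lies inside the first.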

  • From [instance-2] (worker node), ip a yields the following output for ens4 (the only ens interface):

    2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
        link/ether 42:01:0a:02:00:09 brd ff:ff:ff:ff:ff:ff
        inet 10.2.0.9/32 scope global dynamic ens4
           valid_lft 2560sec preferred_lft 2560sec
        inet6 fe80::4001:aff:fe02:9/64 scope link 
           valid_lft forever preferred_lft forever
    

    The ens4 address is different on the worker: it is 10.2.0.9 instead.
    Hence, from the worker,
    curl -H "Host: www.example.com" http://10.2.0.9/ works fine, while
    curl -H "Host: www.example.com" http://10.2.0.8/ doesn't work here, as expected.

    Is there any way to change ens4 on my master to point to 10.2.0.9, the same as the worker?

  • It is probably obvious, but I would like to mention that
    user@laptop:~$ curl -H "Host: www.example.com" http://[worker ip] works perfectly fine, since the curl in the example above works too (to .9, the worker's ens4 IP).

  • chrispokorni
    chrispokorni Posts: 2,346

    Hi @chrischia,

    This would imply that both nodes share the same Private IP address, introducing even more conflicts into the cluster :/

    Regards,
    -Chris
