
DNS not working

minherz Posts: 3
edited March 27 in LFS260 Class Forum

Hi,
I have a problem with a two-node cluster setup. I followed the lab in Lesson 4 and have the cluster installed.
However, when I follow the instructions in the lab for Lesson 5 (Exercise 5.3, step 4), the curl command fails to resolve "kubernetes.default". The error is:

curl: (6) Could not resolve host: kubernetes.default
command terminated with exit code 6

The content of the resolv.conf is:

nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local us-central1-a.c.lf-kse-course-2021.internal c.lf-kse-course-2021.internal google.internal
options ndots:5

If I use the dnsutils image and run nslookup kubernetes.default, I get:

;; connection timed out; no servers could be reached

command terminated with exit code 1

However, direct access to the API server by IP (10.96.0.1) works fine. It looks like a misconfiguration of CoreDNS or of the routing.
My cluster is created on GCP in a subnet with the 10.0.2.0/24 CIDR. The firewall allows all TCP and UDP traffic between the two nodes, and I run everything on the master node.

Comments

  • minherz Posts: 3

    If I execute nslookup using one of the CoreDNS pods' IPs as the server, I get the same result, i.e. "connection timed out".
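    For anyone debugging the same symptom: the CoreDNS pods, their Service, and their logs can be inspected with standard kubectl commands. The `k8s-app=kube-dns` label is what a kubeadm-deployed CoreDNS uses; adjust it if your deployment labels differ.

```shell
# List the CoreDNS pods and their IPs (kubeadm labels them k8s-app=kube-dns).
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide

# The ClusterIP of the kube-dns Service should match the nameserver
# in the pods' /etc/resolv.conf (10.96.0.10 here).
kubectl -n kube-system get svc kube-dns

# CoreDNS logs often show whether queries are arriving at all.
kubectl -n kube-system logs -l k8s-app=kube-dns
```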

  • minherz Posts: 3

    I found the problem. Apparently, my cloud environment was missing a firewall rule. Once all TCP, UDP, ICMP, and IPIP traffic was enabled between the cluster's nodes, the problem was resolved. I used the Calico instructions, since the course instructions were less descriptive for me.
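    For reference, a rule like the one that was missing can be created with gcloud. The rule name and network below are placeholders, and the source range is an assumption based on the 10.0.2.0/24 subnet mentioned above; adjust them to match your VPC. This mirrors what the Calico GCE instructions do.

```shell
# Allow all TCP, UDP, ICMP, and IPIP traffic between the cluster nodes.
# Rule name, network, and source range are assumptions for this setup.
gcloud compute firewall-rules create allow-cluster-internal \
  --network default \
  --source-ranges 10.0.2.0/24 \
  --allow tcp,udp,icmp,ipip
```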

  • serewicz Posts: 943

    Great, glad you were able to troubleshoot the issue. And thank you for letting us know what happened.

    Regards,

  • fredlev Posts: 2
    edited June 20

    I had the same issue recently while working on the CKS (LFS260) course using a 2-node cluster setup on GCP.

    **Edit:** The issue was that IPIP traffic was not allowed internally by default between the nodes.

    I reverted the change described below, and DNS worked as expected with the default config:

    calicoctl get ippools -o yaml

    apiVersion: projectcalico.org/v3
    items:
    - apiVersion: projectcalico.org/v3
      kind: IPPool
      metadata:
        creationTimestamp: "2021-06-20T13:51:37Z"
        name: default-ipv4-ippool
        resourceVersion: "5391"
        uid: f17e591b-e08a-4056-b5f4-f20ab68da83b
      spec:
        blockSize: 26
        cidr: 192.168.0.0/16
        ipipMode: Always
        natOutgoing: true
        nodeSelector: all()
        vxlanMode: Never
    kind: IPPoolList
    metadata:
      resourceVersion: "7077"
    

    Edit: I kept the comments below for reference, but the root cause of the issue was that IPIP traffic was not allowed internally in the VPC I created.

    I ended up doing the following:

    Install calicoctl on the master node:

    wget https://github.com/projectcalico/calicoctl/releases/download/v3.14.0/calicoctl
    chmod +x calicoctl
    sudo mv calicoctl /usr/local/bin/
    
    export KUBECONFIG=~/.kube/config
    export DATASTORE_TYPE=kubernetes
    

    Install guide here

    Output the existing ippool config into a file:

    calicoctl get ippools -o yaml > ippool.yaml 
    

    In the spec section, set ipipMode: Never and vxlanMode: Always.
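    As a sketch, the same edit can be applied with sed instead of editing the file by hand. The heredoc below stands in for the ippool.yaml exported in the previous step, so the snippet is self-contained:

```shell
# Stand-in for the file produced by `calicoctl get ippools -o yaml > ippool.yaml`.
cat > ippool.yaml <<'EOF'
apiVersion: projectcalico.org/v3
items:
- apiVersion: projectcalico.org/v3
  kind: IPPool
  metadata:
    name: default-ipv4-ippool
  spec:
    blockSize: 26
    cidr: 192.168.0.0/16
    ipipMode: Always
    natOutgoing: true
    nodeSelector: all()
    vxlanMode: Never
kind: IPPoolList
EOF

# Flip the encapsulation fields in place: IPIP off, VXLAN on.
sed -i 's/ipipMode: Always/ipipMode: Never/; s/vxlanMode: Never/vxlanMode: Always/' ippool.yaml

# Show the result of the edit.
grep -E 'ipipMode|vxlanMode' ippool.yaml
```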

    Replace the existing config with the new one:

    calicoctl replace -f ippool.yaml 
    

    Output:

    Successfully replaced 1 'IPPool' resource(s)
    

    Now my pods have working DNS resolution ^^

    Maybe that would not have happened if I had followed https://docs.projectcalico.org/getting-started/kubernetes/self-managed-public-cloud/gce to start with ^^
