
lab 3.5 step 23 gets error validating "cilium-cni.yaml"

I am working through lab 16.2, attempting to install 2 more control planes. Following the lab 3.5 instructions, I keep getting this error when running kubectl apply to install Cilium.

error: error validating "cilium-cni.yaml": error validating data: failed to download openapi: Get "https://k8scp:6443/openapi/v2?timeout=32s": dial tcp: lookup k8scp on 127.0.0.53:53: server misbehaving; if you choose to ignore these errors, turn validation off with --validate=false

I have verified I am not installing as root. I have also tried the helm install method. Same error either way on two different VMs now. What am I missing?
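
In case it is relevant: my understanding is that k8scp is just an alias the lab has you add to /etc/hosts, so I have been checking it like this on each node (the IP below is only a placeholder for the cp's private IP):

grep k8scp /etc/hosts
# expected to show something like:  10.128.0.2 k8scp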

Answers

  • chrispokorni Posts: 2,419

    Hi @don.perkins,

    Assuming you already have an operational cluster, with Cilium CNI plugin active, there is no need to re-install it anywhere else. The very first control plane node operates the CNI plugin for the entire cluster. The CNI plugin agents and the Pod network will expand onto the additional control plane nodes as they join the HA cluster.
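
    A quick way to confirm this, assuming the default labels used by the lab's Cilium manifest, is to list the agent Pods and check that one is Running per node:

    kubectl get pods -n kube-system -l k8s-app=cilium -o wide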

    Regards,
    -Chris

  • Great, thanks! So, working from the lab 3.5 instructions (installing the first control plane), for the SECOND and THIRD control planes can I stop at step 17, after installing kubeadm, kubectl, and kubelet?

  • chrispokorni Posts: 2,419

    Yes, @don.perkins.
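
    After step 17, the only remaining step on each additional control plane is the kubeadm join. Roughly, it looks like this, with the token, certificate hash, and certificate key taken from the output on the first control plane (the values below are placeholders):

    sudo kubeadm join k8scp:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash> \
        --control-plane --certificate-key <cert-key>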

  • ackuakud Posts: 3

    Hello @chrispokorni, I'm facing a similar issue, except mine is occurring on the cp node.

    error: error validating "/home/student/LFS258/SOLUTIONS/s_03/cilium-cni.yaml": error validating data: failed to download openapi: Get "http://localhost:8080/openapi/v2?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused; if you choose to ignore these errors, turn validation off with --validate=false

    I'm running the command as a regular user. My instance is a GCP VM set up following the lab guide.
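
    From what I have read, the connection refused on localhost:8080 usually means kubectl cannot find a kubeconfig, so I assume the usual post-init steps still apply (this is what I understand them to be):

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config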

  • chrispokorni Posts: 2,419
    edited January 16

    Hi @ackuakud,

    What entries do you have in your hosts files (cp and worker respectively)?
    What are the specs of your cluster? Which cloud or hypervisor provisions your VMs? What OS, how much CPU, RAM, disk size (fully allocated or dynamic)? How many network interfaces per VM?
    What are the private IP addresses of your VMs?

    Regards,
    -Chris

  • Zen42_fo Posts: 1
    edited January 16

    I'm having the same issue. The cluster seems to init OK, stays up for a while, and then randomly crashes and reboots.

    I've checked what the labs say and my config seems to be correct.
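
    The checks I have been running so far, in case they help narrow it down, are roughly:

    sudo systemctl status kubelet
    sudo journalctl -u kubelet --no-pager | tail -n 50
    free -h     # confirming swap stays disabled
    kubectl get nodes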

  • chrispokorni Posts: 2,419

    Hi @Zen42_fo,

    What are the specs of your cluster? Which cloud or hypervisor provisions your VMs? What OS, how much CPU, RAM, disk size (fully allocated or dynamic)? How many network interfaces per VM?
    What are the entries in your hosts files (cp and worker respectively)? What are the private IP addresses of your VMs?

    Regards,
    -Chris

  • ackuakud Posts: 3

    Hello @chrispokorni

    The attached image is a copy of my hosts file on the cp. I have not done anything on the worker node yet; I just set up the VM.

    Hypervisor - Google Cloud
    OS - Ubuntu 20.04.6
    CPU - e2-standard-2 (2 vCPU, 1 core)
    RAM - 8 GB
    The CP instance is attached to 1 network interface.

    Private IPs for the VMs: master - 10.3.0.2; worker - 10.3.0.3

    Thank you.
    -Dan

  • chrispokorni Posts: 2,419

    Hi @ackuakud,

    On GCP I would take a second look at the VPC/firewall configuration, to ensure all inbound traffic is allowed to the VM(s). That timeout seems to be a networking issue. Also, ensure the VMs have enough disk space, a minimum of 15-20GB, and install the recommended Ubuntu 24.04 LTS release.
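
    If you created the VPC yourself, a permissive ingress rule along these lines is usually enough for the lab; the rule name, network, and source range below are only examples, so adjust them to your own VPC:

    gcloud compute firewall-rules create lab-allow-all \
        --network=<your-vpc> --direction=INGRESS \
        --allow=tcp,udp,icmp --source-ranges=0.0.0.0/0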

    What is the output of:

    ls -la /home/student/LFS258/SOLUTIONS/s_03/
    

    Regards,
    -Chris

  • ackuakud Posts: 3

    Hello @chrispokorni, I just ran the ls command; here is the output.

    For storage I have 20 GB. I will double-check my firewall rules. Thank you!

  • chrispokorni Posts: 2,419

    Hi @ackuakud,

    With the GCE VM private IP addresses 10.3.0.x, I would recommend resetting your cluster (run kubeadm reset command as root on the control plane node) and re-initializing (run the full kubeadm init ... command as root) after making a minor edit to the cilium-cni.yaml manifest:

    • around line 222 update the value of cluster-pool-ipv4-cidr: "192.168.0.0/16"
    • this cidr value should match the podSubnet: 192.168.0.0/16 value in the kubeadm-config.yaml manifest

    This will ensure the networking is properly defined for the cluster, and IP ranges do not overlap between VMs, Services, and Pods.
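
    A quick way to confirm the two values match before re-initializing (assuming both files are in your working directory):

    grep cluster-pool-ipv4-cidr cilium-cni.yaml
    grep podSubnet kubeadm-config.yaml

    Both should show 192.168.0.0/16 before you run kubeadm reset and then the full kubeadm init again.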

    Regards,
    -Chris
