LFD259 - Updated to v1.22.1 (10.14.2021)

Hi,

LFD259 has been updated to Kubernetes v1.22.1. There are new content and labs to match the updated CKAD exam domains and competencies. To ensure you have access to the latest materials, please clear your cache.

Regards,
Flavia
The Linux Foundation Training Team

Comments

  • The sed line that edits /etc/containers/storage.conf needs to be restored in k8sSecond.sh. Without it, the "worker" node fails to start critical containers.

    diff --git a/SOLUTIONS/s_02/k8sSecond.sh b/SOLUTIONS/s_02/k8sSecond.sh
    index 44060c2..adc691f 100755
    --- a/SOLUTIONS/s_02/k8sSecond.sh
    +++ b/SOLUTIONS/s_02/k8sSecond.sh
    @@ -49,6 +49,9 @@ sudo apt-get update
     # Install cri-o
     sudo apt-get install -y cri-o cri-o-runc podman buildah
    
    +# A bug fix to get past a cri-o update
    +sudo sed -i 's/,metacopy=on//g' /etc/containers/storage.conf
    +
     sleep 3
    
     sudo systemctl daemon-reload
    

    kube-system pod/calico-node-djwk2 0/1 Init:0/3 0 4m23s
    kube-system pod/kube-proxy-dplzb 0/1 ContainerCreating 0 4m23s

    $ kubectl describe -n kube-system pod/calico-node-djwk2

    Events:
      Type     Reason                  Age                  From               Message
      ----     ------                  ----                 ----               -------
      Normal   Scheduled               5m59s                default-scheduler  Successfully assigned kube-system/calico-node-djwk2 to worker
      Warning  FailedCreatePodSandBox  5m56s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to mount container k8s_POD_calico-node-djwk2_kube-system_7e9aa728-8fe9-4a29-bedf-40988dff8741_0 in pod sandbox k8s_calico-node-djwk2_kube-system_7e9aa728-8fe9-4a29-bedf-40988dff8741_0(9ba0a861831c3a59c3382e17221fc83a63c2240fe3d6e12f1e9ca9355915637e): error creating overlay mount to /var/lib/containers/storage/overlay/0ddf29c0abf8761cf2aa6c28c1f4022e3064636ecb9ea95c7b04bbe8f0bcd2e0/merged, mount_data="nodev,metacopy=on,lowerdir=/var/lib/containers/storage/overlay/l/LAOCFIHBLQ5ZHEGKPTYC6N36OP,upperdir=/var/lib/containers/storage/overlay/0ddf29c0abf8761cf2aa6c28c1f4022e3064636ecb9ea95c7b04bbe8f0bcd2e0/diff,workdir=/var/lib/containers/storage/overlay/0ddf29c0abf8761cf2aa6c28c1f4022e3064636ecb9ea95c7b04bbe8f0bcd2e0/work": invalid argument
    

    $ kubectl describe -n kube-system pod/kube-proxy-dplzb

    Events:
      Type     Reason                  Age                    From               Message
      ----     ------                  ----                   ----               -------
      Normal   Scheduled               7m33s                  default-scheduler  Successfully assigned kube-system/kube-proxy-dplzb to worker
      Warning  FailedCreatePodSandBox  7m30s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to mount container k8s_POD_kube-proxy-dplzb_kube-system_7fc66509-09a3-46e2-b118-39a7fe9c3139_0 in pod sandbox k8s_kube-proxy-dplzb_kube-system_7fc66509-09a3-46e2-b118-39a7fe9c3139_0(9975181c1522f44afbb0b34af0f6bbadfc90e3cefff42cda4cc434fbb93c235b): error creating overlay mount to /var/lib/containers/storage/overlay/85b1b5186e0d6506db20d5a31b9ddbdea0627ff19e0c6a411d30f30a2d82b3de/merged, mount_data="nodev,metacopy=on,lowerdir=/var/lib/containers/storage/overlay/l/LAOCFIHBLQ5ZHEGKPTYC6N36OP,upperdir=/var/lib/containers/storage/overlay/85b1b5186e0d6506db20d5a31b9ddbdea0627ff19e0c6a411d30f30a2d82b3de/diff,workdir=/var/lib/containers/storage/overlay/85b1b5186e0d6506db20d5a31b9ddbdea0627ff19e0c6a411d30f30a2d82b3de/work": invalid argument
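
For a worker that was already provisioned without the sed line, the same edit can be applied by hand. The sketch below wraps the substitution from the diff in a hypothetical helper, `strip_metacopy`, that only rewrites the file when the option is present and keeps a `.bak` copy. The file name `strip_metacopy.sh` and the cri-o restart in the usage comment are assumptions, not something the diff shows.

```shell
#!/bin/sh
# strip_metacopy FILE: remove ",metacopy=on" from the mount options in FILE.
# Hypothetical helper; the sed expression is the one from the diff above.
strip_metacopy() {
    conf=$1
    if grep -q ',metacopy=on' "$conf"; then
        # -i.bak edits in place and keeps the original as FILE.bak
        sed -i.bak 's/,metacopy=on//g' "$conf"
        echo "removed metacopy=on from $conf"
    else
        echo "no metacopy=on in $conf, nothing to do"
    fi
}

# Assumed usage on the worker node, as root, with this file saved as
# strip_metacopy.sh (hypothetical name):
#   . ./strip_metacopy.sh
#   strip_metacopy /etc/containers/storage.conf
#   systemctl restart crio
```

Running it a second time is harmless, since the function reports that there is nothing to do once the option is gone.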
    
  • serewicz

    Hello,

    Could you tell me about your environment? I am not seeing this error, but I would like to understand where it occurs. Are you using VirtualBox or a cloud provider? Which version of the OS are you using?

    Regards,

  • TL;DR: these are Ubuntu 18.04 VMs, running on a KVM hypervisor

    Sure. These are Ubuntu 18.04 VMs, based on the upstream-provided base cloud image. They are running on an Ubuntu 20.04 hypervisor.

    https://cloud-images.ubuntu.com/releases/bionic/release/ubuntu-18.04-server-cloudimg-amd64.img

    Here is the virt-install script (for the 'worker' VM; the 'cp' script is the same except for the --name argument):

    export OS_QCOW2=/var/lib/libvirt/images/worker.qcow2
    scp /home/ubuntu/_images/ubuntu-18.04-server-cloudimg-amd64.img root@colo4:$OS_QCOW2
    ssh root@colo4 qemu-img resize $OS_QCOW2 20G
    
    virt-install \
      --connect qemu+ssh://root@colo4/system \
      --import \
      --virt-type kvm \
      --name worker \
      --ram 8192 \
      --cpu host-model \
      --vcpus 2 \
      --network bridge=br0,model=virtio \
      --disk path=$OS_QCOW2,format=qcow2 \
      --os-type=linux \
      --os-variant ubuntu18.04 \
      --autostart \
      --autoconsole text \
      --cloud-init user-data=./user-data,meta-data=./meta-data
    

    After virt-install, I run the following scripts from the 10-13 tarball:

    • LFD259/SOLUTIONS/s_02/k8scp.sh
    • LFD259/SOLUTIONS/s_02/k8sSecond.sh

    The problem arises after the 'kubeadm join ...' command. I am reproducing this right now. The absence of the 'sed' line results in:

    $ kubectl get pod -n kube-system
    NAME                                       READY   STATUS              RESTARTS   AGE
    calico-kube-controllers-75f8f6cc59-wj5rf   1/1     Running             0          12m
    calico-node-4knm9                          1/1     Running             0          12m
    calico-node-5w7cd                          0/1     Init:0/3            0          9m27s
    coredns-78fcd69978-26rtv                   1/1     Running             0          12m
    coredns-78fcd69978-84hhb                   1/1     Running             0          12m
    etcd-cp                                    1/1     Running             0          12m
    kube-apiserver-cp                          1/1     Running             0          12m
    kube-controller-manager-cp                 1/1     Running             0          12m
    kube-proxy-62qw2                           0/1     ContainerCreating   0          9m27s
    kube-proxy-zbq87                           1/1     Running             0          12m
    kube-scheduler-cp                          1/1     Running             0          12m
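
To pick the stuck pods out of a listing like the one above, a small filter can help. This is a minimal sketch, assuming the default table layout of `kubectl get pod` (NAME, READY, STATUS, ...); `not_ready_pods` is a hypothetical helper name.

```shell
#!/bin/sh
# not_ready_pods: read `kubectl get pod` table output on stdin and print the
# name and status of every pod whose READY column is not n/n.
not_ready_pods() {
    awk 'NR > 1 {
        split($2, ready, "/")
        if (ready[1] != ready[2]) print $1, $3
    }'
}

# Assumed usage against a live cluster:
#   kubectl get pod -n kube-system | not_ready_pods
```

Note that pods which have run to completion also show 0/1 in the READY column, so the output is a list of candidates to inspect rather than a list of failures.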
    
  • Starting from scratch again, and restoring the 3 lines as shown in the diff in my original response, the problem goes away; the two "stuck" pods are created after a minute or so.

    k get pod -n kube-system -w

    calico-node-d99gh 1/1 Running 0 66s

    Let me know if more is needed from me on this, I'm happy to help.

  • @kingphil said:
    The sed line that edits /etc/containers/storage.conf needs to be restored in k8sSecond.sh. Without it, the "worker" node fails to start critical containers.

    I agree with this. I am facing the same issue.
    I am using VirtualBox on a Windows 11 host with an Ubuntu 18.04 guest.

  • How come the course still uses the "master" term?

  • coop

    The Linux Foundation is fully on board with the "inclusive naming initiative" (https://inclusivenaming.org/) and has systematically implemented its guidelines everywhere it can in its courses. Unfortunately, quite a few upstream projects still use "master", and it is impossible to eliminate the term completely until they stop, at which point the courses will automatically stop using it as well. It takes time, and even though CNCF is an initiator of the inclusive naming project, quite a few CNCF projects still use "master".

  • I faced this issue in Lab 2.3. I am running the nodes on GCP, Ubuntu 18.04 LTS.
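
To confirm that a stuck pod is hitting the same error as in the original report, its events can be checked for the three telltale strings. A minimal sketch follows; `has_metacopy_error` is a hypothetical helper, and the patterns are taken from the event messages quoted earlier in this thread.

```shell
#!/bin/sh
# has_metacopy_error: read `kubectl describe pod` output on stdin and succeed
# if a FailedCreatePodSandBox event shows an overlay mount with metacopy=on
# being rejected with "invalid argument".
has_metacopy_error() {
    awk '/FailedCreatePodSandBox/ && /metacopy=on/ && /invalid argument/ { found = 1 }
         END { exit !found }'
}

# Assumed usage (substitute the name of a pod that is stuck on your cluster):
#   kubectl describe -n kube-system pod/kube-proxy-dplzb | has_metacopy_error \
#       && echo "same metacopy mount failure"
```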

  • @kingphil said:
    The sed line that edits /etc/containers/storage.conf needs to be restored in k8sSecond.sh. Without it, the "worker" node fails to start critical containers.

    Confirming this post! Please fix this in k8sSecond.sh; I wasted more than an hour figuring out why the steps were not working on VMware Fusion.

  • Hi @vardanm,

    The step works in the cloud environments we tested (AWS and GCP), while it seems to misbehave on local installations.

    Regards,
    -Chris
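
One possible explanation for the cloud versus local difference, offered as an assumption rather than something confirmed in this thread: the overlayfs metacopy option first appeared in Linux 4.19, the stock Ubuntu 18.04 kernel is 4.15, and cloud images often ship newer provider-specific kernels. The sketch below compares a kernel release string against 4.19; `kernel_supports_metacopy` is a hypothetical helper and relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# kernel_supports_metacopy RELEASE: succeed if RELEASE (for example the output
# of `uname -r`) is at least 4.19. Heuristic only: distributions can backport
# or disable the feature, so treat the answer as a hint.
kernel_supports_metacopy() {
    version=${1%%-*}    # "4.15.0-159-generic" -> "4.15.0"
    lowest=$(printf '%s\n%s\n' "4.19" "$version" | sort -V | head -n 1)
    [ "$lowest" = "4.19" ]
}

# Assumed usage on a node:
#   kernel_supports_metacopy "$(uname -r)" || echo "metacopy is likely unsupported here"
```

If the check fails on a node, that would be consistent with the "invalid argument" mount error shown in the original report.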

  • t.white.ucf
    edited September 2023

    I believe I am still experiencing this same issue. @serewicz

  • Hi @t.white.ucf,

    Are you following the latest release of the lab guide? The "metacopy" reference is no longer part of any of the installation scripts. The course content release referenced by this discussion is close to two years old.

    Regards,
    -Chris
