lab 4.6 error

When initially running Lab 4.6, I knew it was expected to fail, but I very much doubt this is the expected failure. What is going on here?

Warning FailedCreatePodSandBox 11s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_init-tester_default_6404e3fd-e1fa-45cb-bfd8-0911a24422a5_0(c10d5d863c5297f161be95320c037cf891648221f05d9beec88fa315b58549ca): error adding pod default_init-tester to CNI network "k8s-pod-network": error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

I know what those words mean but how can that even happen in this environment? Is something out of date?
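For what it's worth, one way to see how this could happen is to compare the CA certificate the Calico CNI plugin trusts against the cluster's current CA. The paths below assume a default kubeadm install with Calico (the calico-kubeconfig filename may differ); mismatched fingerprints would explain the error:

sudo openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -fingerprint -sha256
sudo grep certificate-authority-data /etc/cni/net.d/calico-kubeconfig \
    | awk '{print $2}' | base64 -d \
    | openssl x509 -noout -fingerprint -sha256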

Comments

  • serewicz (Posts: 1,000)

    Hello,

    Indeed this is an unexpected error. Well, they all are unexpected, but this one more than most. To help troubleshoot:

    • What are you using to run your cluster?
    • What version of OS and Kubernetes are you using?
    • Have you turned anything on or off since initializing the cluster?
    • Do you get this error with any other commands, like kubectl create deploy?
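    If it helps, something like the following should surface most of those answers in one paste (standard kubectl commands; the deployment name "test" is just a placeholder):

    kubectl version --short                    # client and server versions
    kubectl get nodes -o wide                  # OS, kernel, container runtime
    kubectl create deploy test --image=nginx   # sanity-check deployment
    kubectl get events --sort-by=.metadata.creationTimestamp   # recent errors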

    Regards,

  • I am using Ubuntu 20.04.4 LTS on VMware Workstation, with Kubernetes version 1.23.1.

    Regarding your third question, I believe not. Prior to running k8scp.sh, I saved a snapshot. I recently reverted to this snapshot (and updated the files) because my old build (from late December) would hit a 404 error from quay.io every time it tried to pull nginx (and I'm not even sure whether I still have that problem!).

    So, as far as the VM is concerned, I ran the initial setup (AppArmor off, etc.), ran Labs 2.1 and 2.2, then attempted 4.6.

    As a sanity check, I ran kubectl create deploy --image=nginx test. We have the following:
    eric@ubuntu:~$ kubectl describe deployment
    Name:                   test
    Namespace:              default
    CreationTimestamp:      Thu, 31 Mar 2022 16:03:11 -0700
    Labels:                 app=test
    Annotations:            deployment.kubernetes.io/revision: 1
    Selector:               app=test
    Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
    StrategyType:           RollingUpdate
    MinReadySeconds:        0
    RollingUpdateStrategy:  25% max unavailable, 25% max surge
    Pod Template:
      Labels:  app=test
      Containers:
       nginx:
        Image:        nginx
        Port:
        Host Port:
        Environment:
        Mounts:
      Volumes:
    Conditions:
      Type           Status  Reason
      ----           ------  ------
      Available      False   MinimumReplicasUnavailable
      Progressing    True    ReplicaSetUpdated
    OldReplicaSets:
    NewReplicaSet:   test-8499f4f74 (1/1 replicas created)
    Events:
      Type    Reason             Age   From                   Message
      ----    ------             ----  ----                   -------
      Normal  ScalingReplicaSet  2m6s  deployment-controller  Scaled up replica set test-8499f4f74 to 1

    kubectl describe pod
    Name: test-8499f4f74-njzvz

    Warning FailedCreatePodSandBox 7m58s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_test-8499f4f74-njzvz_default_0b9c3d93-387c-4b78-9bce-b419fc4a45a7_0(73d1561f2f5946edf5d5266e4e8519b2352191c382faca2c9f60ee02bf386f53): error adding pod default_test-8499f4f74-njzvz to CNI network "k8s-pod-network": error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

  • serewicz (Posts: 1,000)

    Ah, a snapshot. That's probably the issue.

    If your IP address changes when the snapshot is restored, such as when using DHCP, then the x509 certificate will no longer work, as it was tied to the original IP. When running kubeadm init, you can pass a config file and reference a hostname instead; then the cert will keep working even if the IP changes. For example, if you have an /etc/hosts entry for k8scp, you could use a kubeadm-config.yaml file like this:

    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    kubernetesVersion: 1.23.1
    controlPlaneEndpoint: "k8scp:6443"
    networking:
      podSubnet: 192.168.0.0/16
    

    More on that file here: https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta3/
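    As a rough sketch of the whole sequence (the IP below is a placeholder; use your node's current address, and note the course setup script may handle this for you):

    # Map a stable name to the control plane's current IP (placeholder address):
    echo "10.2.0.10 k8scp" | sudo tee -a /etc/hosts

    # Initialize with the config file so the certs reference the name, not the IP:
    sudo kubeadm init --config=kubeadm-config.yaml --upload-certs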

    Regards,

  • Mr. Serewicz,

    Would that be the case if the snapshot reverts back to only the end of Lab 2.1 (after the course package has arrived via wget, but prior to running k8scp.sh)? I'm looking into your suggestion either way, but thought I'd run this by you while I did.

  • Update: I tried the whole thing again. I skipped adding a worker node for now, as I don't think 4.6 will require it. This is the result of a simple nginx deployment:

    Type     Reason            Age                   From               Message
    ----     ------            ----                  ----               -------
    Warning  FailedScheduling  2m8s (x2 over 3m25s)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
    Normal   Scheduled         87s                   default-scheduler  Successfully assigned default/test-8499f4f74-rhnl5 to cp
    Normal   Pulling           86s                   kubelet            Pulling image "nginx"
    Normal   Pulled            76s                   kubelet            Successfully pulled image "nginx" in 9.751208344s
    Normal   Created           76s                   kubelet            Created container nginx
    Normal   Started           76s                   kubelet            Started container nginx
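    For anyone else who hits the FailedScheduling event above: that is the control-plane taint, which is expected on a single-node cluster until it is removed. A minimal sketch for v1.23, where the taint key is still "master" (newer releases use "control-plane"):

    kubectl taint nodes --all node-role.kubernetes.io/master-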
