Can you help me debug the master initialization?

alaudazzi · January 2021

I get the error "timed out waiting for the condition" when I try to initialize the master:

root@master:~# kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
W0103 15:13:11.353300    8986 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.1
[preflight] Running pre-flight checks
    [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8smaster] and IPs [10.96.0.1 10.2.0.2]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [10.2.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [10.2.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0103 15:13:33.458024    8986 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0103 15:13:33.459355    8986 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.

    Unfortunately, an error has occurred:
        timed out waiting for the condition

    This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.

    Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
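
The certificate lines in the log above list `k8smaster` as a DNS name and `10.2.0.2` as an IP, and a typo in /etc/hosts is mentioned later in this thread. Assuming /etc/hosts is meant to map that alias to that address, a quick check along these lines may help rule the hosts file in or out. `hosts_maps_alias` is a hypothetical helper written for this sketch, not part of kubeadm, and the log does not confirm that the hosts entry caused the kubelet failure.

```shell
# Sketch: succeed only if HOSTS_FILE has a non-comment line that starts with IP
# and lists ALIAS among its names.
# Usage: hosts_maps_alias HOSTS_FILE IP ALIAS
hosts_maps_alias() {
  awk -v ip="$2" -v name="$3" '
    /^[[:space:]]*#/ { next }            # ignore comment lines
    $1 == ip {
      for (i = 2; i <= NF; i++) if ($i == name) found = 1
    }
    END { exit found ? 0 : 1 }
  ' "$1"
}

# Example on the node (IP and alias are taken from the log, the mapping is assumed):
#   hosts_maps_alias /etc/hosts 10.2.0.2 k8smaster \
#     && echo "alias ok" || echo "alias missing or mistyped"
```

If the check fails, comparing the entry against the address shown by `ip addr show` is a reasonable next step.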

alaudazzi · January 2021

I've been following the steps in the lab very carefully, without deviating in any way. It's not clear to me what went wrong. I would really appreciate some help to move forward. I am currently blocked and cannot continue with the course.

alaudazzi · January 2021

GCE

alaudazzi · January 2021

I've just fixed a typo in /etc/hosts and rerun kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out.
Now I get this:

root@master:~# kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
W0103 20:06:49.751110   19618 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.1
[preflight] Running pre-flight checks
    [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR Port-6443]: Port 6443 is in use
    [ERROR Port-10259]: Port 10259 is in use
    [ERROR Port-10257]: Port 10257 is in use
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
    [ERROR Port-10250]: Port 10250 is in use
    [ERROR Port-2379]: Port 2379 is in use
    [ERROR Port-2380]: Port 2380 is in use
    [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
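
These preflight errors are what a node typically reports when an earlier `kubeadm init` got partway through: the static Pod manifests and the etcd data directory are still on disk, and the components started from them are still holding the listed ports. A minimal sketch for listing the leftover files follows. `list_kubeadm_leftovers` and its optional root-prefix argument are hypothetical names introduced here so the check can also be tried against a scratch directory.

```shell
# Sketch: print the files from a previous kubeadm init that the preflight
# checks above complain about. ROOT is an optional path prefix (empty on a real node).
# Usage: list_kubeadm_leftovers [ROOT]
list_kubeadm_leftovers() {
  root="${1:-}"
  for name in kube-apiserver kube-controller-manager kube-scheduler etcd; do
    manifest="$root/etc/kubernetes/manifests/$name.yaml"
    if [ -e "$manifest" ]; then
      echo "manifest present: $manifest"
    fi
  done
  # The DirAvailable check requires /var/lib/etcd to be empty.
  if [ -d "$root/var/lib/etcd" ] && [ -n "$(ls -A "$root/var/lib/etcd")" ]; then
    echo "etcd data present: $root/var/lib/etcd"
  fi
  return 0
}

# Example on the node:
#   list_kubeadm_leftovers
```

Any output here suggests the node still carries state from the first attempt, so the new run is colliding with the old one rather than hitting a fresh problem.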

alaudazzi · January 2021

Docker is correctly installed:

root@master:~# docker -v
Docker version 19.03.6, build 369ce74a3c

alaudazzi · January 2021

The VPC I created should allow all traffic to all ports.

alaudazzi · January 2021

n1-standard-2 (2 vCPUs, 7.5 GB memory) with Ubuntu 18.04 LTS

alaudazzi · January 2021

This is the output of sudo systemctl status docker:

root@master:~# sudo systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
   Active: active (running) since Sun 2021-01-03 15:13:12 UTC; 6h ago
     Docs: https://docs.docker.com
 Main PID: 9004 (dockerd)
    Tasks: 17
   CGroup: /system.slice/docker.service
           └─9004 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
root@master:~#

alaudazzi · January 2021

Here is the most recent command history:

root@master:~# history
    1  apt-get update && apt-get upgrade -y
    2  apt-get install -y vim
    3  vim
    4  apt-get install -y docker.io
    5  pwd
    6  ls
    7  vim /etc/apt/source.list.d/kubernetes.list
    8  vim /etc/apt/sources.list.d/kubernetes.list
    9  curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
   10  pwd
   11  ls
   12  apt-get update
   13  apt-get install -y kubeadm=1.18.1-00 kubelet=1.18.1-00 kubectl=1.18.1-00
   14  apt-mark hold kubelet kubeadm kubectl
   15  wget https://docs.projectcalico.org/manifests/calico.yaml
   16  less
   17  less calico.yaml
   18  ip addr show
   19  vim /etc/hosts
   20  vim kubeadm-config.yaml
   21  kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
   22  less calico.yaml
   23  kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
   24  vi kubeadm-init.out
   25  docker ps -a | grep kube | grep -v pause
   26  systemctl status kubelet
   27  journalctl -xeu kubelet
   28  vim /etc/hosts
   29  apt-get update
   30  less calico.yaml
   31  ip addr show
   32  vim /etc/hosts
   33  vim kubeadm-config.yaml 
   34  kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
   35  wget https://docs.projectcalico.org/manifests/calico.yaml
   36  less calico.yaml
   37  kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
   38  docker ps
   39  docker -v
   40  sudo systemctl status docker
   41  history > history.out
   42  history
root@master:~#

alaudazzi · January 2021

I fixed the problem by running kubeadm reset.
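
For anyone landing here with the same preflight errors, the recovery can be sketched as a reset followed by the same init command used earlier in this thread. The `run` wrapper and the `DRY_RUN` variable are illustrative additions, not kubeadm features: with `DRY_RUN=1` (the default in this sketch) the commands are only printed, so the sequence can be reviewed before anything is changed on the node.

```shell
# Sketch of the recovery sequence. DRY_RUN and run() are illustrative helpers.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"          # only show what would be executed
  else
    "$@"
  fi
}

reinit_master() {
  # Undo what the earlier, failed kubeadm init left on the node.
  run kubeadm reset
  # Initialize again with the same config file and flags as before.
  run sh -c 'kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out'
}

# Example: review first, then run for real as root.
#   reinit_master
#   DRY_RUN=0 reinit_master
```

Note that `kubeadm reset` asks for confirmation before it proceeds, so a real run stays interactive.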
