Welcome to the Linux Foundation Forum!

lab 3.x

I'm just getting started with the labs and I've hit a bit of trouble right off the bat; I'm not sure which direction to explore for a possible solution.

I'm installing Kubernetes using kubeadm. My infrastructure is AWS-based: I have my own VPC (so the problem might be something in the network setup), and inside the VPC, which is of course accessible from the internet, I have two Ubuntu EC2 instances, a master and a worker.
The security group for each instance has the inbound rules described here:
https://kubernetes.io/docs/setup/independent/install-kubeadm/

I was able to complete lab 3.1 almost to the letter; the only issue I saw was with these commands:
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

I kept getting an error saying sudo: unable to resolve host ip-10-0-..

By this point the master was in Ready state and all pods (including Calico) were running, so I pushed forward.
In lab 3.2 I was able to bootstrap the worker, but once it joined the master one Calico pod went into an error state. Everything else matched the lab description, so I pushed forward again.
I stopped at lab 3.3 because the nginx pod is stuck in ContainerCreating; describing the pod gives back this:

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1m default-scheduler Successfully assigned default/nginx-64f497f8fd-d9pth to ip-10-0-1-111
Warning FailedCreatePodSandBox 10s kubelet, ip-10-0-1-111 Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "8196208e2cf244509e49b6fedc7952042a79197763a1dc751b96a8ce17e4a313" network for pod "nginx-64f497f8fd-d9pth": NetworkPlugin cni failed to set up pod "nginx-64f497f8fd-d9pth_default" network: Unable to retreive ReadyFlag from Backend: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://10.96.232.136:6666 exceeded header timeout
, failed to clean up sandbox container "8196208e2cf244509e49b6fedc7952042a79197763a1dc751b96a8ce17e4a313" network for pod "nginx-64f497f8fd-d9pth": NetworkPlugin cni failed to teardown pod "nginx-64f497f8fd-d9pth_default" network: Unable to retreive ReadyFlag from Backend: client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://10.96.232.136:6666 exceeded header timeout
]
Normal SandboxChanged 9s kubelet, ip-10-0-1-111 Pod sandbox changed, it will be killed and re-created.

The problem seems obvious? The failing Calico pod reports something similar: it's unhappy because of etcd. But installing and/or configuring etcd was not in the labs as far as I can tell, so what am I missing?

Please advise.

Regards,
Naim

Comments

  • Posts: 1,000

    Hello Naim,
    I have not seen this error when working with kubeadm, but I have seen sudo errors on nodes where the current hostname is not in the /etc/hosts file. Did you update the hostname?
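
    For example, a line like this in /etc/hosts is usually enough to clear that sudo error (substitute whatever hostname the error reports; ip-10-0-1-158 here is just an example from an EC2 naming scheme):

    ```
    127.0.0.1 localhost ip-10-0-1-158
    ```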

If the .kube/config file does not have the proper server IP and port listed, the kubectl command won't know where to send API requests.
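
    A quick way to check that, assuming your kubeconfig is in the default location, is to look at the server entry:

    ```shell
    # Print the API server endpoint kubectl is configured to use;
    # it should show the master's IP (or hostname) and port 6443
    grep 'server:' "$HOME/.kube/config" 2>/dev/null || echo "no kubeconfig found"
    ```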

    Regards,

I've seen very small issues cause big problems, so let's explore that. My master host seems to be called ip-10-0-1-158.
    Currently my /etc/hosts looks like this:

    127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts

    ::1 ip6-localhost ip6-loopback
    fe00::0 ip6-localnet
    ff00::0 ip6-mcastprefix
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters
    ff02::3 ip6-allhosts

Are you suggesting I add my hostname like so?
    ip-10-0-1-158 localhost

  • Posts: 1,000

    Well,
    That looks just like my /etc/hosts file as well, without the inclusion of the specific hostname. So it must be something else.

You logged into the node and then used sudo -i to become root? Did that work prior to running kubeadm? If it worked before but fails after you exit back to a non-root user, that would be quite strange.

What IP address did you use when you ran **kubeadm init**? Perhaps there is a conflict between Calico and the local node?

    Regards,

  • Hi Naim,
    I see a timeout on port 6666, which is not included in the ports section at "Installing kubeadm". Since SGs act as firewalls, can you try opening your SG to all traffic? Not a best practice, I know, but for the purpose of completing these labs it may help.
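
    To confirm it's the firewall before changing anything, you could test the endpoint from the worker node. A rough check, using the service IP and port from your error output:

    ```shell
    # Try a TCP connection to the calico-etcd service endpoint from the error
    # (10.96.232.136:6666); give up after 5 seconds
    timeout 5 bash -c 'cat < /dev/null > /dev/tcp/10.96.232.136/6666' \
      && echo "port 6666 reachable" || echo "port 6666 blocked"
    ```

    If it reports blocked from the worker but reachable from the master, the SG rules are the likely culprit.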
    Regards,
    -Chris

  • Posts: 3
    edited September 2018

Incredible... Chris, it was the port thing! As soon as I opened it on both the master SG and the worker SG, the nginx pod came up and is running.
    Thank you so much; I don't know how I didn't think of that myself. I was more focused on the etcd error, as it struck me as more significant.

    I'm still new here, but this can be marked as resolved.

  • Glad to hear it got resolved and it works now!
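
    One follow-up note: once the lab works, you can narrow the SG again. Judging from the endpoint in your error, an inbound rule covering the manifest's etcd would look something like this (the port range and CIDR are assumptions based on this thread, so adjust them to your manifest and VPC):

    ```
    Type: Custom TCP   Port range: 6666-6667   Source: 10.0.0.0/16 (your VPC CIDR)
    ```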
    -Chris

