Problem with k8sMaster.sh script

guglielmino · December 2018

Hi all,

I'm trying to set up an Ubuntu 16.04 VM (in VirtualBox) using the k8sMaster.sh script, as described in the LAB 2.1 document.
I always get:
unable to recognize "calico.yaml": Get https://10.0.2.15:6443/api?timeout=32s: dial tcp 10.0.2.15:6443: connect: connection refused

The firewall is disabled, and when I check port 6443 I don't see anything listening (I would expect the Kubernetes API server).

Can someone help me?

Thanks
Fabrizio
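
A minimal sketch of the port check described above, assuming a Linux host where `ss` (from iproute2) is installed. The helper name `port_is_listening` is hypothetical. It only inspects `ss -tln` style output, so it can confirm that nothing is bound to 6443 but not why.

```shell
# Hypothetical helper: reads `ss -tln` style output on stdin and succeeds
# if any socket's local address ends in ":<port>".
port_is_listening() {
  awk -v p=":$1" '
    # Field 4 of a data row is "Local Address:Port"; row 1 is the header.
    NR > 1 && substr($4, length($4) - length(p) + 1) == p { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Assumed usage on the master node:
#   ss -tln | port_is_listening 6443 \
#     && echo "something is listening on 6443" \
#     || echo "nothing is listening on 6443"
```

If nothing is listening, the output of `systemctl status kubelet` and `docker ps` on the master may show whether the control plane containers started at all.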

guglielmino · December 2018

Hello,

to get these errors I installed Ubuntu 16.04 on VirtualBox, then connected to the machine and launched k8sMaster.sh. I took a look at the script, and what it does is quite simple: it installs Docker and then kubeadm, kubectl and kubelet. It installed everything, but then I got the errors described above.

Thanks
Fabrizio

chrispokorni · December 2018

Getting the VirtualBox network settings right, so that the nodes can talk to each other and to the internet, may be a bit tricky. Also make sure each VirtualBox VM instance is open to all traffic from all sources, on all ports and all protocols. You already mentioned that your Ubuntu firewall is disabled, so you should be good there.
Regards,
-Chris

guglielmino · December 2018

Yes, I just figured out that it could be a time-consuming task. Since I'm not interested in troubleshooting VirtualBox, I've signed up for GCE with the $300 free credit. What I'm experiencing now are some problems because the scripts are made for a self-deployed environment. For example, I think you can't deploy calico.yaml as is in GCE ...

chrispokorni · December 2018

Calico works fine in GCE. We are not using Google's container/Kubernetes engines, so we can deploy any networking schema we want on the GCE VM instances. The same rules as before apply, though: Ubuntu firewall disabled/inactive, and the Google project-level or VPC-level firewall open to all traffic (all ports, protocols, sources).
Regards,
-Chris

guglielmino · December 2018

Thanks. I tried to deploy Calico on GCE and it works, but what I don't understand is how it works in detail.
I'm trying to set up LAB 2 of the course, and my cluster is currently in this state:

    NAMESPACE     NAME                                                    READY     STATUS    RESTARTS   AGE
    default       basicpod                                                1/1       Running   0          8m
    kube-system   calico-kube-controllers-764d76f647-5fnj7                1/1       Running   0          9m
    kube-system   calico-node-vertical-autoscaler-8b959b949-bbgdv         1/1       Running   0          1h
    kube-system   calico-typha-5c9fbf65f8-v8mll                           1/1       Running   0          1h
    kube-system   calico-typha-horizontal-autoscaler-5545fbd5d6-b9x7l     1/1       Running   0          1h
    kube-system   calico-typha-vertical-autoscaler-54d8f88b84-77dsg       1/1       Running   0          1h
    kube-system   event-exporter-v0.2.3-54f94754f4-2zgrw                  2/2       Running   0          2d
    kube-system   fluentd-gcp-scaler-6d7bbc67c5-z8nh4                     1/1       Running   0          2d
    kube-system   fluentd-gcp-v3.1.0-g57md                                2/2       Running   0          1d
    kube-system   fluentd-gcp-v3.1.0-l2gcm                                2/2       Running   0          2d
    kube-system   heapster-v1.5.3-5f9cfd5669-7mnfc                        3/3       Running   0          1d
    kube-system   kube-dns-788979dc8f-9m9pt                               4/4       Running   0          2d
    kube-system   kube-dns-788979dc8f-j9htn                               4/4       Running   0          1d
    kube-system   kube-dns-autoscaler-79b4b844b9-8czlt                    1/1       Running   0          2d
    kube-system   kube-proxy-gke-kube-training-power-pool-cd2516ab-8h75   1/1       Running   0          2d
    kube-system   kube-proxy-gke-kube-training-power-pool-cd2516ab-fvt0   1/1       Running   0          1d
    kube-system   kubernetes-dashboard-598d75cb96-jjzfv                   1/1       Running   0          2d
    kube-system   l7-default-backend-5d5b9874d5-n7sbr                     1/1       Running   0          2d
    kube-system   metrics-server-v0.2.1-7486f5bd67-cqjpj                  2/2       Running   0          2d

Now, given that I deployed Calico, I would have expected to see basicpod exposed on an IP address reachable with curl from the cluster master shell. If I run kubectl get pod -o wide I get this:

    NAME       READY     STATUS    RESTARTS   AGE       IP           NODE
    basicpod   1/1       Running   0          14m       10.40.1.13   gke-kube-training-power-pool-cd2516ab-8h75

With curl http://10.40.1.13 I'm not able to connect to the pod, and it seems quite reasonable to me.
I'm missing something, I'm sure, but I can't figure out what.

Thanks
Fabrizio
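
The check described above (look up the pod IP, then curl it) can be sketched as follows. The helper name `pod_ip` is hypothetical, and the usage lines assume a shell with kubectl configured for the cluster. The helper only parses the table format shown in the post.

```shell
# Hypothetical helper: reads `kubectl get pod -o wide` output on stdin and
# prints the value of the IP column for the named pod.
pod_ip() {
  awk -v name="$1" '
    # Locate the IP column from the header row instead of hard-coding it.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "IP") col = i; next }
    $1 == name && col { print $col }
  '
}

# Assumed usage:
#   ip="$(kubectl get pod -o wide | pod_ip basicpod)"
#   curl --max-time 5 "http://${ip}"
#
# kubectl can also print the field directly:
#   kubectl get pod basicpod -o jsonpath='{.status.podIP}'
```

Running the same curl from each node in turn shows whether the pod is reachable only from the node that hosts it.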

chrispokorni · December 2018

Hi Fabrizio,
Are you using GKE (Google Kubernetes Engine) nodes? I am assuming this based on your outputs. Using Google's Kubernetes nodes may produce different results/outputs from the ones presented in the Labs. The Labs have been completed on GCE (Google Compute Engine) VM instances running Ubuntu 16.04 LTS, with Docker, Kubernetes, Calico, etc. installed from scratch via the k8sMaster.sh script.
Regards,
-Chris

guglielmino · December 2018

Hi Chris,
thank you for your answer. Yes, I'm using GKE. My goal in this course is to learn Kubernetes from the developer's standpoint, so I'm not that interested in spending time setting up a cluster by myself (hence the use of GKE). Do you think it's better to start from scratch for this course?

Thanks
Fabrizio

chrispokorni · December 2018

Hi Fabrizio,
Installation and cluster setup steps have been scripted for the very same reason: this course is intended for Developers and they should not be wasting time setting up a cluster. This course focuses on vendor-neutral Kubernetes, and that's why we are using the community-backed installation process, which can be performed either on a cloud provider (GCP, AWS, Azure) or locally (VirtualBox, VMware). Clearly each environment requires some tweaking, but generally, the vendor-neutral installation process will be similar.

Kubernetes changes rapidly and unfortunately what works today may not necessarily work tomorrow. With that in mind, the labs as they are presented, with commands and outputs, have been tested in a vendor-neutral configuration on Kubernetes 1.12.1. Any major change from this setup, such as a vendor's flavor of Kubernetes, causes other changes down the line: commands need to be tweaked to match the vendor's environment, and outputs will be different. There is no specific reason why GCE VM instances have been used. We could have used AWS EC2 instances or VirtualBox VMs, or any environment that could have provided us with clean and simple VM instances to install everything from scratch: Ubuntu OS, Docker, Kubernetes, etc. Once properly set up with Ubuntu 16.04 LTS and all firewalls open, the install scripts "should" work without any issues. "Should", because it seems that when 1.13.0 was released, some of its code affected 1.12.1 as well.

After the master script and the second script have completed, issuing "kubeadm join" on the worker/second node causes a permission error, and the worker node does not join the cluster. While it may be easy to just install 1.13 instead, the labs have not been tested with 1.13.

I posted a solution for the permission issue during "kubeadm join" on 1.12.1. It requires some tasks to be completed manually: not Developer-friendly, but most definitely doable.

Here is the solution to fix the permission issue:
https://github.com/chris-pok/k8s-1.12.1.git

Good luck!
-Chris

guglielmino · December 2018

Thank you both for the answers. What still isn't clear to me is whether GKE is a good fit for this course, or whether it's better to deploy the cluster with the provided scripts (even inside Google Cloud, but using VMs).

Thanks

guglielmino · December 2018

I ended up creating two VMs on Google Cloud (Ubuntu 16.04) and setting up the cluster as described in LAB 2.1.
All works just fine, but following the lab steps I'm not able to establish a connection to basicpod from the master.

At some point in the lab, basicpod is deployed with containerPort set to 80, which exposes the nginx web server. To test it I ran kubectl get pod -o wide to get the right IP address and then curl http://{ip_address} to read the exposed data. It doesn't work from the master, but if I do the same while connected to the minion, it works. This suggests to me that I have some problem with the cluster network configuration but, as far as I understand, this is managed by the Calico project, which is correctly deployed.

chrispokorni · December 2018

Hi, if your pod runs on the minion and you cannot curl to it from the master, there may be a networking issue between your nodes.

guglielmino · December 2018

@chrispokorni said:
Hi, if your pod runs on the minion and you cannot curl to it from the master, there may be a networking issue between your nodes.

Yes, that's what I wrote.
I stopped the firewall on both the master and the node, and Calico is installed, so I'm asking for suggestions on what to check next.

Thanks
Fabrizio

chrispokorni · December 2018

@guglielmino
Calico only helps with pod networking, not with node networking. Node networking is handled by Google Cloud.
Before creating your GCE nodes, did you set up a VPC? Do you have a firewall rule created to allow all traffic: to all ports, all protocols, from all sources? Is this firewall rule associated with the VPC? Are your nodes on the VPC network?
By default, firewall rules in Google Cloud do not allow all traffic, blocking some that may be critical for Kubernetes' functionality.
Regards,
-Chris
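
A hedged sketch of what such a rule might look like with the gcloud CLI, for a throwaway lab network only. The helper name, the rule name `k8s-lab-allow-all` and the network name `k8s-lab-vpc` are hypothetical placeholders. The protocol list (TCP, UDP, ICMP, plus IP protocol number 4 for IP-in-IP) is an approximation of "all traffic", and including protocol 4 assumes Calico is using IP-in-IP encapsulation between nodes. The helper only prints the command so it can be reviewed before anything is changed.

```shell
# Hypothetical helper: prints (does not run) a gcloud command that opens a
# VPC network to TCP, UDP, ICMP and IP protocol 4 from any source address.
# $1 = firewall rule name, $2 = VPC network name (both chosen by the caller).
allow_all_rule_cmd() {
  printf 'gcloud compute firewall-rules create %s --network %s --allow tcp,udp,icmp,4 --source-ranges 0.0.0.0/0\n' \
    "$1" "$2"
}

# Print the command for the placeholder names; run it manually only if it
# matches your project and network.
allow_all_rule_cmd k8s-lab-allow-all k8s-lab-vpc
```

Existing rules on the network can be listed with `gcloud compute firewall-rules list` to see what is already allowed.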

guglielmino · December 2018

@chrispokorni said:
@guglielmino
Calico only helps with pod networking, not with node networking. Node networking is handled by Google Cloud.
Before creating your GCE nodes, did you set up a VPC? Do you have a firewall rule created to allow all traffic: to all ports, all protocols, from all sources? Is this firewall rule associated with the VPC? Are your nodes on the VPC network?
By default, firewall rules in Google Cloud do not allow all traffic, blocking some that may be critical for Kubernetes' functionality.
Regards,
-Chris

Problem solved! I had disabled the firewall inside the VM, but I also needed to open the Google Cloud firewall: it was blocking some ports, which caused my problems.

Thank you
Fabrizio
