Kubernetes Deployment From Scratch - The Ultimate Guide (Part 1)

Konrad Rotkiewicz
18 October 2018 · 12 min read

Have you ever wondered how Kubernetes works inside? How much magic and how many unicorns are hidden there? Let's try to build a small, not-production-ready cluster from scratch, so that we can learn about Kubernetes internals.

This article is based on the famous Kubernetes The Hard Way repo created by the amazing Kelsey Hightower. It aims to explain how Kubernetes works inside, not how to prepare a production-ready cluster.

What do you need before creating a Kubernetes cluster from scratch?

First of all, we need to create a DigitalOcean account, then install and set up the doctl CLI. If you're familiar with any other cloud provider, feel free to use it instead. This is our first step in creating Kubernetes from scratch.
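If you go with DigitalOcean, the setup boils down to authenticating doctl and finding the fingerprint of an SSH key you've already uploaded, roughly like this:

$ doctl auth init                 # paste your DigitalOcean API token when prompted
$ doctl compute ssh-key list      # shows the ID, name and fingerprint of your keys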

Cloud

Before we start we need some nodes to work on. Let's start with one node; replace the --ssh-keys option value with your own SSH key fingerprint:

$ doctl compute droplet create k8s-master --region fra1 --size 2gb --image ubuntu-18-04-x64 --enable-private-networking --ssh-keys 79:29:54:77:13:2f:9c:b8:06:3e:8b:fe:8d:c0:d7:ba
$ doctl compute droplet list
ID          Name          Public IPv4        Private IPv4     Public IPv6    Memory    VCPUs    Disk    Region    Image                      Status    Tags
63370004    k8s-master    46.101.177.76      10.135.53.41                    2048      2        40      fra1      Ubuntu 18.04 x64           active
$ ssh root@46.101.177.76

What is the kubelet in a Kubernetes cluster?

This is the first and most important component in Kubernetes. The kubelet's responsibility is to spawn and kill pods and containers on its node. It communicates directly with the Docker daemon, so we need to install Docker first.

root@k8s-master:~$ apt-get update && apt-get install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common
    
root@k8s-master:~$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
    
root@k8s-master:~$ add-apt-repository \
    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) \
    stable"
   
root@k8s-master:~$ apt-get update && apt-get install -y docker-ce
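Before moving on, it's worth checking that the Docker daemon is actually up; systemctl should report it as active:

root@k8s-master:~$ systemctl is-active docker
active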

Now let's download the Kubernetes binaries and run the kubelet.

root@k8s-master:~$ wget -q --show-progress https://dl.k8s.io/v1.17.3/kubernetes-server-linux-amd64.tar.gz
kubernetes-server-linux-amd64.tar.gz                     100%[==================================================================================================================================>] 417.16M  83.0MB/s    in 5.1s
root@k8s-master:~$ tar xzf kubernetes-server-linux-amd64.tar.gz
root@k8s-master:~$ mv kubernetes/server/bin/* /usr/local/bin/
root@k8s-master:~$ rm -rf kubernetes kubernetes-server-linux-amd64.tar.gz
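A quick sanity check that the binaries landed in our PATH; the version should match the tarball we downloaded:

root@k8s-master:~$ kubelet --version
Kubernetes v1.17.3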

First of all, we have to create a configuration file for the kubelet. Create the directory with mkdir -p /var/lib/kubelet, then open /var/lib/kubelet/config.yaml and write:

apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: true
  webhook:
    enabled: false
authorization:
  mode: AlwaysAllow
kind: KubeletConfiguration
staticPodPath: /etc/kubernetes/manifests

This is the most basic kubelet configuration: we allow anonymous authentication and effectively turn off authorization (AlwaysAllow). With staticPodPath we set the directory that the kubelet will watch for pod manifest YAML files.
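It's also worth making sure the static pod directory exists before the kubelet starts watching it:

root@k8s-master:~$ mkdir -p /etc/kubernetes/manifests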

We run the kubelet with the --config argument.

root@k8s-master:~$ kubelet --config=/var/lib/kubelet/config.yaml &> /tmp/kubelet.log &

Let's put a simple nginx pod manifest file into the /etc/kubernetes/manifests directory and see what happens.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
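Save this as /etc/kubernetes/manifests/nginx.yaml. Within a few seconds the kubelet should notice the file and ask Docker to start the containers; you can follow along in the log:

root@k8s-master:~$ tail -f /tmp/kubelet.log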

Now we can run docker ps to see that our containers have been started and try to curl the pod:

root@node:~$ docker ps
CONTAINER ID        IMAGE                                                                           COMMAND                  CREATED             STATUS              PORTS               NAMES
c3369c72ebb2        nginx@sha256:aa1c5b5f864508ef5ad472c45c8d3b6ba34e5c0fb34aaea24acf4b0cee33187e   "nginx -g 'daemon off"   3 minutes ago       Up 3 minutes                            k8s_nginx_nginx-node_default_594710e736bc86ef2c87ea5615da08b1_0
b603d65d8bfd        gcr.io/google_containers/pause-amd64:3.0                                        "/pause"                 3 minutes ago       Up 3 minutes                            k8s_POD_nginx-node_default_594710e736bc86ef2c87ea5615da08b1_0

root@node:~$ docker inspect b603d65d8bfd | jq .[0].NetworkSettings.IPAddress
"172.17.0.2"
root@node:~$ curl 172.17.0.2
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

b603d65d8bfd is the ID of the pause container. This is an infrastructure container that Kubernetes creates first when creating a pod. Using the pause container, Kubernetes acquires the pod's IP and sets up its network namespace; all other containers in the pod share the same IP address and network interface. Even when all your containers die, this is the container that keeps holding the whole network namespace.
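We can verify this namespace sharing directly with Docker: inspecting the nginx container (IDs taken from the docker ps listing above) should show that it joins the pause container's network namespace instead of getting its own:

root@node:~$ docker inspect c3369c72ebb2 --format '{{.HostConfig.NetworkMode}}'
container:b603d65d8bfd...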

This is how our node looks now:

[Diagram: the node so far, with the kubelet talking to the Docker daemon]

Kube API server

Kubernetes uses etcd, a distributed key-value store with a strongly consistent data model, to store the state of the whole cluster. The API Server is the only component that talks to etcd directly; all other components (including the kubelet) have to communicate through the API Server. Let's try to run the API Server alongside the kubelet.

First we need etcd:

root@k8s-master:~$ wget -q --show-progress https://storage.googleapis.com/etcd/v3.4.3/etcd-v3.4.3-linux-amd64.tar.gz
etcd-v3.4.3-linux-amd64.tar.gz                           100%[==================================================================================================================================>]   9.70M  2.39MB/s    in 4.1s
root@k8s-master:~$ tar xzf etcd-v3.4.3-linux-amd64.tar.gz
root@k8s-master:~$ mv etcd-v3.4.3-linux-amd64/etcd* /usr/local/bin/
root@k8s-master:~$ etcd \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://localhost:2379 \
  --enable-v2 \
  &> /tmp/etcd.log &
root@k8s-master:~$ etcdctl endpoint health
127.0.0.1:2379 is healthy: successfully committed proposal: took = 6.072508ms

And the API Server:

root@k8s-master:~$ kube-apiserver \
  --allow-privileged=true \
  --etcd-servers=http://localhost:2379 \
  --service-cluster-ip-range=10.0.0.0/16 \
  --bind-address=0.0.0.0 \
  --insecure-bind-address=0.0.0.0 \
  --disable-admission-plugins=ServiceAccount \
  --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \
  &> /tmp/apiserver.log &
root@k8s-master:~$ curl http://localhost:8080/api/v1/nodes
{
  "kind": "NodeList",
  "apiVersion": "v1",
  "metadata": {
    "selfLink": "/api/v1/nodes",
    "resourceVersion": "45"
  },
  "items": []
}
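Everything the API server persists ends up in etcd under the /registry prefix. The values are mostly binary protobuf, so listing just the keys is the readable way to peek at what's stored:

root@k8s-master:~$ etcdctl get --prefix --keys-only /registry/ | head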

In our minimalistic Kubernetes cluster we don't really care about authentication and authorization, so we have to explicitly disable the ServiceAccount admission plugin to be able to create Kubernetes objects (Pods, Deployments, etc.). Additionally, we will need the --kubelet-preferred-address-types option later on, when we add a second node. By default the API server tries to resolve a node's address by its hostname, and since we don't have DNS set up, that would prevent us from checking logs and executing commands in pods scheduled on a different node.

WARNING: Do not try this in production. Setting --insecure-bind-address to 0.0.0.0 allows external clients to access the API server, bypassing authentication and authorization.

Now we can connect the kubelet to the API Server and check whether it gets discovered by the cluster.

First of all, we need to create a kubeconfig file. Kubeconfigs are configuration files managing access to the cluster; they store information such as the API server address and user credentials. (kubectl itself will work here without one, since it falls back to http://localhost:8080 when no config is present, but the kubelet needs an explicit file.)

root@k8s-master:~$ kubectl config set-cluster kubernetes \
  --server=http://localhost:8080 \
  --kubeconfig=kubelet.conf
root@k8s-master:~$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=system:node:k8s-master \
  --kubeconfig=kubelet.conf
root@k8s-master:~$ kubectl config use-context default --kubeconfig=kubelet.conf
root@k8s-master:~$ mv kubelet.conf /etc/kubernetes
root@k8s-master:~$ pkill -f kubelet
root@k8s-master:~$ kubelet \
  --config=/var/lib/kubelet/config.yaml \
  --kubeconfig=/etc/kubernetes/kubelet.conf \
  &> /tmp/kubelet.log &
root@k8s-master:~$ kubectl get nodes
NAME        STATUS    AGE       VERSION
k8s-master  Ready     5m        v1.17.3
root@k8s-master:~$ kubectl get pods
NAME               READY   STATUS    RESTARTS   AGE
nginx-k8s-master   1/1     Running   0          26s

We still have the nginx pod from the file in /etc/kubernetes/manifests, so let's move the file out of that directory and create the pod manually with kubectl.

root@k8s-master:~$ mv /etc/kubernetes/manifests/nginx.yaml .
root@k8s-master:~$ kubectl create -f nginx.yaml
pod "nginx" created
root@k8s-master:~$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
nginx     0/1       Pending   0          6m

Notice that the pod hangs in the Pending status. Why? Because we don't yet have the Kubernetes component responsible for choosing a node for a pod: the scheduler. We will talk about it later, but for now we can just create nginx2 from an updated manifest that specifies which node to use.

diff --git a/nginx.yaml b/nginx2.yaml
index 7053af0..36885ae 100644
--- a/nginx.yaml
+++ b/nginx2.yaml
@@ -1,10 +1,11 @@
 apiVersion: v1
 kind: Pod
 metadata:
-  name: nginx
+  name: nginx2
   labels:
     app: nginx
 spec:
+  nodeName: k8s-master
   containers:
   - name: nginx
     image: nginx
root@k8s-master:~$ kubectl create -f nginx2.yaml
root@k8s-master:~$ kubectl get pod
NAME      READY     STATUS    RESTARTS   AGE
nginx     0/1       Pending   0          10m
nginx2    1/1       Running   0          8s

Great, now we can see that the API Server and the kubelet work together. This is how our node looks now:

[Diagram: the node with the kubelet, API Server and etcd]

What is the kube-scheduler in a Kubernetes cluster?

The scheduler is responsible for assigning pods to nodes. It watches pods and picks an available node for those without one.

We still have the nginx pod in the Pending state from the previous example. Let's run the scheduler and see what happens. Note that we could also create a kubeconfig file (and in production we should), as we did for the kubelet; instead, we will only pass the --master option to specify the API server address. Since we don't have the controller manager running yet, we also have to manually remove the not-ready taint from our node, which would otherwise prevent the scheduler from placing pods there.

root@k8s-master:~$ kubectl taint node k8s-master node.kubernetes.io/not-ready-
node/k8s-master untainted
root@k8s-master:~$ kube-scheduler --master=http://localhost:8080 &> /tmp/scheduler.log &
root@k8s-master:~$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
nginx     1/1       Running   0          17m
nginx2    1/1       Running   0          17m
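If you're curious, you can double-check that the taint is really gone:

root@k8s-master:~$ kubectl describe node k8s-master | grep Taints
Taints:             <none>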

As the pod listing shows, the scheduler kicked in, found the pending pod and assigned it to the node. You can see its placement on our node schema:

[Diagram: the node with the scheduler added]

Kube Controller Manager and its role in Kubernetes deployment

The Controller Manager is responsible for managing (among other things) Replication Controllers and ReplicaSets, so without it we can't use Kubernetes Deployments.
Here we are going to run it and create a deployment.

First, define the deployment manifest file nginx-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80

Then create the deployment and check what has happened:

root@k8s-master:~$ kubectl apply -f nginx-deployment.yaml
root@k8s-master:~$ kubectl get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   0/1     0            0           3m18s

As you can see, we have the deployment, but its pod has not been created. This is where we need kube-controller-manager. As mentioned before, one of its main tasks is to watch for changes made to the cluster and keep it in the desired state. Let's start it:

root@k8s-master:~$ kube-controller-manager \
  --master=http://localhost:8080 \
  &> /tmp/kube-controller-manager.log &
root@k8s-master:~$ kubectl apply -f nginx-deployment.yaml
root@k8s-master:~$ kubectl get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1/1     1            1           19m
root@k8s-master:~$ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-85ff79dd56-89bg4   1/1     Running   0          4m9s
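Under the hood the Deployment created a ReplicaSet, which in turn created the pod (that's where the hash in the pod name comes from); kubectl should show something like:

root@k8s-master:~$ kubectl get replicaset
NAME                          DESIRED   CURRENT   READY   AGE
nginx-deployment-85ff79dd56   1         1         1       4m9s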

Updated version of our node scheme:

[Diagram: the node with the controller manager added]

Scaling the deployment

One of the main features of deployments is the ability to scale. We can at any moment set the number of pods in the deployment. Let's see an example:

root@k8s-master:~$ kubectl scale deployment nginx-deployment --replicas=3
deployment.apps/nginx-deployment scaled
root@k8s-master:~$ kubectl get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   3/3     3            3           28m
root@k8s-master:~$ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-85ff79dd56-89bg4   1/1     Running   0          9m7s
nginx-deployment-85ff79dd56-cfndz   1/1     Running   0          26s
nginx-deployment-85ff79dd56-rljtl   1/1     Running   0          26s

The deployment controller automatically created 2 additional pods for us.
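The same effect can be achieved declaratively, which is how you'd normally do it with manifests kept under version control: bump replicas in the file and apply it again.

root@k8s-master:~$ sed -i 's/replicas: 1/replicas: 3/' nginx-deployment.yaml
root@k8s-master:~$ kubectl apply -f nginx-deployment.yaml
deployment.apps/nginx-deployment configured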

What is Kubernetes proxy?

The Kubernetes (network) proxy is responsible for implementing Kubernetes Services, and thus for internal load balancing and for exposing pods to other pods and to external clients. Let's define a NodePort Service for our deployment in nginx-svc.yaml:

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    nodePort: 30073
  selector:
    app: nginx
Now start kube-proxy and create the Service:

root@k8s-master:~$ kube-proxy --master=http://localhost:8080 &> /tmp/proxy.log &
root@k8s-master:~$ kubectl create -f nginx-svc.yaml
service "nginx" created
root@k8s-master:~$ kubectl get svc
NAME         CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   10.0.0.1       <none>        443/TCP        2h
nginx        10.0.167.201   <nodes>       80:30073/TCP   7s

The nginx deployment is now exposed externally on port 30073; we can check that with curl.

$ doctl compute droplet list
ID          Name          Public IPv4        Private IPv4     Public IPv6    Memory    VCPUs    Disk    Region    Image                      Status    Tags
63370004    k8s-master    46.101.177.76      10.135.53.41                    2048      2        40      fra1      Ubuntu 18.04 x64           active
$ curl http://46.101.177.76:30073
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
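Under the hood, kube-proxy in its default iptables mode doesn't proxy traffic itself; it programs NAT rules that redirect anything arriving at the node port to one of the service's pods. You can peek at the rules it installed:

root@k8s-master:~$ iptables -t nat -L KUBE-NODEPORTS -n | grep 30073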

Our updated node schema:

[Diagram: the node with kube-proxy added]

Wrapping up on Kubernetes from scratch

We have now gone through the steps of building Kubernetes from scratch. We end up with something we can't really call a cluster yet, but we have learned how the Kubernetes components work together. What is really astonishing is how well designed and decoupled the Kubernetes parts are. Once you understand each part's role in the system, Kubernetes should no longer be a mystery.
In the next blog post I will describe how to add more nodes to our Kubernetes cluster and load balance ingress traffic between them, so stay tuned!

Related blogposts:

Kubernetes From Scratch Part 2 – Networking

Convox As A Solid Kubernetes Alternative

Kubernetes Federation With Google Global Load Balancer
