Kubernetes Complete Mastery course

Kubernetes Production grade container orchestration

Kubernetes Beginners to Advanced

Kubernetes Architecture

  1. Master nodes
  2. Worker/Slave nodes

Master Node

It is responsible for the management of the Kubernetes cluster.

The master node has various components such as the Controller Manager, Scheduler, etcd and API Server.

API Server

The API server has a RESTful interface, which means that many different tools and libraries can readily communicate with it.

It is the entrypoint for the tools and libraries to communicate.

Controller Manager

It regulates the state of the cluster.

A replication controller ensures that the number of replicas (identical copies) defined for a pod matches the number currently deployed on the cluster.

This can involve scaling an application up or down etc.

Scheduler

The scheduler is the process that actually assigns workloads to nodes in the cluster.

The scheduler tracks the resource capacity of the nodes.

ETCD

etcd is a distributed key-value store. It's mainly used for shared configuration and service discovery; in Kubernetes it holds the configuration and state of the cluster.

Worker Nodes

The worker node has various components like Kubelet, Kube-proxy, and Pods.

Kubelet

Kubelet gets the configuration of a Pod from the API server and ensures that the containers it describes are up and running.

Kube-proxy

Kube-proxy acts as a network proxy and load balancer. It runs on each worker node and watches the API server for the creation and deletion of Service endpoints. It is also involved in setting up the routes used to reach a service.

Pods

A pod is one or more containers that logically run together on nodes.

Pro Tip: If you want to work in a dev environment, you can install minikube. For production, there are managed solutions such as Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE). All of these are managed services provided by the respective cloud vendors.

What is a pod?

A Pod is a grouping of one or more containers that operate together.

Namespace

Namespaces are used to isolate sets of resources from one another within a Kubernetes cluster.

Don't worry if you don't understand this yet; we will see everything in detail as you read below.

ConfigMap

A ConfigMap holds a set of configuration values that other Kubernetes objects, such as Pods, can use.
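As a rough illustration, a ConfigMap and a Pod that consumes it might look like the sketch below. The object names, keys and values are hypothetical and chosen only for this example.

```yaml
# Hypothetical ConfigMap; the name and keys are illustrative only.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  APP_MODE: "development"
---
# A Pod that loads every key of the ConfigMap as an environment variable.
apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
    - name: app
      image: nginx
      envFrom:
        - configMapRef:
            name: app-config   # must match the ConfigMap name above
```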

Why do we need services in kubernetes

In the scenario below, you can see that there is a service of type NodePort and "app" containers running inside the pods.

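When a worker node dies, the Pods running on it are lost too. A single Pod can also die on its own. In both cases, any connection made directly to the Pod is lost.

A replacement Pod is created with a new IP address, so you need a service to discover it automatically.

A service also load-balances traffic across the pods and provides service discovery for them.

The service types covered here are ClusterIP, NodePort and LoadBalancer. Ingress is a separate object rather than a service type, but it is described alongside them because it also exposes services.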

LoadBalancer

A LoadBalancer service exposes a service through an external load balancer provisioned by the cloud provider. It is the older way of getting traffic into the cluster.

ClusterIP

ClusterIP allows any object inside our Kubernetes cluster to access the object that the ClusterIP service points at.

It exposes its own set of pods to other objects inside the Kubernetes cluster.

You cannot access a ClusterIP service from the external world, e.g. from a browser.
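A minimal ClusterIP service could be sketched as follows. The service name, the label and the port numbers are assumptions made for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend          # hypothetical service name
spec:
  type: ClusterIP        # also the default when type is omitted
  selector:
    app: backend         # hypothetical label carried by the target pods
  ports:
    - port: 80           # port that other objects in the cluster connect to
      targetPort: 8080   # port the container is assumed to listen on
```

Other pods in the same namespace could then reach the pods behind it at backend:80, while nothing outside the cluster can.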

NodePort

NodePort exposes the service on a fixed port of every node, which makes the containers behind it reachable from the outside world.

Ingress

An Ingress exposes a set of services to the outside world.

You will see examples of each of these as you read below.

Deployment vs Pod vs Service

We will see the difference using the scenario below.

Flow Explained

In the Deployment config file, you specify the apps you are hosting and the containers that run them, along with their specifications.

Specifically, the Deployment keeps the pods alive and running.

The Service object gives a virtual IP (cluster IP) to the pods created by the Deployment object whose labels match the service's selector.

You need to have the service object because the pods from the deployment object can be killed, scaled up and down, and you can't rely on their IP addresses because they will not be persistent.

So you need an object like a service, that gives those pods a stable IP.

Pod and Service Config File

The config files that we write below are used to create objects.

There are many object types, such as Pod, Service, ReplicationController, Deployment and ReplicaSet.

These objects serve purposes such as running a container, setting up networking and replicating containers.

Now, with the YAML file below, if I issue the command

kubectl apply -f <name-of-the-file>

it will create a Pod and a Service object.
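A file of this kind might look like the sketch below. It is a reconstruction under assumptions, not necessarily the exact file used here: the object names, the image and the nodePort value are illustrative, and the layout is arranged so that the line numbers used in the breakdown that follows point at the matching fields.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx              # assumed pod name
  labels:
    name: nginx            # label that the Service selector matches
spec:
  containers:
    - name: nginx
      image: nginx         # assumed image
      ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service      # assumed service name
spec:
  type: NodePort
  ports:
    - port: 8080           # port used inside the cluster
      targetPort: 80       # container port that receives the traffic
      nodePort: 30080      # assumed value within the default node port range
  selector:
    name: nginx
```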

Config File Breakdown

In the above file, I have defined a Pod and a Service in a single YAML file.

In line 1, apiVersion selects the API version, which determines the set of objects we can use. Another apiVersion, "apps/v1", provides a different predefined set of objects.

In line 2, the kind of object that we will create is Pod. When we load the config file with kubectl, it creates a Pod object on a node. A Pod is nothing but a grouping of containers.

In line 15, the Service creates networking in Kubernetes. Here we have set the Service spec type to NodePort.

This NodePort exposes the container to the outside world, so you can access the running container from your web browser. The other service types are ClusterIP and LoadBalancer; an Ingress is a separate object that serves a similar purpose.

In line 25, the selector property is defined with the key-value pair "name: nginx". The selector makes the Service look for pods labeled "name: nginx" and route traffic to them. That is why we defined the label "name: nginx" in line 6.

In line 21, the port property means that another pod that needs to connect to our nginx can reach it through the Service on port 8080.

In line 22, targetPort means that incoming requests are forwarded to port 80 of the container in the Pod.

In line 23, nodePort is the port that exposes the container to the outside world. In the browser you can go to http://<node-ip>:<nodePort> to access the container you are running. If you don't assign a nodePort, one is assigned automatically from the node port range (30000-32767 by default).

Deploying

After that, feed the config file to kubectl using the command below

kubectl apply -f <file-name>

ReplicaSet Vs Replication Controller vs Deployments

As you can see, Deployment is more advanced than the other two, so we are going to concentrate only on Deployment.

Deployment

A Deployment is a Kubernetes object that runs a set of identical pods, i.e. one or more pods.

This can be used in production. A Deployment is the best way to handle High Availability (HA) when compared with ReplicaSets and ReplicationControllers.

Below is an example of a Deployment YAML file.
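A minimal Deployment manifest along these lines might look like the following sketch. The names, the image and the replica count are assumptions chosen for illustration, and the layout is arranged so that the line numbers in the breakdown that follows point at the matching fields.

```yaml
# deployment.yaml (illustrative sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment   # assumed name
spec:
  selector:
    matchLabels:
      app: nginx           # must match the pod template labels below
  replicas: 2              # assumed replica count
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx     # assumed image
          ports:
            - containerPort: 80
```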

Breakdown of Deployment config

In line 2, the API version is "apps/v1".

In line 3, the object type is Deployment.

In line 10, replicas is the number of pods that this Deployment is supposed to create.

In line 11, the template property contains the configuration that is used for every single Pod created by the Deployment object.

In lines 12, 13 and 14, every pod created by the Deployment gets the label "app: nginx". To sum up, the template section is nothing but a Pod template.

Basic commands

kubectl get pods - list the running pods

kubectl get pods -o wide - show additional details, including the IP address of each pod. Every pod that is created is assigned an internal IP address.

kubectl get services - list the running services

kubectl describe <object-type> <object-name> - get detailed information about an object

kubectl delete -f <config-file> - delete the running objects that were created from the given config file

Persistent Volume

In Kubernetes, a Persistent Volume is recommended over a plain Volume for data that must outlive a pod.

The reason is that a persistent volume is not tied to any specific pod and is not deleted when the pod dies.

Persistent volumes are separate from the Pod.

To use these PVs, a user creates PersistentVolumeClaims, which are nothing but requests for Persistent Volumes.

A claim must specify the access mode, the storage capacity, and so on.

Once a claim is created, a PV that satisfies it is automatically assigned (bound) to the claim.

There are two ways Persistent Volumes may be provisioned: statically or dynamically. You will see this below.

Difference between Persistent Volume claim and Persistent Volumes

In the above diagram, StorageClasses use provisioners that are specific to the storage platform or cloud provider to give Kubernetes access to the physical media (hard disk) being used.

A PersistentVolumeClaim (PVC) is referenced from a Pod config. Kubernetes sees the PVC and checks whether a statically provisioned persistent volume is available, or whether one can be dynamically provisioned, to fulfill the requirements of the claim.

If you have a default Storage Class or you specify which storage class to use when creating a PVC, PV creation is automatic.

Provisioning

There are two ways Persistent Volumes may be provisioned: statically or dynamically.

Dynamic provisioning

When you're running minikube on your local machine, there is a default StorageClass set up.

This will dynamically provision new storage by allocating a portion of your hard drive.

On cloud providers, there are many more options for storage - that is where we can optionally define new storage classes.

When none of the static PVs the administrator created matches a user’s PersistentVolumeClaim, the cluster may try to dynamically provision a volume specially for the PVC. This provisioning is based on StorageClasses.

Below is the sample flow of dynamic provisioning

Sample flow for dynamic provisioning of file storage with the predefined standard storage class

Workflow Breakdown

In the 1st flow, you can see that claimName is "pvc"; this claim refers to the 2nd flow.

In the 2nd flow there is accessModes. The access mode defines how the volume can be mounted:

  1. ReadWriteOnce - Mount the volume as read-write by a single node
  2. ReadOnlyMany - Mount the volume as read-only by many nodes
  3. ReadWriteMany - Mount the volume as read-write by many nodes

In the 2nd flow, the storageClassName is "standard".

StorageClass allows for dynamic provisioning of PersistentVolumes.

If you do not specify this parameter, the default storage class of the Kubernetes cluster is used.

This default StorageClass is then used to dynamically provision storage for PersistentVolumeClaims that do not require any specific storage class.

You can list the StorageClasses in your cluster with the command below

kubectl get storageclass

In 3rd flow, you can see that StorageClass contains the fields like provisioner (aka volume-plugin), parameters, and reclaimPolicy, which are used when a PersistentVolume belonging to the class has to be dynamically provisioned.

In the 3rd flow, the type parameter is gp2, which refers to the General Purpose SSD (gp2) volume type of AWS EBS.
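The three flows described above could be sketched as the manifests below. Only claimName "pvc", storageClassName "standard", the access modes and the gp2 type come from the text; the object names, mount path, requested size, provisioner and reclaim policy are assumptions for illustration.

```yaml
# 1st flow: a Pod that mounts the claim named "pvc".
apiVersion: v1
kind: Pod
metadata:
  name: web                                 # hypothetical name
spec:
  containers:
    - name: web
      image: nginx                          # assumed image
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html  # hypothetical mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc
---
# 2nd flow: the claim, asking the "standard" class for storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 1Gi                          # assumed size
---
# 3rd flow: the StorageClass used to provision the volume on AWS.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs          # assumed in-tree AWS EBS provisioner
parameters:
  type: gp2
reclaimPolicy: Delete                       # assumed policy
```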

Static Provisioning

A cluster admin creates a number of PVs.

They are ready-made pieces of real storage that are available for use by cluster users.

Sample flow for static provisioning of file storage with the predefined standard storage class

In the static way, we create a PV manually and attach a PVC to it, skipping the storage classes.

As you can see, the PVs are created first, and no storage class is defined in the PVC. Now you can use that PVC in a Pod config or a Deployment.

Once the Deployment is up, all the contents of the mounted container directory are stored on the persistent volume bound to the claim.
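A static setup might be sketched as follows. The names, the size and the hostPath location are hypothetical. The claim sets storageClassName to an empty string, which is one way to make sure a cluster with a default StorageClass does not dynamically provision a new volume instead of binding to the pre-created PV.

```yaml
# PV created up front by the cluster admin.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static              # hypothetical name
spec:
  capacity:
    storage: 1Gi               # assumed size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data            # hypothetical directory on the node
---
# Claim that binds to a matching pre-created PV.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-static             # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""         # empty string: do not use any storage class
  resources:
    requests:
      storage: 1Gi
```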

Secrets in kubernetes

Secrets are used for storing sensitive data such as passwords, SSH keys and certificates.

The command to create a secret is

kubectl create secret <type-of-secret> <secret-name> --from-literal=key=value

For example

kubectl create secret generic my-secret --from-literal=PASSWORD=foobar123

There are different types of secret: "tls", "docker-registry" and "generic".

  1. docker-registry - Create a secret for use with a Docker registry
  2. generic - Create a secret from a local file, directory or literal value
  3. tls - Create a TLS secret for https
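A generic secret can also be written declaratively and then consumed by a Pod. The sketch below assumes a secret named my-secret holding a PASSWORD key; the pod name and image are hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-secret            # names must be lowercase
type: Opaque                 # the type behind "generic" secrets
stringData:
  PASSWORD: foobar123        # stored base64-encoded by the API server
---
# A Pod that reads the secret value into an environment variable.
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo          # hypothetical name
spec:
  containers:
    - name: app
      image: nginx           # assumed image
      env:
        - name: PASSWORD
          valueFrom:
            secretKeyRef:
              name: my-secret
              key: PASSWORD
```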

Kubernetes RBAC

We need a role-based access system to limit permissions by defining who can access which objects in the cluster. This is done with RBAC.

User Accounts

A user account represents a person. To administer a cluster, a user can be given a set of permissions.

Service Accounts

A service account gives an identity to pods, so that permissions can be granted to them and they can talk to other objects in the Kubernetes cluster.

RoleBinding

Grants a set of permissions to an account within a single namespace.

ClusterRoleBinding

Grants a set of permissions to an account across the entire cluster, including cluster-scoped resources such as nodes.

In the command below, we create a new service account named "foobar" in the default namespace.

kubectl create serviceaccount --namespace default foobar

In the command below, we create a new ClusterRoleBinding named "foobar-cluster-role" that binds the ClusterRole "cluster-admin" to the service account created above, given as <namespace>:<serviceaccount-name>, i.e. "default:foobar".

kubectl create clusterrolebinding foobar-cluster-role --clusterrole=cluster-admin --serviceaccount=default:foobar
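The same two objects can be expressed as manifests. This is a sketch of the declarative equivalent of the two commands above, using the same names.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: foobar
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: foobar-cluster-role
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin        # built-in role with full access to the cluster
subjects:
  - kind: ServiceAccount
    name: foobar
    namespace: default
```

Binding cluster-admin grants very broad permissions, so in practice a narrower role is usually preferable.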

Ingress in Depth

Ingress is an object that allows access to your Kubernetes services from outside the Kubernetes cluster. There are other options that expose services to the external world, namely NodePort and LoadBalancer, but Ingress is usually the better choice of the three.

As the diagram below shows, when an Ingress config file with routing rules is created and fed into kubectl, an ingress controller behind the scenes sets up a routing mechanism that accepts the incoming traffic and routes it to the services. This is the overall picture of Ingress.

Kubernetes Ingress vs LoadBalancer vs NodePort

These three expose services in Kubernetes to the external world.

With NodePort, the drawback is that you need to allocate a port for each service and deal with port management yourself, so it is not a robust solution.

With LoadBalancer, the drawback is that each service of type LoadBalancer creates its own network load balancer with its own IP address. Every time you want to expose another service, a new load balancer has to be created and an IP address assigned.

These are the reasons why one should opt for Ingress instead.

Using Ingress involves running an Ingress controller and defining routing rules in an Ingress object.
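Routing rules of this kind might be sketched as below. The host, paths and backend service names are hypothetical, and the rules only take effect if an ingress controller (for example ingress-nginx) is running in the cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress            # hypothetical name
spec:
  rules:
    - host: example.local          # hypothetical host
      http:
        paths:
          - path: /web
            pathType: Prefix
            backend:
              service:
                name: web-service  # hypothetical ClusterIP service
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service  # hypothetical ClusterIP service
                port:
                  number: 8080
```

A single entry point then serves both services, instead of one node port or one load balancer per service.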

What is Helm?

Helm is a program for administering third-party software inside the Kubernetes cluster.

When we install Helm 2 we get two parts, "helm" and "tiller". Helm is the client and Tiller is the server. Helm 3 removed Tiller, and the helm client talks to the Kubernetes API server directly.

You can simply think of Helm as something that makes it easy to install applications and resources into Kubernetes clusters.

In Helm 2, Tiller applies those changes to the Kubernetes cluster. Generally, you can think of Helm as a package manager.

Taints and Tolerations

Taints and tolerations allow a node to control which pods should (or should not) be scheduled on it.

You can think of a taint as a "label applied to a Node" and a toleration as a "label applied to a Pod".

If the pod's toleration matches the node's taint, the node allows the pod to be scheduled on it. If it doesn't match, the node refuses to schedule the pod.

For example, suppose I have 3 nodes named A, B and C, and one pod. If I want node A to accept only this pod, I need to apply a taint to node A through the NodeSpec and a matching toleration to the pod through the PodSpec. The toleration only permits the pod to run on A; to keep it off B and C as well, combine it with node affinity.

Taints and tolerations consist of a key, value, and effect.

kubectl taint nodes <node-name> <key>=<value>:<effect>

If you want to find the taints on a node, use the command below.

kubectl describe nodes <your-node-name> | grep Taint

The command below applies a taint to a node.

kubectl taint nodes <your-node-name> node-role.kubernetes.io/master:NoSchedule
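A Pod that tolerates this taint could be sketched as follows. The pod name and image are hypothetical. The toleration uses the Exists operator because the taint above has a key and an effect but no value.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx             # assumed image
  tolerations:
    - key: node-role.kubernetes.io/master
      operator: Exists         # match the key regardless of value
      effect: NoSchedule       # must match the effect of the taint
```

The toleration allows the scheduler to place the pod on the tainted node, but it does not force the pod onto that node.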

Node affinity

To get pods scheduled onto specific nodes, Kubernetes provides nodeAffinity.

With node affinity we can tell Kubernetes which nodes a pod should be scheduled on, using the labels on each node.
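As a sketch, the Pod below may only be scheduled on a node whose kubernetes.io/hostname label equals node-a. The pod name, the image and the node name are hypothetical; any other node label could be used in the same way.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo                  # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: the pod stays Pending if no node matches.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - node-a             # hypothetical node name
  containers:
    - name: app
      image: nginx                     # assumed image
```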

Docker Swarm stack file vs Kubernetes Configuration yaml file

Below is the Docker Swarm stack file created for the jhipster-elasticsearch service. It uses an external NFS volume, a constraint to deploy on worker nodes only, and the user property set to "81226" so that the NFS server is accessed with this UID.

Below is an example of a Kubernetes YAML file that creates a Deployment object and a Service object.

Config Breakdown for Kube yaml file

In line 15, tolerations are added just to deploy on worker nodes only (just as an example). By default, the master node is not allowed to schedule any pod on it, because a taint is predefined on the master node when Kubernetes is set up.

In line 19, securityContext is added to access the NFS server as this particular UID.

In lines 29 and 30, command and args are added to check that the pod can access the NFS server, by writing the date to a particular folder every 5s.