Kubernetes - Highlights

Short notes from various hyperlinked articles, primarily Kubernetes 101: Pods, Nodes, Containers, and Clusters by Daniel Sanche

Kubernetes is an open-source container orchestration platform designed to automate the deployment, scaling, and management of containerized applications.
Image source: Microsoft documentation

Nodes
  • A node is the smallest unit of computing hardware in Kubernetes. 
  • A node can be either a physical machine in a datacenter or a virtual machine. It is a representation of a single machine in your cluster. Kubernetes works with clusters of nodes (real or virtual machines) that function together like one robust server. 
  • Nodes share compute, network, and storage resources.

Cluster
  • In Kubernetes, nodes pool together their resources to form a more powerful machine. 
  • When you deploy programs onto the cluster, it intelligently handles distributing work to the individual nodes for you.
  • Each cluster has at least one master (control plane) node connected to one or more worker nodes. 
  • The worker nodes are responsible for running groups of containerized applications and workloads, known as pods.
  • The master node manages which pods run on which worker nodes.

Persistent Volumes
  • If a program tries to save data to a file for later, but is then relocated onto a new node, the file will no longer be where the program expects it to be. For this reason, the traditional local storage associated with each node is treated as a temporary cache to hold programs, but any data saved locally cannot be expected to persist.
  • To store data permanently, Kubernetes uses Persistent Volumes. While the CPU and RAM resources of all nodes are effectively pooled and managed by the cluster, persistent file storage is not. Instead, local or cloud drives can be attached to the cluster as a Persistent Volume. This can be thought of as plugging an external hard drive in to the cluster. 
  • Persistent Volumes provide a file system that can be mounted to the cluster, without being associated with any particular node.
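As a sketch, a Persistent Volume and a claim against it might be declared like this (the names and the hostPath backing are illustrative; real clusters would typically use cloud or network-backed storage):

```yaml
# A PersistentVolume representing a piece of storage available to the cluster.
# hostPath is suitable only for single-node testing.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv          # illustrative name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data
---
# A PersistentVolumeClaim that a pod can reference to mount the storage,
# without being tied to any particular node.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc         # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```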

Containers
  • Programs running on Kubernetes are packaged as Linux or Windows containers. 
  • Containers are a widely accepted standard, so there are already many pre-built images that can be deployed on Kubernetes.
  • Containerization allows you to create self-contained execution environments. Any program and all its dependencies can be bundled up into a single file and then shared on the internet. Anyone can download the container and deploy it on their infrastructure with very little setup required. 
  • Creating a container can be done programmatically, allowing powerful CI and CD pipelines to be formed.
  • Multiple programs can be added into a single container, but you should limit yourself to one process per container if at all possible. 
  • It’s better to have many small containers than one large one. If each container has a tight focus, updates are easier to deploy and issues are easier to diagnose.

Pods
  • Unlike other systems you may have used in the past, Kubernetes doesn’t run containers directly; instead it wraps one or more containers into a higher-level structure called a pod.
  • Any containers in the same pod will share the same resources and local network. Containers can easily communicate with other containers in the same pod as though they were on the same machine while maintaining a degree of isolation from others.
  • Pods are used as the unit of replication in Kubernetes. 
  • If your application becomes too popular and a single pod instance can’t carry the load, Kubernetes can be configured to deploy new replicas of your pod to the cluster as necessary. Even when not under heavy load, it is standard to have multiple copies of a pod running at any time in a production system to allow load balancing and failure resistance.
  • Pods can hold multiple containers, but you should limit yourself when possible. 
  • Because pods are scaled up and down as a unit, all containers in a pod must scale together, regardless of their individual needs. This leads to wasted resources and an expensive bill. To resolve this, pods should remain as small as possible, typically holding only a main process and its tightly-coupled helper containers (these helper containers are typically referred to as “side-cars”).
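The main-process-plus-side-car pattern described above might look like this as a pod manifest (image names and the side-car's role are illustrative):

```yaml
# A pod holding one main container and one tightly-coupled side-car.
# Both containers share the pod's network and can reach each other on localhost.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
spec:
  containers:
    - name: web              # the main process
      image: nginx:1.25
      ports:
        - containerPort: 80
    - name: log-shipper      # illustrative side-car helper
      image: busybox:1.36
      command: ["sh", "-c", "tail -f /dev/null"]
```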

Deployments
  • Although pods are the basic unit of computation in Kubernetes, they are not typically directly launched on a cluster. Instead, pods are usually managed by one more layer of abstraction: the deployment.
  • A deployment’s primary purpose is to declare how many replicas of a pod should be running at a time. When a deployment is added to the cluster, it will automatically spin up the requested number of pods, and then monitor them. If a pod dies, the deployment will automatically re-create it.
  • Using a deployment, you don’t have to deal with pods manually. You can just declare the desired state of the system, and it will be managed for you automatically.
  • YAML file - The desired state for your Kubernetes cluster—the configuration of Pods—that you describe, which serves as the basis for a Kubernetes deployment.
  • Pods - The containers, shared resources, and environment your app or workflow needs to run.
  • ReplicaSet - A group of identically configured pods. A ReplicaSet ensures that the type and number of pods described in the deployment’s YAML file are running at all times; if a pod fails, a new one is created to replace it.
  • Kube-controller-manager - Changes the current state of the cluster to match the desired state described in the YAML, creating new Pods and ReplicaSets as well as updating or removing existing ones.
  • Kube-scheduler - Assigns pods to worker nodes, deciding where each pod should run based on available resources and constraints. (Distributing network traffic among pods is handled separately, by Services and kube-proxy.)
  • Rollout - The process of reconfiguring the cluster from its current state to the desired state—achieved in most cases without downtime.
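A minimal deployment manifest declaring the desired state — here, three replicas of a pod — might look like this (names and image are illustrative):

```yaml
# A Deployment declaring that three replicas of a pod should always be running.
# Kubernetes creates a ReplicaSet from this spec and replaces any pod that dies.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 3                # desired number of pod copies
  selector:
    matchLabels:
      app: web
  template:                  # the pod template the ReplicaSet stamps out
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
```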

Ingress
  • By default, Kubernetes provides isolation between pods and the outside world. If you want to communicate with a service running in a pod, you have to open up a channel for communication. This is referred to as ingress.
  • The most common ways to add ingress to your cluster are by adding either an Ingress controller, or a LoadBalancer. 
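An Ingress resource routing outside HTTP traffic to a Service might be sketched as follows (the hostname and service name are illustrative, and an Ingress controller such as ingress-nginx must be installed for the rule to take effect):

```yaml
# An Ingress opening a channel from outside the cluster to the "web" Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: example.local     # illustrative hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web     # assumes a Service named "web" exists
                port:
                  number: 80
```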

Container runtime
  • A container runtime is software that executes containers and manages container images on a node. 
  • Docker has historically been the most common container runtime used in production Kubernetes environments, but there are others, such as containerd and CRI-O.

The Control Plane manages the worker node(s) and pods in the cluster. The Control Plane and cluster are spread out across multiple nodes and pods, providing high availability and reliability.

Kubernetes works on the principle of desired state versus actual state. The desired state is defined in a YAML/JSON file; Kubernetes then uses controllers to compare that desired state against the actual state of the cluster and reconcile any difference.

The kubectl (pronounced “cube CTL”, “kube control”, “cube cuttle”) command line tool lets you control Kubernetes clusters.

Kubernetes was born in the Cloud, and is what is known as a ‘Cloud native’ technology.

Cloud Native Computing Foundation (CNCF), a Linux Foundation organization, manages Kubernetes and related open source projects. 

The CNCF Landscape provides a view of the many routes to deploying a cloud native application

The Open Container Initiative is an open governance structure for the express purpose of creating open industry standards around container formats and runtimes.

Helm is an open-source packaging tool that helps you install and manage the lifecycle of Kubernetes applications. Similar to Linux package managers such as APT and Yum, Helm is used to manage Kubernetes charts, which are packages of preconfigured Kubernetes resources.
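A Helm chart is itself described in YAML; a minimal Chart.yaml might look like this (all values are illustrative):

```yaml
# Minimal Chart.yaml identifying a Helm chart — the package of
# preconfigured Kubernetes resources mentioned above.
apiVersion: v2
name: my-app                  # illustrative chart name
description: A Helm chart packaging the my-app Kubernetes resources
type: application
version: 0.1.0                # version of the chart itself
appVersion: "1.0.0"           # version of the application being deployed
```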
