Kubernetes Architecture
A Kubernetes cluster consists of two main components:
- Master (Control Plane)
- Worker Nodes
The Control Plane has the following components, which are responsible for maintaining the state of the cluster:
- etcd (distributed key-value store)
- API Server
- Controller Manager
- Scheduler
Every worker node consists of the following components. These components are responsible for deploying and running the application containers:
- Kubelet
- Container Runtime (Docker)
A few more components are required for the cluster, such as kube-dns, kube-proxy, Ingress and Dashboard; we will discuss them in another story.
Let’s look at the Master components in more detail.
etcd:
Kubernetes uses etcd to store the cluster state and metadata, which covers every object that gets created (pods, deployments, replication controllers, ingress, etc.). etcd is a distributed key-value store that provides a reliable way of storing data across a cluster of machines. As shown in the diagram, the API Server is the only component that talks directly to the etcd store. Every other control plane component goes through the API Server. K8s stores all its data under the /registry directory (key prefix) in etcd.
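To make the key layout more concrete, here is a minimal Python sketch that models etcd as an in-memory dictionary. The names `registry_key` and `FakeEtcd` are hypothetical, and the `/registry/<resource>/<namespace>/<name>` layout is an assumption for illustration; the exact keys depend on the resource type and the Kubernetes version.

```python
from typing import Dict, List


def registry_key(resource: str, namespace: str, name: str) -> str:
    """Build a key under the /registry prefix (layout assumed for illustration)."""
    return f"/registry/{resource}/{namespace}/{name}"


class FakeEtcd:
    """A toy in-memory stand-in for etcd: a flat map from key to value."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get_prefix(self, prefix: str) -> List[str]:
        # Listing a "directory" is modelled here as a scan over a key prefix.
        return sorted(key for key in self._data if key.startswith(prefix))


store = FakeEtcd()
store.put(registry_key("pods", "default", "web-1"), "{...pod spec...}")
store.put(registry_key("pods", "default", "web-2"), "{...pod spec...}")
store.put(registry_key("deployments", "default", "web"), "{...deployment spec...}")

# Only the pod keys are returned, not the deployment.
print(store.get_prefix("/registry/pods/"))
```

Keeping every object under one well-known prefix is what lets a single component (the API Server) own all reads and writes.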
API Server:
The K8s API Server is the central hub through which all other components communicate. It takes care of validating each object before saving it to etcd.
The client for the API Server can be either kubectl (the command line tool) or a REST API client.
As shown in the diagram, several plugins are invoked by the API Server before an object is created, deleted or updated in etcd.
When we send an object creation request to the API Server, it first needs to authenticate the client. This is performed by one or more authentication plugins. The authentication mechanism can be based on the client’s certificate or on Basic authentication using the HTTP “Authorization” header.
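As an illustration of the header-based option, the sketch below shows how a server could decode a Basic “Authorization” header into a user name and password. The function name `parse_basic_auth` and the sample credentials are hypothetical; this is not the API Server’s own code, only the standard Base64 `user:password` encoding that HTTP Basic authentication uses.

```python
import base64
from typing import Optional, Tuple


def parse_basic_auth(header_value: str) -> Optional[Tuple[str, str]]:
    """Decode the value of an 'Authorization: Basic ...' header.

    Returns (user, password), or None when the header is not valid Basic auth.
    """
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers both invalid Base64 and invalid UTF-8.
        return None
    user, separator, password = decoded.partition(":")
    if not separator:
        return None
    return user, password


# Build a header the way an HTTP client would, then decode it again.
header = "Basic " + base64.b64encode(b"alice:s3cret").decode("ascii")
print(parse_basic_auth(header))
```

Decoding only recovers the credentials; checking them against a user database would be a separate step inside the authentication plugin.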
Once any one of the plugins has authenticated the request, it is passed to the Authorization plugins, which validate whether the user is allowed to perform the requested action on the object. For example, developers are not supposed to create or modify cluster role bindings or security policies; those are supposed to be controlled at the cluster level by the ops team only. Once authorization passes, the request is sent to the Admission Control Plugins (ACP).
Admission Control Plugins are responsible for initializing any missing fields with default values. For example, if we didn’t specify a Service Account in the object being created, one of the plugins takes care of adding the default service account to the resource specification. Finally, the API Server validates the object and stores it in etcd.
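The whole request path described above (authentication, authorization, admission control, validation, then storage) can be sketched as a small Python pipeline. Everything here is a simplified assumption for illustration: the token table, the `Rejected` exception, the function names and the dictionary used in place of etcd are all hypothetical, and real clusters use pluggable, far richer implementations at every stage.

```python
import copy
from typing import Any, Callable, Dict, List

Obj = Dict[str, Any]


class Rejected(Exception):
    """Raised when a pipeline stage refuses the request."""


# Hypothetical token table, standing in for the authentication plugins.
USERS_BY_TOKEN = {"token-dev": "developer", "token-ops": "ops"}


def authenticate(request: Obj) -> str:
    user = USERS_BY_TOKEN.get(request.get("token", ""))
    if user is None:
        raise Rejected("authentication failed")
    return user


def authorize(user: str, obj: Obj) -> None:
    # Rule taken from the text: only the ops team manages cluster role bindings.
    if obj.get("kind") == "ClusterRoleBinding" and user != "ops":
        raise Rejected(f"{user} may not create {obj['kind']}")


def add_default_service_account(obj: Obj) -> Obj:
    # Admission step: fill in a missing service account for pods.
    if obj.get("kind") == "Pod":
        obj.setdefault("spec", {}).setdefault("serviceAccountName", "default")
    return obj


ADMISSION_PLUGINS: List[Callable[[Obj], Obj]] = [add_default_service_account]


def validate(obj: Obj) -> None:
    if not obj.get("metadata", {}).get("name"):
        raise Rejected("metadata.name is required")


def handle_create(request: Obj, store: Dict[str, Obj]) -> Obj:
    """Run one create request through every stage, then persist the object."""
    user = authenticate(request)
    obj = copy.deepcopy(request["object"])  # never mutate the caller's object
    authorize(user, obj)
    for plugin in ADMISSION_PLUGINS:
        obj = plugin(obj)
    validate(obj)
    store[f"{obj['kind']}/{obj['metadata']['name']}"] = obj
    return obj


store: Dict[str, Obj] = {}
pod = {"kind": "Pod", "metadata": {"name": "web"}}
created = handle_create({"token": "token-dev", "object": pod}, store)
print(created["spec"]["serviceAccountName"])
```

The ordering matters: nothing is written to the store unless every earlier stage has accepted the request.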
The API Server won’t initiate any requests for creating pods or services; that is the responsibility of the controllers. In fact, every control plane component is responsible for registering for the changes it is interested in. A control plane component can request to be notified when a resource is created, modified or deleted. Clients watch for changes by opening an HTTP connection to the API Server. Every time an object is updated, the API Server uses this connection to send the new version of the object to the client.
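A minimal sketch of this watch idea, using in-process callbacks instead of a long-lived HTTP connection, could look like the following. The `WatchableStore` class and its method names are hypothetical; the ADDED, MODIFIED and DELETED labels mirror the event types that Kubernetes watches report.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Tuple[str, str, Dict[str, Any]]  # (event type, key, object)


class WatchableStore:
    """Toy object store that notifies registered watchers about every change."""

    def __init__(self) -> None:
        self._objects: Dict[str, Dict[str, Any]] = {}
        self._watchers: Dict[str, List[Callable[[Event], None]]] = {}

    def watch(self, kind: str, callback: Callable[[Event], None]) -> None:
        # A component registers interest in one kind of resource.
        self._watchers.setdefault(kind, []).append(callback)

    def _notify(self, kind: str, event: Event) -> None:
        for callback in self._watchers.get(kind, []):
            callback(event)

    def apply(self, kind: str, name: str, obj: Dict[str, Any]) -> None:
        key = f"{kind}/{name}"
        event_type = "MODIFIED" if key in self._objects else "ADDED"
        self._objects[key] = obj
        self._notify(kind, (event_type, key, obj))

    def delete(self, kind: str, name: str) -> None:
        key = f"{kind}/{name}"
        obj = self._objects.pop(key, None)
        if obj is not None:
            self._notify(kind, ("DELETED", key, obj))


events: List[Event] = []
api = WatchableStore()
api.watch("Pod", events.append)          # e.g. the scheduler watching pods
api.apply("Pod", "web", {"image": "nginx"})
api.apply("Pod", "web", {"image": "nginx:1.25"})
api.apply("Service", "web", {})          # nobody watches services here
api.delete("Pod", "web")

print([event_type for event_type, _, _ in events])
```

Because components are pushed the new version of an object, they do not need to poll the API Server to discover work.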
Scheduler:
The scheduler’s main job is to decide which node each pod should be created on. It registers with the API Server to be notified of any newly created object/resource.
The scheduler uses an algorithm to figure out which node a pod should be created on. It checks whether a worker node has the desired capacity, and whether the resource specification targets specific nodes through labels or affinity rules, or requires specific volumes such as SSD. Finally, after figuring out the node, the scheduler updates the resource specification and sends it to the API Server. The API Server stores the updated resource specification in etcd and notifies the kubelet on the worker node selected by the scheduler (using the watch mechanism).
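The filtering described above (capacity first, then label constraints) can be sketched in a few lines of Python. This is not the real scheduler algorithm: the `Node` and `Pod` classes, their field names and the "most free CPU wins" scoring rule are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millis: int
    free_memory_mib: int
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class Pod:
    name: str
    cpu_millis: int
    memory_mib: int
    node_selector: Dict[str, str] = field(default_factory=dict)
    node_name: Optional[str] = None  # filled in once the pod is scheduled


def fits(pod: Pod, node: Node) -> bool:
    """A node is feasible if it has capacity and carries every requested label."""
    has_capacity = (
        node.free_cpu_millis >= pod.cpu_millis
        and node.free_memory_mib >= pod.memory_mib
    )
    labels_match = all(
        node.labels.get(key) == value for key, value in pod.node_selector.items()
    )
    return has_capacity and labels_match


def schedule(pod: Pod, nodes: List[Node]) -> Optional[str]:
    """Pick a node for the pod, or return None if no node is feasible."""
    candidates = [node for node in nodes if fits(pod, node)]
    if not candidates:
        return None  # the pod stays unscheduled
    # Assumed scoring rule: prefer the node with the most free CPU, then memory.
    best = max(candidates, key=lambda n: (n.free_cpu_millis, n.free_memory_mib))
    pod.node_name = best.name
    best.free_cpu_millis -= pod.cpu_millis
    best.free_memory_mib -= pod.memory_mib
    return best.name


nodes = [
    Node("node-a", 500, 1024),
    Node("node-b", 2000, 4096, {"disktype": "ssd"}),
    Node("node-c", 4000, 8192),
]
database = Pod("db", 1000, 2048, node_selector={"disktype": "ssd"})
print(schedule(database, nodes))
```

In this sketch the pod that asks for `disktype: ssd` can only land on the one node carrying that label, even though another node has more free capacity.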
Controller Manager:
The Controller Manager is responsible for making sure the actual state of the system converges towards the desired state, as specified in the resource specification. There are several different controllers available under the controller manager, such as the Deployment, StatefulSet, Namespace and PersistentVolume controllers.
All controllers watch the API Server for changes to resources/objects and perform the necessary actions, such as creating, updating or deleting the resource.
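The "converge actual state towards desired state" idea is often easier to see in code. Below is a minimal sketch of a single reconciliation step for a replica count; the `reconcile` function, the pod naming scheme and the choice of which pods to delete are hypothetical simplifications, not how any specific Kubernetes controller is implemented.

```python
from typing import Dict, List


def reconcile(desired_replicas: int, running_pods: List[str], prefix: str) -> Dict[str, List[str]]:
    """Return the create/delete actions that move the actual state to the desired one."""
    actions: Dict[str, List[str]] = {"create": [], "delete": []}
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create new ones, skipping names that are already taken.
        existing = set(running_pods)
        index = 0
        while len(actions["create"]) < diff:
            candidate = f"{prefix}-{index}"
            if candidate not in existing:
                actions["create"].append(candidate)
            index += 1
    elif diff < 0:
        # Too many pods: delete the surplus (here, the last names in sorted order).
        actions["delete"] = sorted(running_pods)[diff:]
    return actions


print(reconcile(3, ["web-0"], "web"))
print(reconcile(1, ["web-0", "web-1", "web-2"], "web"))
```

A controller would run a step like this every time a watch event arrives, so the same function handles scale-up, scale-down and the replacement of pods that disappeared.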
Worker Node components:
Kubelet:
Kubelet registers the node it is running on with the API Server. It monitors the API Server for Pods that are scheduled to the node, and then starts the pod’s containers by instructing the Docker runtime.
Kubelet also monitors the status of running containers and reports their status, events and resource consumption to the API Server. It performs health checks on the containers as well and restarts them if needed.
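One pass of such a loop can be sketched as follows. This is a toy model under stated assumptions: `sync_node`, the dictionary shape of a pod and the `is_healthy` callback are hypothetical, and the real Kubelet talks to a container runtime rather than to a Python set.

```python
from typing import Callable, Dict, List, Set


def sync_node(
    node_name: str,
    pods: List[Dict[str, str]],
    running: Set[str],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Run one sync pass for a node and return the actions that were taken.

    pods:       every pod known to the API Server, each with 'name' and 'nodeName'
    running:    names of pods already running on this node (updated in place)
    is_healthy: health check for a running pod
    """
    actions: List[str] = []
    for pod in pods:
        if pod.get("nodeName") != node_name:
            continue  # scheduled to another node, or not scheduled yet
        name = pod["name"]
        if name not in running:
            running.add(name)
            actions.append(f"start {name}")
        elif not is_healthy(name):
            actions.append(f"restart {name}")
    return actions


pods = [
    {"name": "web", "nodeName": "node-a"},
    {"name": "db", "nodeName": "node-b"},
    {"name": "cache", "nodeName": "node-a"},
]
running_on_a = {"cache"}
# Pretend the health check fails for "cache" only.
print(sync_node("node-a", pods, running_on_a, lambda name: name != "cache"))
```

Pods assigned to other nodes are ignored entirely, which is why each node only needs to watch for the pods scheduled to it.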
Docker:
Docker was the container runtime used by Kubelet for spinning up containers. Docker is a platform for packaging, distributing and running applications. Docker-based container images contain the application code, the required file system and the application metadata. A Docker registry is a repository that stores Docker images and allows us to share them over the internet. A registry can be public or private (accessible only within the organization).