Kubernetes is a portable, open-source platform for managing containerized workloads and services. Kubernetes tools, support, and services are widely available. The name originates from the Greek word meaning pilot or helmsman.
Kubernetes has a scheduling system that makes sure pods are matched to Nodes so that the kubelet can run them. The scheduler watches for newly created Pods that have no Node assigned.
What is a Custom Kubernetes Scheduler?
At first, writing a scheduler may sound intimidating, but if you follow the steps properly, you will find that creating a scheduler that places pods according to a simple rule is quite easy.
First, a pod is created and its desired state is saved to etcd, with the node name left unfilled.
Then the scheduler notices that there is a new pod with no Node bound to it.
The scheduler finds the Node that best fits that pod.
The scheduler tells the apiserver to bind the pod to that Node, and the apiserver saves the new desired state in etcd.
The kubelets watch for bound pods through the apiserver and start the containers on their Node.
A custom scheduler therefore has to do three things:
Implement a loop that watches for unbound pods in the cluster by querying the apiserver.
Apply some custom logic that finds the most suitable Node for each pod.
Send a request to the bind endpoint on the apiserver to bind the pod to the chosen Node.
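The three steps above can be sketched in Python as follows. This is a minimal in-memory sketch, not a working scheduler: pods are plain dictionaries, the helper names (unbound_pods, pick_node, make_binding, schedule_once) are hypothetical, and the "fewest pods first" rule is just an example of custom logic. In a real cluster the pod list would come from the apiserver, and each Binding object would be POSTed to the pod's binding subresource.

```python
from collections import Counter


def unbound_pods(pods):
    """Step 1: keep only the pods that have no node name yet."""
    return [p for p in pods if not p.get("spec", {}).get("nodeName")]


def pick_node(pod, node_names, pods_per_node):
    """Step 2: custom logic. Hypothetical rule: the node with the fewest pods,
    ties broken alphabetically."""
    return min(node_names, key=lambda n: (pods_per_node[n], n))


def make_binding(pod, node_name):
    """Step 3: build the Binding object that would be sent to the apiserver."""
    meta = pod["metadata"]
    return {
        "apiVersion": "v1",
        "kind": "Binding",
        "metadata": {
            "name": meta["name"],
            "namespace": meta.get("namespace", "default"),
        },
        "target": {"apiVersion": "v1", "kind": "Node", "name": node_name},
    }


def schedule_once(pods, node_names):
    """Run one pass of the loop and return the bindings it would request."""
    pods_per_node = Counter(
        p["spec"]["nodeName"] for p in pods if p.get("spec", {}).get("nodeName")
    )
    bindings = []
    for pod in unbound_pods(pods):
        node = pick_node(pod, node_names, pods_per_node)
        pods_per_node[node] += 1  # count the pod we just placed
        bindings.append(make_binding(pod, node))
    return bindings
```

A real scheduler would call something like schedule_once repeatedly, or react to watch events, instead of running a single pass.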
Below are the in-depth details of the Kubernetes scheduler and controllers.
The Kubernetes scheduler ensures that Pods are matched to Nodes so that the kubelet can run them. It is responsible for finding the best Node for each pod to run on, and it is part of the control plane. For each pod in the scheduling queue, the scheduler determines which Nodes are valid placements according to the available resources and constraints. After that, the scheduler ranks every valid Node and binds the pod to a suitable one.
Controllers are control loops that watch the state of the cluster and make or request changes where needed. Every controller tries to move the current cluster state towards the desired state.
A controller works by sending messages to the API server. The Job controller, a built-in Kubernetes controller, manages state by interacting with the cluster API server. A Job is a Kubernetes resource that runs Pods to carry out a task. When the Job controller sees a new task, it makes sure that the kubelets on a set of Nodes are running the right Pods; it does not run any Pods or containers itself. Instead, the Job controller tells the API server to create or remove Pods. Controllers also update the objects that configure them: once the work is completed, the Job controller updates the Job object to mark it finished.
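The control-loop idea described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not how the real Job controller is implemented: the job and pod dictionaries, the "completions" field as used here, and the action names are all hypothetical. The point it shows is that the controller only compares the desired state with the current state and returns requests for the API server; it never runs a pod itself.

```python
def reconcile(job, pods):
    """One pass of a Job-style control loop.

    Returns a list of (action, job_name) tuples that a controller would
    send to the API server as requests.
    """
    mine = [p for p in pods if p["job"] == job["name"]]
    succeeded = sum(1 for p in mine if p["phase"] == "Succeeded")
    active = sum(1 for p in mine if p["phase"] in ("Pending", "Running"))

    # Desired state reached: ask for the Job object to be marked finished.
    if succeeded >= job["completions"]:
        return [("mark_finished", job["name"])]

    # Otherwise request enough new pods to cover the remaining work.
    missing = job["completions"] - succeeded - active
    return [("create_pod", job["name"])] * max(missing, 0)
```

Calling reconcile again after the cluster state changes gives the next set of requests, which is what makes it a loop rather than a one-off script.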
If the default scheduler does not suit our needs, we can implement our own scheduler. We can run multiple schedulers simultaneously alongside the default scheduler and instruct Kubernetes which scheduler to use for each pod.
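A pod selects its scheduler through the schedulerName field of its spec. The Python sketch below builds such a pod manifest as a dictionary and shows how a scheduler could decide whether a pod is meant for it. The scheduler name "my-scheduler" and the helper functions are hypothetical; pods that leave the field out are handled by the scheduler named "default-scheduler".

```python
DEFAULT_SCHEDULER = "default-scheduler"


def pod_manifest(name, image, scheduler_name=None):
    """Build a minimal Pod manifest, optionally naming a custom scheduler."""
    spec = {"containers": [{"name": name, "image": image}]}
    if scheduler_name:
        spec["schedulerName"] = scheduler_name
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": spec,
    }


def is_responsible(pod, my_name):
    """True if the pod asks for the scheduler called my_name."""
    return pod["spec"].get("schedulerName", DEFAULT_SCHEDULER) == my_name
```

A custom scheduler would typically skip every pod for which is_responsible returns False, so that it and the default scheduler do not compete for the same pods.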
Multi-Cluster Scheduler → The multi-cluster scheduler is a system of Kubernetes controllers that intelligently schedules workloads across clusters. Using a virtual-kubelet provider, multicluster-scheduler changes the elected pods into proxy pods on the virtual-kubelet and creates delegate pods in remote clusters. A feedback loop updates the annotations and status of the proxy pods to reflect the annotations and status of the delegate pods. The agent present in the target cluster is responsible for creating the delegate pods. The delegate pods have the same specifications as the original pods and are spread across the topology. If the agent is also deployed in the primary cluster, that cluster can hold delegate pods along with the proxy pods.
Do we need multiple Kubernetes clusters?
If we have only one Kubernetes cluster, we need only one copy of every resource needed to manage and run that cluster.
If we have a single cluster, we can reuse services for all work. We don't need multiple copies of services.
Kubernetes Scheduler Policy
A scheduling policy specifies the predicates and priorities that kube-scheduler runs to filter and score the nodes, respectively.
Predicates → These are used for filtering.
PodFitsHostPorts checks that a Node has free ports for the ports the Pod requests.
PodFitsHost checks whether a Pod specifies a particular Node by its hostname.
PodFitsResources checks that the Node has enough free resources to meet the Pod's requirements.
MatchNodeSelector checks whether the Pod's node selector matches the Node's labels.
NoDiskConflict evaluates whether a Pod can fit on a Node given the volumes it requests.
MaxCSIVolumeCount decides how many CSI volumes should be attached to a Node.
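To make the filtering step concrete, here is a hedged Python sketch of three of the predicates listed above. It is not the kube-scheduler implementation: the flat pod and node dictionaries ("requests", "allocatable", "requested", "usedPorts", and so on) are a simplified model invented for the illustration, and the function names simply mirror the predicate names. A node passes the filter only if every predicate returns True.

```python
def pod_fits_resources(pod, node):
    """The node must have room for every resource the pod requests."""
    return all(
        node["requested"].get(res, 0) + amount <= node["allocatable"].get(res, 0)
        for res, amount in pod.get("requests", {}).items()
    )


def match_node_selector(pod, node):
    """Every key/value in the pod's node selector must be a node label."""
    labels = node.get("labels", {})
    return all(labels.get(k) == v for k, v in pod.get("nodeSelector", {}).items())


def pod_fits_host_ports(pod, node):
    """None of the host ports the pod wants may already be in use."""
    return not set(pod.get("hostPorts", [])) & set(node.get("usedPorts", []))


PREDICATES = [pod_fits_resources, match_node_selector, pod_fits_host_ports]


def filter_nodes(pod, nodes):
    """Return the names of the nodes that pass all predicates."""
    return [n["name"] for n in nodes if all(pred(pod, n) for pred in PREDICATES)]
```

Keeping each predicate as a separate function makes it easy to switch individual checks on or off, which is roughly what a scheduling policy does.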
Priorities → These are used for scoring.
SelectorSpreadPriority spreads Pods across hosts, considering Pods that belong to the same Service, StatefulSet, or ReplicaSet.
InterPodAffinityPriority is used to implement inter-pod anti-affinity and affinity.
LeastRequestedPriority is used to favor the nodes with fewer requested resources.
MostRequestedPriority is used to favor nodes with the most requested resources.
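The difference between LeastRequestedPriority and MostRequestedPriority is easiest to see in code. The Python sketch below scores a node by the fraction of each resource that is still free (or already requested) once the pod is added, and averages over the resources. Treat it as an approximation: the 0 to 10 scale, the integer arithmetic, and the dictionary layout are assumptions made for the example, not the exact kube-scheduler code.

```python
MAX_SCORE = 10  # assumption: scores range from 0 to 10


def _average_score(pod, node, score_one):
    """Score each resource with score_one(requested, capacity) and average."""
    scores = []
    for resource, capacity in node["allocatable"].items():
        if capacity <= 0:
            continue  # skip resources the node does not offer
        requested = node["requested"].get(resource, 0) + pod["requests"].get(resource, 0)
        requested = min(requested, capacity)
        scores.append(score_one(requested, capacity))
    return sum(scores) // len(scores) if scores else 0


def least_requested(pod, node):
    """Higher score for nodes with more free resources (spreads pods out)."""
    return _average_score(pod, node, lambda req, cap: (cap - req) * MAX_SCORE // cap)


def most_requested(pod, node):
    """Higher score for nodes that are already busy (packs pods together)."""
    return _average_score(pod, node, lambda req, cap: req * MAX_SCORE // cap)


def rank(pod, nodes, priority):
    """Return node names ordered from best to worst for the given priority."""
    ordered = sorted(nodes, key=lambda n: priority(pod, n), reverse=True)
    return [n["name"] for n in ordered]
```

With the same pod and nodes, the two priorities produce opposite orderings: least_requested prefers the emptier node, while most_requested prefers the fuller one.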
Custom Kubernetes Scheduler for Mission-Critical Workloads
Kubernetes is known for its powerful ability to manage workloads, and it provides many extension mechanisms so that developers can customize it to their organization's needs.
There are four ways to create a custom Kubernetes scheduler.
The first way is to clone the upstream code, modify it, and then recompile it to run the scheduler. This is not good practice and is not recommended.
The second way is to run a separate custom scheduler alongside the default scheduler. The custom scheduler and the default scheduler each cover their own pods exclusively, but tricky issues can arise when pods are scheduled onto the same Node by multiple schedulers, such as cache synchronization and distributed locking. Creating and maintaining a high-quality custom Kubernetes scheduler is not an easy task, and it requires a comprehensive understanding of the default Kubernetes scheduler.
The third way is the scheduler extender, which is the most comfortable solution: it takes little effort and is compatible with the upstream scheduler. A scheduler extender is a set of configurable webhooks that the scheduler calls for the filtering (Predicates) and scoring (Priorities) steps.
The fourth way is using a third-party framework.
How to create a custom scheduler using scheduler extender?
First, we create a policy config file and pass it to the scheduler as a parameter. It can be a local file, depending on how the scheduler is deployed.
The file should be in JSON format for now.
The policy file defines an HTTP extender, which runs at localhost:8888.
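As an illustration, the Python snippet below generates a policy file of the kind described above. It is a sketch under assumptions: the field names (urlPrefix, filterVerb, prioritizeVerb, weight, enableHttps) follow the legacy scheduler Policy format as we understand it, and the verb values "filter" and "prioritize" are paths chosen for this example. Check the names against the documentation for the scheduler version you run, since newer releases may configure extenders differently.

```python
import json


def extender_policy(url_prefix="http://localhost:8888/"):
    """Build a scheduler policy that registers one HTTP extender."""
    return {
        "kind": "Policy",
        "apiVersion": "v1",
        "extenders": [
            {
                "urlPrefix": url_prefix,         # where the extender listens
                "filterVerb": "filter",          # called as <urlPrefix>filter
                "prioritizeVerb": "prioritize",  # called as <urlPrefix>prioritize
                "weight": 1,                     # multiplier for extender scores
                "enableHttps": False,
            }
        ],
    }


def policy_json():
    """Serialize the policy so it can be written to the config file."""
    return json.dumps(extender_policy(), indent=2)
```

The output of policy_json() is what would be saved as the policy file and handed to the scheduler.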
We can write the extender program in any language.
We have to create a handler that responds to the filter and prioritize requests.
We create one function to filter and one to prioritize. Inside each function, we iterate over the Nodes and apply the logic that decides whether a Node is approved.
The filter function filters the nodes and passes on only those that are approved.
The two functions above are the most critical functions to extend the behavior of the default scheduler.
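A minimal extender along these lines could look like the Python sketch below, using only the standard library. Several things here are assumptions rather than facts from the text: the approval rule (a hypothetical label example.com/critical=true), the paths /filter and /prioritize, and the exact JSON key casing of the request and response, which can differ between scheduler versions (the lookup below is case-insensitive for that reason). The serve function is never called automatically.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def _nodes(args):
    """Return the candidate nodes from the request, whatever the key casing."""
    for key, value in args.items():
        if key.lower() == "nodes" and value:
            return value.get("items") or []
    return []


def approve(node):
    """Hypothetical rule: only nodes labelled example.com/critical=true pass."""
    labels = node.get("metadata", {}).get("labels") or {}
    return labels.get("example.com/critical") == "true"


def filter_nodes(args):
    """Iterate over the nodes and pass on only the approved ones."""
    approved, failed = [], {}
    for node in _nodes(args):
        if approve(node):
            approved.append(node)
        else:
            failed[node["metadata"]["name"]] = "node is not labelled as critical"
    return {"Nodes": {"items": approved}, "FailedNodes": failed, "Error": ""}


def prioritize_nodes(args):
    """Give approved nodes the highest score and all others zero."""
    return [
        {"Host": n["metadata"]["name"], "Score": 10 if approve(n) else 0}
        for n in _nodes(args)
    ]


ROUTES = {"/filter": filter_nodes, "/prioritize": prioritize_nodes}


class ExtenderHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        handler = ROUTES.get(self.path)
        if handler is None:
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        args = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(handler(args)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8888):
    """Start the extender on localhost; blocks until interrupted."""
    HTTPServer(("localhost", port), ExtenderHandler).serve_forever()
```

Calling serve() starts the webhook on localhost:8888. Keeping filter_nodes and prioritize_nodes as plain functions means the placement logic can be tested without starting the HTTP server.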
Kubernetes is known for its powerful ability to manage all kinds of workloads. It provides many extension mechanisms that let developers customize it to their business needs. When people think about writing a Kubernetes scheduler, they usually assume that it is hard and that it will need some kind of API that other components can handle. In fact, the Kubernetes scheduler does only one job: find a Node for each pod in the cluster and let the apiserver know about it. The kubelet and the apiserver take care of everything else needed to start the actual containers. We can write the program in any language.