Kubernetes Overview, Monitoring & Security
Kubernetes is an open-source container orchestration engine and an abstraction layer for managing full-stack operations of hosts and containers. It handles deployment, scaling, load balancing, and rolling updates of containerized applications across multiple hosts within a cluster, and it makes sure that your applications stay in the desired state.
Kubernetes 1.7 was released on 29th June 2017 with new features aimed at the most demanding enterprise environments, covering security, stateful applications, and extensibility. With Kubernetes 1.7 we can now store secrets in namespaces in a much better way, as discussed below.
Table of Contents -
- Introduction to Twelve-Factor App
- Kubernetes Architecture
- Kubernetes Components
- Kubernetes Security
- Monitoring of Kubernetes
- Open Source Tools For Kubernetes
- Applications of Kubernetes
- What's New in Kubernetes 1.7
Introduction To The Twelve-Factor App For Microservices
In the modern era, software is commonly delivered as a service, in the form of web apps or software-as-a-service. The twelve-factor app is a methodology for building software-as-a-service apps that -
- Minimize time and cost for new developers joining the project.
- Offer maximum portability between execution environments.
- Are suitable for deployment on modern cloud platforms.
- Obviate the need for servers and systems administration.
- Minimize divergence between development and production.
- Enable continuous deployment for maximum agility.
- Can scale up without significant changes to tooling and architecture.
The twelve-factor methodology can be applied to apps written in any programming language, and which use any combination of backing services (database, queue, memory cache, etc).
The Twelve Factors
1. Codebase
A twelve-factor app should have only one codebase per app, but there will be many deploys of the app. A deploy is a running instance of the app.
A twelve-factor app is always tracked in a version control system. A copy of the revision tracking database is known as a code repository.
2. Dependencies
A twelve-factor app never relies on the implicit existence of system-wide packages. It declares all dependencies, completely and exactly.
3. Config
A twelve-factor app should store config in the environment. An app’s config is everything that is likely to vary between deploys (staging, production, developer environments, etc.).
4. Backing Services
A backing service is any service the app consumes over the network as part of its normal operation. The code for a twelve-factor app makes no distinction between local and third party services.
A deploy of the twelve-factor app should be able to swap out a local MySQL database with one managed by a third party without any changes to the app’s code.
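To make the config and backing-service factors concrete, here is a minimal Python sketch. It assumes the database location is passed in an environment variable; the variable names (DATABASE_URL, LOG_LEVEL), the defaults, and the URLs are hypothetical. The point is only that swapping a local database for a managed one changes the environment, not the code.

```python
import os


def load_config(environ=os.environ):
    """Read deploy-specific settings from environment variables."""
    return {
        # DATABASE_URL and LOG_LEVEL are hypothetical names chosen for this sketch.
        "database_url": environ.get("DATABASE_URL", "mysql://localhost/app"),
        "log_level": environ.get("LOG_LEVEL", "INFO"),
    }


# Local development: nothing is set, so the defaults point at a local database.
local = load_config({})

# Production: the same code, with the environment pointing at a managed database.
production = load_config({"DATABASE_URL": "mysql://db.example.com/app"})

print(local["database_url"])
print(production["database_url"])
```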
5. Build, Release, Run
The twelve-factor app uses strict separation between the build, release, and run stages. The build stage transforms a code repo into an executable bundle known as a build. The release stage combines the build with the deploy's current config, and the run stage runs the app in the execution environment.
6. Processes
The app is executed in the execution environment as one or more processes. Twelve-factor processes are stateless and share nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.
7. Port Binding
The twelve-factor app is completely self-contained and does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port.
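Port binding can be sketched with nothing but the Python standard library. The example below is an illustration, not a production server: the handler, the PORT environment variable, and the default port 8000 are assumptions made for this sketch.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a short plain-text body."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=None, host="0.0.0.0"):
    """Create an HTTP server bound to the given port (or to $PORT)."""
    if port is None:
        # PORT is a hypothetical variable name; the environment decides the port.
        port = int(os.environ.get("PORT", "8000"))
    return HTTPServer((host, port), HelloHandler)
```

Calling make_server().serve_forever() would then serve requests on the bound port, with no external web server injected at runtime.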
8. Concurrency
Processes in the twelve-factor app scale out via the process model. Using this model, the developer can architect their app to handle diverse workloads by assigning each type of work to a process type.
9. Disposability
The twelve-factor app’s processes are disposable, meaning they can be started or stopped at a moment’s notice. This facilitates fast elastic scaling, rapid deployment of code or config changes, and robustness of production deploys.
10. Development / Production Parity
We have to keep development, staging, and production as similar as possible. The time gap, personnel gap, and tools gap between development and production should be kept small, which helps continuous deployment.
11. Logs
Treat logs as event streams. A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout.
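As a small illustration of treating logs as an event stream, the Python sketch below sends log records to stdout and leaves routing and storage to the execution environment. The logger name and message format are arbitrary choices for this example.

```python
import logging
import sys


def make_logger(name="app"):
    """Return a logger that writes events to stdout instead of a logfile."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Do not pass records on to the root logger, to avoid duplicate lines.
    logger.propagate = False
    return logger


log = make_logger()
log.info("order created id=%s", 42)
```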
12. Admin Processes
Run admin or management tasks as one-off processes. Examples of admin tasks are running one-time scripts, running database migrations, and running a console to execute arbitrary code.
Kubernetes Architecture
A Kubernetes cluster operates in a master and worker architecture, in which the Kubernetes master receives all management tasks and dispatches them to the appropriate Kubernetes worker nodes based on the given constraints.
The two sections below describe the components of the Kubernetes architecture and where exactly each is used.
Kubernetes Master Node Architecture
Kube API Server
The Kubernetes API server is the central point of contact for the Kubernetes cluster. It handles authentication, authorization, and every other operation on the cluster. The API server stores all of its information in etcd, which is a distributed data store.
etcd is a database that stores data as key-value pairs. It supports a distributed architecture and high availability with a strong consistency model. etcd is developed by CoreOS and written in Go. Kubernetes stores its cluster state in etcd, such as configuration and other metadata about the pods, services, and deployments of the cluster.
Kube Controller Manager
The Kube Controller Manager is the component of a Kubernetes cluster that manages replication and scaling of pods. It continuously tries to move the Kubernetes system toward the desired state by using the Kubernetes API server.
Other controllers in the Kubernetes system include:
- Service accounts controller
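The idea of driving the system toward a desired state can be sketched as a single reconciliation step. This is a simplified Python illustration, not the controller manager's actual code; the function name and the shape of the returned actions are hypothetical.

```python
def reconcile(desired_replicas, running_pods):
    """Compare desired and observed state and return the actions to take.

    running_pods is a list of pod names currently observed as running.
    """
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: ask for the missing ones to be created.
        return [("create", None)] * diff
    if diff < 0:
        # Too many pods: pick the surplus ones for deletion.
        surplus = sorted(running_pods)[:-diff]
        return [("delete", name) for name in surplus]
    # Observed state already matches the desired state.
    return []


print(reconcile(3, ["web-a"]))
print(reconcile(1, ["web-a", "web-b"]))
```

A real controller runs a step like this in a loop, reading the observed state from the API server each time.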
The Kube Scheduler is another main component of the Kubernetes architecture. It watches for pods that do not yet have a node assigned and, for each one, checks the availability, performance, and capacity of the worker nodes to pick a node where the pod fits, so that the cluster remains stable in terms of performance, capacity, and availability for new pods.
It reports each placement decision back to the API server, which stores it along with the rest of the cluster state.
It also schedules pods to specific nodes when the submitted manifest for the pod requests them.
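The filter-then-choose logic of scheduling can be sketched in a few lines of Python. This is a toy model under stated assumptions: the node and pod dictionaries, their field names, and the "most free CPU wins" scoring rule are hypothetical and much simpler than what the real scheduler does.

```python
def schedule(pod, nodes):
    """Return the name of a node that fits the pod, or None if none fits."""
    selector = pod.get("node_selector", {})
    # Filter: keep nodes with enough free resources and matching labels.
    feasible = [
        node for node in nodes
        if node["free_cpu"] >= pod["cpu"]
        and node["free_mem"] >= pod["mem"]
        and all(node["labels"].get(k) == v for k, v in selector.items())
    ]
    if not feasible:
        return None
    # Score: prefer the node with the most resources left after placement.
    best = max(
        feasible,
        key=lambda n: (n["free_cpu"] - pod["cpu"], n["free_mem"] - pod["mem"]),
    )
    return best["name"]


nodes = [
    {"name": "node-a", "free_cpu": 2.0, "free_mem": 4096, "labels": {"disk": "ssd"}},
    {"name": "node-b", "free_cpu": 4.0, "free_mem": 8192, "labels": {"disk": "hdd"}},
]
print(schedule({"cpu": 1.0, "mem": 1024}, nodes))
print(schedule({"cpu": 1.0, "mem": 1024, "node_selector": {"disk": "ssd"}}, nodes))
```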
The Kubernetes kubelet is a worker node component of the Kubernetes architecture responsible for node-level pod management.
The kubelet receives pod definitions from the API server (as described in manifest files), runs them on its worker node, and makes sure the containers are running and healthy. The kubelet talks directly to container runtimes like Docker or rkt.
Kube Proxy is the networking component of the Kubernetes architecture. It runs on every node of the Kubernetes cluster.
It implements the virtual IP address of each service. DNS entries, hostnames, and pod IP addresses are not assigned by Kube Proxy; they come from the cluster DNS add-on and the pod network setup.
It forwards traffic from a Cluster/Service IP address to the set of pods backing that service.
It alters iptables rules on all nodes so that pods can talk to each other or to the outside world.
Docker is an open-source container runtime, developed by Docker, Inc., used to build, run, and share containerized applications. Docker focuses on running a single application per container, with the container as the atomic building block.
Rocket (rkt) is another container runtime for containerized applications. It is developed by CoreOS, with more focus on security and on following open standards.
Pluggable execution environment
Supervisord is a lightweight process management system that keeps the kubelet and the container engine running.
Fluentd is an open-source data collector, used here for Kubernetes cluster logs.
Kubernetes nodes are the worker nodes in the Kubernetes cluster. A worker node can be a virtual machine or a bare-metal server.
A node has all the services required to run any kind of pod, and is managed by the master node of the cluster.
The following are a few of the services of a node.
A container is a standalone, executable package of a piece of software that includes everything it needs to run: code, runtime, libraries, and configuration.
1. Supports both Linux- and Windows-based apps.
2. Independent of the underlying infrastructure.
Docker and CoreOS are the main leaders in the container race.
Pods are the smallest unit of the Kubernetes architecture. A single pod can contain more than one container. A pod is modelled as a group of Docker containers with shared namespaces and shared volumes.
A Deployment is declared in a JSON or YAML file that contains the Pod and ReplicaSet definitions. We just need to describe the desired state in a Deployment object, and the Deployment controller will change the actual state to the desired state at a controlled rate for you.
Create new resources
Update existing resources
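As an illustration of describing a desired state, the following Python sketch builds a Deployment manifest as a dictionary and prints it as JSON. The name, image, and replica count are hypothetical. The apiVersion shown (apps/v1) is an assumption that matches current Kubernetes releases; clusters as old as 1.7 used a beta API version instead.

```python
import json


def deployment_manifest(name, image, replicas):
    """Build a minimal Deployment manifest for a single-container pod."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",  # assumption: depends on the cluster version
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


print(json.dumps(deployment_manifest("web", "nginx:1.25", 3), indent=2))
```

Saved to a file, JSON like this could be submitted with kubectl apply -f, the same way a YAML manifest would be.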
A Kubernetes Service is also defined in YAML or JSON format. It defines a logical set of pods and a policy for accessing them, including which ports and which type of IP address are assigned. The Service identifies its set of target pods by using a label selector.
Example: - service.yml
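A Service definition of this kind can be sketched as follows. The Python code builds a hypothetical manifest as a dictionary and prints it as JSON; the service name, label values, and port numbers are made up for illustration.

```python
import json


def service_manifest(name, selector, port, target_port):
    """Build a minimal Service manifest that targets pods by label."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # Pods carrying all of these labels receive the traffic.
            "selector": selector,
            "ports": [
                {"protocol": "TCP", "port": port, "targetPort": target_port}
            ],
        },
    }


print(json.dumps(service_manifest("web", {"app": "web"}, 80, 8080), indent=2))
```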
A Replication Controller is a controller that ensures that a specified number of pod “replicas” are in the running state.
Pods should be running.
Pods should be at the desired replica count.
It manages pods on all worker nodes of the Kubernetes cluster.
Example :- rc.yml
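A Replication Controller definition can be sketched in the same way. The Python code below builds a hypothetical manifest as a dictionary and prints it as JSON; the name, image, and replica count are illustrative only.

```python
import json


def replication_controller_manifest(name, image, replicas):
    """Build a minimal ReplicationController manifest."""
    labels = {"app": name}
    return {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": name},
        "spec": {
            # Desired number of pod replicas to keep running.
            "replicas": replicas,
            # Pods with these labels are counted as replicas.
            "selector": labels,
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


print(json.dumps(replication_controller_manifest("web", "nginx:1.25", 2), indent=2))
```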
Labels are key/value pairs that can be added to any Kubernetes object, such as pods, services, and deployments. Labels are very simple to use in a Kubernetes configuration file.
In a configuration file, labels are declared under the object's metadata.
Labels provide meaningful and relevant information to operations as well as development teams. They are very helpful when we want to roll out an update to, or restore, an application in a specific environment only. Labels can work as filter values for Kubernetes objects, can be attached to objects at any time, and can also be modified at any time.
Non-identifying information should be recorded using annotations.
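How labels act as filter values can be shown with a short Python sketch of equality-based selection. The objects and label values below are hypothetical, and real label selectors also support set-based expressions that this sketch leaves out.

```python
def matches(labels, selector):
    """True if every key/value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the names of the objects whose labels match the selector."""
    return [obj["name"] for obj in objects if matches(obj["labels"], selector)]


pods = [
    {"name": "web-1", "labels": {"app": "web", "environment": "production"}},
    {"name": "web-2", "labels": {"app": "web", "environment": "staging"}},
    {"name": "db-1", "labels": {"app": "db", "environment": "production"}},
]

print(select(pods, {"environment": "production"}))
print(select(pods, {"app": "web", "environment": "staging"}))
```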
A container registry is private or public online storage that stores container images and lets us distribute them. There are many container registries on the market.
1. Docker Hub
2. AWS ECR
3. Google Container Registry
4. Azure Container Registry
Interaction with Kubernetes
Kubernetes is a collection of APIs that interact with compute, network, and storage.
There are many ways to interact with a Kubernetes cluster.
The Kubernetes API is available directly for all tasks on the cluster, from deployment to maintenance of anything inside it.
The Kubernetes Dashboard is simple and intuitive for daily tasks. We can also manage our cluster from the dashboard.
The Kubernetes CLI is also known as kubectl. It is written in Go. It is the most used tool for interacting with either a local or a remote Kubernetes cluster.
Application on Kubernetes
The deployment guides mentioned below can be used to deploy applications written in most of the popular languages on Kubernetes.
Ruby on Rails
Monitoring of Kubernetes
Kubernetes gives us an easier-to-manage infrastructure by creating many levels of abstraction, such as nodes, pods, replication controllers, and services. Because of this, we no longer need to worry about where applications are running or whether they have the resources to work properly. But in order to ensure good performance, we still need to monitor our deployed applications and containers.
There are many tools, like cAdvisor and Grafana, available to monitor the Kubernetes environment with visualisation. Grafana in particular has become popular in the industry for monitoring Kubernetes environments.
cAdvisor is an open-source tool to monitor container resource usage and performance. It discovers all the containers deployed on the Kubernetes nodes and collects information such as CPU, memory, network, and file system usage. cAdvisor also provides a web dashboard for visual monitoring.
Grafana is an open-source metrics analytics and visualisation suite, commonly used for visualising time series data for application analytics. To use Grafana here, we need a time series database like InfluxDB and a cluster-wide aggregator of monitoring and event data like Heapster.
There are four steps to collect information from Kubernetes and visualise it on a Grafana dashboard.
Step 1: Heapster collects cluster-wide data from the Kubernetes environment.
Step 2: After collecting the data, Heapster writes it to InfluxDB.
Step 3: Grafana queries InfluxDB for the metrics it needs.
Step 4: After getting the required data, Grafana visualises it in graphs.
You can create a custom dashboard in Grafana as per your requirements.
What's New in Kubernetes 1.7
Security Enhancements in Kubernetes 1.7 -
- Encryption of secrets in etcd - Kubernetes now allows sensitive data stored in the etcd key-value store to be encrypted at the datastore level.
- Restriction on the kubelet - Version 1.7 adds two new plugins, the node authorizer and an admission control plugin, which restrict the kubelet's access to pods, secrets, and other objects.
- Data for system audits - Audit logs are now more customizable, with event filtering and webhooks.
- Network control - A network plugin lets the user set and enforce rules that govern how pods communicate.
Role-Based Access Control (RBAC)
Role-Based Access Control (“RBAC”) uses the “rbac.authorization.k8s.io” API group to drive authorisation decisions, allowing admins to dynamically configure policies through the Kubernetes API.
Choose “RBAC” as the authorisation mode when deploying your cluster to set up RBAC for it. RBAC policies can be viewed and configured using kubectl or the Kubernetes dashboard, and users can be authorized to make authorization policy changes through RBAC itself, making it possible to delegate resource management without giving away ssh access to the cluster master.
The RBAC system roles have been expanded to cover the necessary permissions for running a Kubernetes cluster with RBAC only. RBAC is a way of granting permissions to access Kubernetes API resources.
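To show how a role grants permissions on API resources, the Python sketch below defines a hypothetical Role as a dictionary and checks requests against its rules. The role name and namespace are made up, the apiVersion suffix depends on the cluster version, and is_allowed is a simplified illustration that ignores details of the real authorizer such as resource names and role bindings.

```python
def _covers(allowed, value):
    """True if the list names the value explicitly or via the '*' wildcard."""
    return "*" in allowed or value in allowed


def is_allowed(rules, verb, resource, api_group=""):
    """Check whether any rule permits the verb on the resource."""
    return any(
        _covers(rule.get("apiGroups", []), api_group)
        and _covers(rule.get("resources", []), resource)
        and _covers(rule.get("verbs", []), verb)
        for rule in rules
    )


pod_reader_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",  # assumption: version varies
    "kind": "Role",
    "metadata": {"namespace": "default", "name": "pod-reader"},
    "rules": [
        # "" is the core API group, which is where pods live.
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "watch", "list"]}
    ],
}

print(is_allowed(pod_reader_role["rules"], "get", "pods"))
print(is_allowed(pod_reader_role["rules"], "delete", "pods"))
```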
Stateful Application Handling Enhancements in Kubernetes 1.7
To handle stateful applications, Kubernetes 1.7 adds StatefulSet updates, currently in beta. With this feature, we can automate updates for stateful applications like etcd, Kafka, etc.
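A StatefulSet rolling update replaces pods one at a time in reverse ordinal order, from the highest ordinal down. The Python sketch below only computes that order and is not how the controller is implemented. Its partition parameter mirrors the StatefulSet rolling-update partition setting, under which only pods with an ordinal greater than or equal to the partition are updated; the set name used is hypothetical.

```python
def rolling_update_order(name, replicas, partition=0):
    """List pod names in the order a rolling update would replace them."""
    # Pods are named <set name>-<ordinal>; the highest ordinal goes first.
    return [f"{name}-{ordinal}" for ordinal in range(replicas - 1, partition - 1, -1)]


print(rolling_update_order("kafka", 3))
print(rolling_update_order("kafka", 3, partition=1))
```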
In this release, we also get local storage management, which provides ephemeral and durable access to local storage. Container Runtime Interface is now in alpha.
containerd consumes fewer resources than Docker; it is a subset of Docker and does not bring any resource overhead.