Software Defined Storage and Rook Architecture

What is Software Defined Storage

Software-defined storage (SDS) helps developers and enterprises decouple, or abstract, storage resources from the underlying hardware. Making storage resources programmable brings more flexibility, efficiency, and faster scalability. With this approach, storage becomes an integral part of a larger software-defined data center (SDDC) architecture, in which resources are automated and orchestrated easily instead of residing in silos. Software-defined storage is also a modern way of restructuring your storage base by separating storage software from storage hardware. By implementing the SDS model, you will see the following benefits:

Benefits of Software Defined Storage – Rook

  • Ability to scale storage both horizontally and vertically.
  • Faster and automatic provisioning of volumes for pods.
  • Automatic healing of cluster from failed or corrupted disks.
  • Easy and automated deployment.
  • Better utilization of resources.
  • Hyper-converged storage solution.

What is Rook?

Rook is software-defined storage designed solely for Kubernetes to provide storage to stateful containers. Rook deploys all Ceph service daemons in containers, including the monitor, manager, RGW, MDS, and OSDs. Ceph is a petabyte-scale storage system that provides data replication and recovery without any human intervention. It offers all three storage services (block storage, object storage, and file system), making it an all-in-one storage solution. Its object storage is compatible with S3 and Swift API calls, which makes it easier to use. Rook can consume HDDs and SSDs either separately or together while provisioning storage.

How Does Rook Work?

Rook runs on the Kubernetes worker nodes and uses the disks specified in the configuration. SSDs can be used for journaling the HDDs to get more and faster IOPS. Rook can create separate pools containing only SSDs for applications that require more IOPS and speed, such as databases, and use HDDs for web applications or backups. For each disk, a pod runs that manages that disk, and for each service, such as a monitor, a manager, a metadata server, or a RADOS gateway, a separate pod is created.

The RADOS gateway (RGW) provides object storage. The Ceph object storage is compatible with the S3 and Swift APIs. RGW can be integrated with LDAP for user management. Rook can be deployed either in a hyper-converged manner (running on the same nodes as other applications) or in a hyperscale manner, by giving complete nodes to Ceph service pods only. The hyper-converged solution provides better utilization of resources and reduces infrastructure cost. When running out of storage, add more disks and update the deployment file. For more compute and storage, add a new node to the cluster.

Rook creates storage classes based on the pools. It can create separate storage classes for SSD, HDD, and file system. Then use the storage class that fits the requirement when provisioning a PVC.
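To make the storage class selection concrete, here is a minimal Python sketch that picks a storage class by workload type and builds a PersistentVolumeClaim manifest. The storage class names (rook-ssd, rook-hdd), the workload labels, and the build_pvc helper are assumptions made for illustration; a real cluster uses whatever storage class names were defined for its pools.

```python
import json

# Hypothetical storage class names; a real cluster defines its own.
STORAGE_CLASS_BY_WORKLOAD = {
    "database": "rook-ssd",  # assumed SSD-backed pool for high IOPS
    "web": "rook-hdd",       # assumed HDD-backed pool
    "backup": "rook-hdd",
}


def build_pvc(name: str, workload: str, size: str) -> dict:
    """Return a PersistentVolumeClaim manifest for the given workload type."""
    storage_class = STORAGE_CLASS_BY_WORKLOAD[workload]
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON so it can be saved to a file and applied.
    print(json.dumps(build_pvc("mysql-data", "database", "20Gi"), indent=2))
```

Keeping the workload-to-class mapping in one place means an application only states what kind of workload it is, and the choice between SSD and HDD pools stays a cluster-level decision.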

 

How to Adopt Rook?

Versions used –

  • Ubuntu 16.04
  • Kubernetes 10.0.1
  • Ceph 12.2.4

These are the steps to get the installation up and running –

  • Build a Kubernetes Cluster.
  • Run a Docker Repository.
  • Build Rook.
  • Build Ceph.
  • Run a Rook Cluster.
  • Create a Rook Installation.
  • Create Block Storage.
  • Test Block Storage.
  • Create a File System.
  • Test File System.
  • Use Block Storage and the Filesystem on Ceph Rook installation.

Why Rook Matters?

  • Rook lets Ceph run on Kubernetes. Rook is a cloud-native storage orchestrator. It extends Kubernetes with custom types and controllers. It automates deployment, bootstrapping, provisioning, scaling, upgrading, migration, and disaster recovery.
  • The Cloud Native Computing Foundation (CNCF) hosts it.
  • It is open source under the Apache 2.0 license.
  • The Rook Operator defines the desired state for the storage cluster, pools, object stores, etc.
  • The Operator runs reconciliation loops: it watches for changes in the cluster and applies changes so that the actual state matches the desired state.
  • The Operator leverages the full power of Kubernetes, including Services, ReplicaSets, DaemonSets, and Secrets.
  • It contains all the logic used to manage Ceph at scale, handles stateful upgrades, and handles rebalancing of the cluster.
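The reconciliation idea mentioned above can be sketched in a few lines of Python. This is a toy model, not Rook's actual code: the ClusterState type, its two fields, and the action strings are assumptions made for illustration. It only shows the pattern of comparing desired state with observed state and deriving the changes to apply.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterState:
    """Toy model of a storage cluster: counts of two daemon types only."""
    monitors: int
    osds: int


def plan_actions(desired: ClusterState, observed: ClusterState) -> List[str]:
    """Compare desired and observed state and list the changes needed."""
    actions = []
    for field in ("monitors", "osds"):
        diff = getattr(desired, field) - getattr(observed, field)
        if diff > 0:
            actions.append(f"add {diff} {field}")
        elif diff < 0:
            actions.append(f"remove {-diff} {field}")
    return actions


def reconcile(desired: ClusterState, observed: ClusterState) -> ClusterState:
    """One pass of the loop: apply the plan and return the new observed state."""
    for action in plan_actions(desired, observed):
        print("applying:", action)  # a real operator would call the Kubernetes API here
    return desired


if __name__ == "__main__":
    desired = ClusterState(monitors=3, osds=6)
    observed = ClusterState(monitors=3, osds=4)
    observed = reconcile(desired, observed)
    # A second pass finds nothing to do, because the states now match.
    assert plan_actions(desired, observed) == []
```

An operator repeats this pass whenever it sees a change, which is why a failed disk or a changed configuration is corrected without manual steps.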

Best Practices of Software Defined Storage – Rook

The basic setup creates a Ceph cluster whose block, object, and file storage can be consumed by other pods running in the cluster. It requires Kubernetes v1.8 or higher. If you use dataDirHostPath to persist Rook data on Kubernetes hosts, make sure each host has at least 5GB available at that path.
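These two prerequisites can be checked before installing. The following Python sketch is one possible preflight check, written under assumptions: the function names are hypothetical, the version string is expected in the form printed by Kubernetes (for example v1.8.0), and 5GB is interpreted as 5 GiB.

```python
import re
import shutil

MIN_KUBERNETES = (1, 8)         # minimum version stated in the text
MIN_FREE_BYTES = 5 * 1024 ** 3  # 5GB, interpreted here as 5 GiB (an assumption)


def parse_version(version: str) -> tuple:
    """Extract (major, minor) from a string such as 'v1.8.0' or '1.10.1'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def kubernetes_ok(version: str) -> bool:
    """True if the version is at least the minimum required."""
    return parse_version(version) >= MIN_KUBERNETES


def has_enough_space(path: str, min_free: int = MIN_FREE_BYTES) -> bool:
    """True if the filesystem holding `path` has at least `min_free` bytes free."""
    return shutil.disk_usage(path).free >= min_free


if __name__ == "__main__":
    print("Kubernetes version ok:", kubernetes_ok("v1.8.0"))
    print("Enough free space here:", has_enough_space("."))
```

Versions are compared as integer tuples rather than strings, so that a minor version of 10 correctly sorts after 8.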

A simple Rook cluster can be created with kubectl in two steps –

  • Deploy the Rook Operator.
  • Create the Rook cluster.
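As a rough illustration of these two steps, the Python sketch below builds the kubectl invocations in order without running them. The manifest file names are hypothetical placeholders, not names taken from this article; substitute the operator and cluster manifests that ship with the Rook release you use.

```python
# Hypothetical manifest file names; use the files from your Rook release.
OPERATOR_MANIFEST = "rook-operator.yaml"
CLUSTER_MANIFEST = "rook-cluster.yaml"


def rook_setup_commands(operator_manifest: str = OPERATOR_MANIFEST,
                        cluster_manifest: str = CLUSTER_MANIFEST) -> list:
    """Return the kubectl commands in the order they should be run."""
    return [
        ["kubectl", "create", "-f", operator_manifest],  # step 1: deploy the operator
        ["kubectl", "create", "-f", cluster_manifest],   # step 2: create the cluster
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch is safe to run.
    for command in rook_setup_commands():
        print(" ".join(command))
```

The order matters in this sketch because the operator is the component that acts on the cluster definition, so it is deployed first.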

Best Practices for Storage

  • Block – Create block storage to be consumed by Pods.
  • Object – Create an object store that is accessible from inside as well as outside the cluster.
  • Shared File System – Create a file system to be shared across multiple pods.
  • Ceph Dashboard – View the status of the cluster.
  • Tools – A toolbox container having the full suite of Ceph clients for debugging and troubleshooting the Rook Cluster.
  • Monitoring – Each Rook Cluster has built-in exporters for monitoring with Prometheus.
  • Teardown – Clean up the cluster when you are done testing it.

Concluding Rook and Software Defined Storage

Rook is software-defined storage designed solely for Kubernetes to provide storage to stateful containers. Rook deploys all Ceph service daemons in containers, including the monitor, manager, RGW, MDS, and OSDs. Ceph is a petabyte-scale storage system that provides data replication and recovery without any human intervention. It provides all three storage services (block storage, object storage, and file system), making it an all-in-one storage solution. We at Xenonstack help enterprises with the development and delivery of software-defined storage-based solutions so that they can benefit from this new technology model. Before that, you are advised to take the steps below: