CockroachDB Architecture and Performance Overview

What is CockroachDB?

CockroachDB is a distributed SQL database built on a transactional and strongly consistent key-value store. It supports ACID transactions and provides a SQL API for structuring, manipulating, and querying data. It is completely open source. CockroachDB scales horizontally and survives disk, machine, rack, and even data center failures with minimal latency disruption and no manual intervention.


Why is it important?

It offers fully distributed ACID transactions, zero-downtime schema changes, and support for secondary indexes and foreign keys. It provides scale without sacrificing SQL functionality. It also supports a JSON data type for storing semi-structured (NoSQL-style) data.

What are the key features?

Simplified Deployment
Strong Consistency
Support of SQL
Distributed Transactions
Automated Scaling and Repair
High Availability
Open Source

How does it work?

A cluster is brought up with two commands -

cockroach start with a --join flag listing all of the initial nodes in the cluster, so the process knows all of the other machines it can communicate with.
cockroach init to perform a one-time initialization of the cluster.

Once the process is running, developers interact with CockroachDB through a SQL API. Because all nodes behave symmetrically, SQL requests can be sent to any available node, which makes CockroachDB easy to integrate with load balancers. After receiving a SQL request, a node converts it into operations against the distributed key-value store. As these requests start filling the cluster with data, CockroachDB distributes the data among the available nodes, breaking it up into 64 MiB chunks known as ranges.

Each range is replicated to at least three nodes. This way, if a node goes down, copies of the data are still available to serve reads and writes and to re-replicate the data to other nodes. If a node receives a read or write request that it cannot serve directly, CockroachDB finds the node that can handle the request and communicates with it. Because CockroachDB tracks where the data lives, every node can behave symmetrically.

Any change made to the data in a range relies on the Raft consensus algorithm to ensure that a majority of its replicas agree to commit the change. This provides strong isolation guarantees and gives applications consistent reads regardless of which node they communicate with. Ultimately, data is read from and written to disk using an efficient storage engine, which keeps track of each value's timestamp. This makes it possible to support the AS OF SYSTEM TIME clause, which queries historical data as of a given point in time.
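To make the timestamp idea concrete, here is a minimal sketch of a multi-version key-value store in Python. It is an illustration only, not CockroachDB's storage engine: the VersionedStore class, its method names, and the integer timestamps are all assumptions made for this example.

```python
import bisect


class VersionedStore:
    """Toy multi-version key-value store: every write keeps its timestamp.

    Hypothetical illustration; not CockroachDB's actual storage engine.
    """

    def __init__(self):
        # key -> (sorted list of timestamps, values in the same order)
        self._versions = {}

    def put(self, key, value, ts):
        timestamps, values = self._versions.setdefault(key, ([], []))
        i = bisect.bisect_right(timestamps, ts)
        timestamps.insert(i, ts)
        values.insert(i, value)

    def get(self, key, as_of):
        """Return the newest value written at or before `as_of`, else None."""
        if key not in self._versions:
            return None
        timestamps, values = self._versions[key]
        i = bisect.bisect_right(timestamps, as_of)
        return values[i - 1] if i else None


store = VersionedStore()
store.put("balance", 100, ts=10)
store.put("balance", 250, ts=20)
print(store.get("balance", as_of=15))  # 100 (the value as of timestamp 15)
print(store.get("balance", as_of=25))  # 250
print(store.get("balance", as_of=5))   # None (nothing written yet)
```

Because old versions are kept rather than overwritten, a read "as of" an earlier timestamp only has to pick the newest version that is not later than that timestamp.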


How does it scale?

It scales horizontally with minimal operator overhead. It runs on a local computer, a single server, a corporate development cluster, or a private or public cloud. Adding capacity in CockroachDB is as easy as pointing a new node at the running cluster. At the key-value level, it starts with a single, empty range. As data is put into the database, this single range eventually reaches a threshold size (64 MiB by default).

When that happens, the data is split into two ranges, each covering a contiguous segment of the entire key-value space. This process continues indefinitely: as new data flows in, existing ranges keep splitting into new ranges, aiming to keep the range size relatively small and consistent. When a cluster spans multiple nodes, newly split ranges are automatically rebalanced to nodes with more capacity. CockroachDB communicates opportunities for rebalancing using a peer-to-peer gossip protocol, by which nodes exchange network addresses, store capacity, and other information.
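The splitting behavior can be sketched with a toy model in Python. This is a simplified illustration under stated assumptions, not CockroachDB's implementation: the Range and Keyspace classes are hypothetical, sizes are tracked per key, and the split point is simply the median key.

```python
import bisect

MAX_RANGE_BYTES = 64 * 1024 * 1024  # default threshold quoted in the text


class Range:
    """A contiguous span of keys, starting at start_key (inclusive)."""

    def __init__(self, start_key):
        self.start_key = start_key
        self.rows = {}  # key -> size in bytes

    def size(self):
        return sum(self.rows.values())


class Keyspace:
    """Hypothetical model: ordered ranges that split when they grow too big."""

    def __init__(self, max_bytes=MAX_RANGE_BYTES):
        self.max_bytes = max_bytes
        self.ranges = [Range("")]  # one empty range covering every key

    def find(self, key):
        """Index of the range whose span contains `key`."""
        starts = [r.start_key for r in self.ranges]
        return bisect.bisect_right(starts, key) - 1

    def put(self, key, nbytes):
        i = self.find(key)
        rng = self.ranges[i]
        rng.rows[key] = nbytes
        if rng.size() > self.max_bytes and len(rng.rows) > 1:
            self._split(i)

    def _split(self, i):
        # Assumption for this sketch: split at the median key.
        rng = self.ranges[i]
        keys = sorted(rng.rows)
        half = len(keys) // 2
        right = Range(keys[half])
        for k in keys[half:]:
            right.rows[k] = rng.rows.pop(k)
        self.ranges.insert(i + 1, right)


ks = Keyspace(max_bytes=100)  # tiny threshold so splits are easy to see
for n in range(10):
    ks.put("k%02d" % n, 30)
print([r.start_key for r in ks.ranges])  # ['', 'k02', 'k04', 'k06', 'k08']
```

Each range is identified by its start key, so finding the range for a key is a binary search over the sorted start keys.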

How does it survive failures?

It is designed to survive hardware and software failures, from server restarts to data center outages. It accomplishes this, without the confusing artifacts typical of other distributed systems, through replication and automated repair after failures.

Replication

It replicates data for availability and guarantees consistency between replicas using the Raft consensus algorithm. Replicas can be placed in several ways, depending on the failures to be tolerated -

Different servers within a rack to tolerate server failures.
Different servers on different racks within a data center to tolerate rack power/network failures.
Different servers in different datacenters to endure large-scale network or power outages.

Replication itself affects database performance: the higher the round-trip latency (the delay in transferring data) between data centers, the lower the database performance, and vice versa.
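The effect of round-trip latency can be estimated with a small Python sketch. It assumes a simplified model in which the replica coordinating a write sends it to all followers in parallel and waits until a majority (counting itself) has acknowledged it; disk and processing time are ignored. The function name and the sample latencies are hypothetical.

```python
def quorum_commit_latency_ms(follower_rtts_ms):
    """Estimate how long a write waits for acknowledgement by a majority.

    Simplified model (an assumption of this sketch): the coordinating
    replica counts as one vote, followers are contacted in parallel, and
    only network round-trip time is considered.
    """
    replicas = len(follower_rtts_ms) + 1
    quorum = replicas // 2 + 1
    needed = quorum - 1  # acknowledgements required from followers
    if needed == 0:
        return 0.0
    # The write commits once the `needed`-th fastest follower has replied.
    return float(sorted(follower_rtts_ms)[needed - 1])


# Three replicas in one data center: the fastest follower decides.
print(quorum_commit_latency_ms([1, 2]))           # 1.0
# Three replicas in three distant data centers.
print(quorum_commit_latency_ms([70, 140]))        # 70.0
# Five replicas, two followers nearby and two far away.
print(quorum_commit_latency_ms([1, 2, 70, 140]))  # 2.0
```

In this model the slowest replicas do not hold up a write, but at least one round trip to the nearest majority is always paid, which is why distance between data centers shows up directly in write latency.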

Automated Repair

For short-term failures (such as a server restart), it uses Raft to continue seamlessly as long as a majority of replicas remain available. For longer-term failures (a server or rack going down for an extended period, or a data center outage), CockroachDB automatically re-replicates the replicas from the missing nodes, using the unaffected replicas as sources. Capacity information from the gossip network is used to identify new locations among the available nodes in the cluster. Moreover, re-replication is performed in a distributed fashion, so it uses the aggregate disk and network bandwidth of the cluster.
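The "majority of replicas" rule and the choice of a new location can be sketched in Python as follows. The quorum arithmetic is standard for majority-based consensus; pick_repair_target is a hypothetical heuristic written for this example (most free space wins) and is not CockroachDB's actual allocator.

```python
def quorum(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be lost while a majority still remains."""
    return replicas - quorum(replicas)


def pick_repair_target(free_bytes_by_node, nodes_with_replica, dead_nodes):
    """Hypothetical heuristic: choose the live node with the most free space
    that does not already hold a replica of the range.

    Returns None if no node qualifies.
    """
    candidates = {
        node: free
        for node, free in free_bytes_by_node.items()
        if node not in nodes_with_replica and node not in dead_nodes
    }
    return max(candidates, key=candidates.get) if candidates else None


for n in (3, 5, 7):
    print(n, "replicas: quorum", quorum(n), "tolerates", tolerated_failures(n))

# Node n2 held a replica and died; n4 has the most free space among the rest.
capacity = {"n1": 40, "n2": 10, "n3": 25, "n4": 80, "n5": 60}
print(pick_repair_target(capacity, nodes_with_replica={"n1", "n3"}, dead_nodes={"n2"}))  # n4
```

With three replicas one failure is tolerated, with five replicas two, and with seven replicas three.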

CockroachDB vs MongoDB vs PostgreSQL

	MongoDB	PostgreSQL	CockroachDB
Automated Scaling	Yes	No	Yes
Automated Failover	Yes	Optional	Yes
Automated Repair	Yes	No	Yes
Strongly Consistent Replication	No	No	Yes
Consensus-Based Replication	No	No	Yes
Distributed Transactions	No	No	Yes
ACID Semantics	Document-only	Yes	Yes
Eventually Consistent Reads	Yes	Yes	No
SQL	No	Yes	Yes
Commercial Version	Optional	No	Optional
Open Source	Yes	Yes	Yes
Support	Full	Full	Full

How to install?

The steps to install CockroachDB are described below:

Download The Binary

Download its archive for Linux, and extract the binary -

$ wget -qO- https://binaries.cockroachdb.com/cockroach-v2.0.6.linux-amd64.tgz | tar xvz

Copy the binary into your PATH so it's easy to execute cockroach commands from any shell -

$ cp -i cockroach-v2.0.6.linux-amd64/cockroach /usr/local/bin

Use Docker


Install Docker for Linux.

Confirm that the Docker daemon is running in the background -

$ docker version

Pull the image for the v2.0.6 release of CockroachDB from Docker Hub -

$ sudo docker pull cockroachdb/cockroach:v2.0.6

How to get started?

Below are the steps to get started with CockroachDB Clusters:

Start First Node

Command to execute -


$ cockroach start --insecure --host=localhost

In this command -

The --insecure flag makes communication unencrypted.
The cockroach-data directory stores the node's data.
--host=localhost tells the node to listen only on localhost, with the default ports used for internal and client traffic (26257) and for HTTP requests from the Admin UI (8080).

Add more nodes to the cluster

Execute commands in two new terminals to add two more nodes -


$ cockroach start --insecure --store=node2 --host=localhost --port=26258 --http-port=8081 --join=localhost:26257

$ cockroach start --insecure --store=node3 --host=localhost --port=26259 --http-port=8082 --join=localhost:26257
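The pattern in these commands (each extra node gets its own store directory, the next port, the next HTTP port, and a --join pointing at the first node) can be captured in a small Python helper. This is a convenience sketch only: the start_command function is hypothetical, and it uses just the flags shown above for v2.0.6, which may differ in other releases.

```python
import shlex


def start_command(index, host="localhost", base_port=26257, base_http_port=8080):
    """Return the argv for the index-th local node (0 is the first node).

    Hypothetical helper; it only reproduces the flags used in this article.
    """
    argv = ["cockroach", "start", "--insecure"]
    if index > 0:
        argv.append(f"--store=node{index + 1}")
    argv.append(f"--host={host}")
    if index > 0:
        argv += [
            f"--port={base_port + index}",
            f"--http-port={base_http_port + index}",
            f"--join={host}:{base_port}",
        ]
    return argv


# Print the three commands used in this walkthrough.
for i in range(3):
    print(shlex.join(start_command(i)))
```

The helper only builds the command lines; each one still has to be run in its own terminal as described above.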

Test the cluster by adding data using SQL

Open a new terminal and start CockroachDB's built-in SQL shell with the command -


$ cockroach sql --insecure

Then create a database using SQL commands, create a table in that database, and insert data into the table. To check that the data is visible from the other nodes, open a SQL shell against another node by executing this command in a new terminal -

  
$ cockroach sql --insecure --port=26258

Stop the cluster

Once you have finished testing the cluster, switch to the terminal running the first node and press CTRL-C to stop the node. At this point, with two nodes still online, the cluster remains operational because a majority of replicas are available. To verify that the cluster has tolerated this "failure," open the built-in SQL shell on node 2 or 3, in the same terminal or a new one. If you want a fresh cluster for further testing, stop the remaining nodes and execute the command below to remove the nodes' data -

$ rm -rf cockroach-data node2 node3

Overview of Admin UI

The Admin UI provides details about your cluster and database configuration, along with real-time metrics for monitoring the following areas -

Node Map
Cluster Health
Hardware Metrics
Runtime Metrics
SQL Performance
Storage Utilization
Replication Details
Node Details
Database Details
Statement Details


Concluding

CockroachDB is a fully distributed SQL database that is easy to get started with. It works well with container orchestrators such as Kubernetes, and it is compatible with PostgreSQL. It is fault tolerant and offers sophisticated replica placement rules. It is cloud-ready and fully open source. CockroachDB is an excellent database to use.


Navdeep Singh Gill
