Apache HBase Architecture, Securing HBase with Kerberos

Overview of Apache HBase

Apache HBase is a column-oriented NoSQL database. It can look similar to a relational database, with tables, rows, and columns, but it stores data in a column-oriented layout. HBase is an open-source, distributed, multi-dimensional database written in Java. It provides BigTable-like capabilities and runs on top of HDFS (Hadoop Distributed File System). When an application needs fast, random access to data, HBase is a strong choice because it offers high throughput and low latency for read and write operations. Data in HBase is organized as keys and values: each key points to a value, which is stored as an array of bytes (for example, an encoded string). Large data sets can therefore be stored in HBase and shared across applications.
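
To make the key/value description above more concrete, here is a minimal Python sketch of the logical layout. It is a toy in-memory model, not the HBase client API: the names `table`, `put`, and `get`, and the sample row key and columns, are assumptions made for illustration, and it ignores timestamps, versions, and persistence.

```python
# Toy model of HBase's logical layout: row key -> "family:qualifier" -> value.
# Keys and values are raw bytes, as described in the overview.
from typing import Dict, Optional

Row = Dict[bytes, bytes]          # column ("family:qualifier") -> cell value
table: Dict[bytes, Row] = {}      # row key -> row


def put(row_key: bytes, column: bytes, value: bytes) -> None:
    """Store one cell, creating the row on first write."""
    table.setdefault(row_key, {})[column] = value


def get(row_key: bytes, column: bytes) -> Optional[bytes]:
    """Random read of one cell; returns None if the cell is absent."""
    return table.get(row_key, {}).get(column)


# Strings and numbers must be encoded to bytes before they are stored.
put(b"user#1001", b"info:name", "Ada".encode("utf-8"))
put(b"user#1001", b"info:age", (36).to_bytes(4, "big"))

print(get(b"user#1001", b"info:name"))                        # b'Ada'
print(int.from_bytes(get(b"user#1001", b"info:age"), "big"))  # 36
```

The point of the sketch is that the store itself only sees bytes; encoding and decoding are the application's responsibility.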

Guide to Apache HBase Architecture

The HBase architecture has three main components:

  • HMaster
  • Region Server
  • ZooKeeper

HMaster

HMaster is the master server of an HBase cluster. It monitors all the region servers in the cluster, assigns regions to region servers, and performs DDL operations such as creating and deleting tables. It also runs several background threads and handles load balancing and failover.

Region Server

Region servers run on the HDFS DataNodes of the Hadoop cluster. HBase tables are divided horizontally, by row key range, into regions, and each region holds the column families for its range of rows. In older HBase releases the default maximum region size was 256 MB; newer releases use a much larger default (10 GB, controlled by hbase.hregion.max.filesize). A region server handles read and write requests for the regions assigned to it and is responsible for managing those regions.
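
The row-key-range split described above can be illustrated with a short Python sketch. The split points, the server names, and the function `locate_region` are all hypothetical; real HBase keeps this mapping in its catalog table rather than in a list.

```python
from bisect import bisect_right

# Hypothetical split points: region i covers [start_keys[i], start_keys[i + 1]).
# The first region starts at the empty key, the last one is open-ended.
region_start_keys = [b"", b"g", b"n", b"t"]
region_servers = ["rs1", "rs2", "rs3", "rs1"]  # made-up region server names


def locate_region(row_key: bytes) -> int:
    """Return the index of the region whose key range contains row_key."""
    return bisect_right(region_start_keys, row_key) - 1


for key in (b"apple", b"g", b"zebra"):
    idx = locate_region(key)
    print(key, "-> region", idx, "on", region_servers[idx])
# b'apple' -> region 0 on rs1
# b'g' -> region 1 on rs2
# b'zebra' -> region 3 on rs1
```

Because rows are sorted by key and each region owns a contiguous range, a binary search over the start keys is enough to find the region for any row.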

Apache ZooKeeper

ZooKeeper acts as the coordinator in HBase. It provides services such as maintaining configuration information, naming, distributed synchronization, and server failure notification. Clients contact ZooKeeper first to find out where the data they need is served, and then communicate with the region servers directly.


Securing Apache HBase with Kerberos

Install the mapr-hbase-master and mapr-hbase-regionserver packages on the cluster.

Then perform the following steps on the HBase nodes:

  • Install the krb5 package and configure Kerberos.
  • Set up the HBase Kerberos principal mapr/@. Each node has its own keytab and Kerberos identity.
  • Generate the hbase.keytab file for the HBase Kerberos principal.
  • Copy the hbase.keytab file to the /opt/mapr/conf directory.
  • Change the ownership of the keytab file with chown.
  • Set the permissions of the keytab file to 600 with chmod.
  • Update the hbase-site.xml file by adding the following properties, replacing <KERBEROS_REALM> with your Kerberos realm.

<property>
<name>hbase.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hbase.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hbase.rpc.engine</name>
<value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
<property>
<name>hbase.regionserver.kerberos.principal</name>
<value>mapr/_HOST@<KERBEROS_REALM></value>
</property>
<property>
<name>hbase.master.kerberos.principal</name>
<value>mapr/_HOST@<KERBEROS_REALM></value>
</property>
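
The keytab ownership and permission steps from the list above could be scripted along these lines. This is a minimal Python sketch, not part of the MapR tooling: the helper name `lock_down_keytab` is made up for illustration, the owner is left as an optional parameter because the text does not name the account, and the demo runs against a temporary file rather than /opt/mapr/conf/hbase.keytab.

```python
import os
import shutil
import stat
import tempfile
from typing import Optional


def lock_down_keytab(path: str, owner: Optional[str] = None,
                     group: Optional[str] = None) -> int:
    """Optionally chown the keytab, then restrict it to mode 600.

    Returns the resulting permission bits so callers can verify them.
    """
    if owner is not None or group is not None:
        # Equivalent to `chown owner:group path`; usually requires root.
        shutil.chown(path, user=owner, group=group)
    os.chmod(path, 0o600)  # owner read/write only, like `chmod 600`
    return stat.S_IMODE(os.stat(path).st_mode)


# Demo on a throwaway file standing in for hbase.keytab (POSIX systems).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name
os.chmod(demo_path, 0o644)
print(oct(lock_down_keytab(demo_path)))
os.remove(demo_path)
```

Returning the permission bits lets a deployment script assert that the keytab really ended up readable by its owner only.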

On a MapR cluster with security features enabled, replace the ${SIMPLE_LOGIN_OPTS} value of the MAPR_HBASE_SERVER_OPTS property with ${KERBEROS_LOGIN_OPTS}, and the value of the MAPR_HBASE_CLIENT_OPTS property with ${HYBRID_LOGIN_OPTS}. Also remove the -Dzookeeper.sasl.client=false option from the definition of MAPR_HBASE_CLIENT_OPTS. These properties are located in the /opt/mapr/conf/env.sh file.

On a MapR cluster with security features disabled, replace the ${SIMPLE_LOGIN_OPTS} value of the MAPR_HBASE_SERVER_OPTS and MAPR_HBASE_CLIENT_OPTS properties in the /opt/mapr/conf/env.sh file with ${KERBEROS_LOGIN_OPTS}.
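
The two env.sh substitutions above can be summarized in a small Python sketch. It only models the text replacement: the function `rewrite_env` and the two sample lines are hypothetical, and the real contents of /opt/mapr/conf/env.sh may look different, so treat this as an illustration of the rule rather than a tool to run against a live cluster. The leading `-D` on the ZooKeeper option is assumed to be the usual JVM system-property syntax.

```python
SIMPLE = "${SIMPLE_LOGIN_OPTS}"
KERBEROS = "${KERBEROS_LOGIN_OPTS}"
HYBRID = "${HYBRID_LOGIN_OPTS}"


def rewrite_env(env_text: str, secure_cluster: bool) -> str:
    """Apply the login-option substitutions for a secure or non-secure cluster."""
    client_opts = HYBRID if secure_cluster else KERBEROS
    result = []
    for line in env_text.splitlines():
        if "MAPR_HBASE_SERVER_OPTS=" in line:
            line = line.replace(SIMPLE, KERBEROS)
        elif "MAPR_HBASE_CLIENT_OPTS=" in line:
            line = line.replace(SIMPLE, client_opts)
            if secure_cluster:
                # On a secure cluster the SASL opt-out flag is removed as well.
                line = line.replace(" -Dzookeeper.sasl.client=false", "")
        result.append(line)
    return "\n".join(result)


# Hypothetical sample lines, not copied from a real env.sh.
sample = (
    'export MAPR_HBASE_SERVER_OPTS="${SIMPLE_LOGIN_OPTS}"\n'
    'export MAPR_HBASE_CLIENT_OPTS="${SIMPLE_LOGIN_OPTS} -Dzookeeper.sasl.client=false"'
)

print(rewrite_env(sample, secure_cluster=True))
print(rewrite_env(sample, secure_cluster=False))
```

With `secure_cluster=True` the client line ends up with ${HYBRID_LOGIN_OPTS} and no ZooKeeper flag; with `secure_cluster=False` both lines use ${KERBEROS_LOGIN_OPTS} and the flag is left untouched.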

On the HBase RegionServer nodes, add the following properties to hbase-site.xml:

<property>
<name>hbase.regionserver.keytab.file</name>
<value>/opt/mapr/conf/hbase.keytab</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
<value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>

Make the corresponding update in hbase-site.xml on the HBase Master node:

<property>
<name>hbase.master.keytab.file</name>
<value>/opt/mapr/conf/hbase.keytab</value>
</property>
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
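
Before restarting, it can be useful to check that the security properties actually made it into hbase-site.xml. The following Python sketch shows one way to do that with the standard library. The helper names, the `REQUIRED` table, and the EXAMPLE.COM realm in the sample are assumptions for illustration; the sample also wraps the properties in a `<configuration>` root element so that it parses as a complete XML document.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Settings taken from the properties listed earlier in this article.
REQUIRED = {
    "hbase.security.authentication": "kerberos",
    "hbase.security.authorization": "true",
}


def read_properties(xml_text: str) -> Dict[str, str]:
    """Collect <name>/<value> pairs from every <property> element."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_security_settings(props: Dict[str, str]) -> List[str]:
    """Return the required property names that are absent or set differently."""
    return [name for name, expected in REQUIRED.items()
            if props.get(name) != expected]


# Hypothetical, shortened configuration used only for this demo.
sample_site = """
<configuration>
  <property>
    <name>hbase.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hbase.master.kerberos.principal</name>
    <value>mapr/_HOST@EXAMPLE.COM</value>
  </property>
</configuration>
"""

props = read_properties(sample_site)
print(missing_security_settings(props))  # ['hbase.security.authorization']
```

The demo deliberately leaves out hbase.security.authorization to show how a missing setting is reported.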

Restart the HBase Master and RegionServer nodes.

A Distributed Approach

A distributed and scalable platform such as HBase gives enterprises real-time read/write access to large datasets, which in turn helps them improve consistency and scalability.

