Pseudo-Distributed Cluster Formation in HBase 0.94.x

Hi HBase Listeners,

This article provides some basic information about installing an HBase pseudo-distributed cluster. Follow the steps to get this NoSQL database running on your machine and start processing.


HBase Cluster Setup:
HBase pseudo-distributed cluster on an Ubuntu machine

Step 1:
Download HBase from its official site and untar it.

Download hbase-*.*.tar.gz
$tar -zxvf hbase-*.*.tar.gz

$cd hbase-*.*/conf

Hadoop is needed on the machine where you are going to install HBase. If you already have Hadoop on your machine, the SSH server setup is probably complete; if not, you have to do the SSH configuration steps. You can find this SSH configuration in any of our Hadoop installation materials.
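For quick reference, a minimal passwordless-SSH setup on Ubuntu might look like the following (a sketch, assuming the openssh-server package and an RSA key; adjust to your setup):

$sudo apt-get install openssh-server
$ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
$cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$chmod 600 ~/.ssh/authorized_keys
$ssh localhost        # should log in without a password prompt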

Configure the properties.

The following are the important properties to change:

hbase.rootdir
The directory shared by region servers and into which HBase persists. The URL should be ‘fully-qualified’ to include the filesystem scheme. For example, to specify the HDFS directory ‘/hbase’ where the HDFS instance’s namenode is running at namenode.example.org on port 9000, set this value to: hdfs://namenode.example.org:9000/hbase. By default, HBase writes to whatever ${hbase.tmp.dir} is set to (usually /tmp), so change this configuration or else all data will be lost on machine restart.

hbase.master.port
The port the HBase Master should bind to.

hbase.cluster.distributed
The mode the cluster will be in. Possible values are false for standalone mode and true for distributed mode. If false, startup will run all HBase and ZooKeeper daemons together in the one JVM.

hbase.zookeeper.quorum
Comma-separated list of servers in the ZooKeeper ensemble (this config should have been named hbase.zookeeper.ensemble). For example, “host1.mydomain.com,host2.mydomain.com,host3.mydomain.com”. By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper ensemble servers. If HBASE_MANAGES_ZK is set in hbase-env.sh, this is the list of servers on which HBase will start/stop ZooKeeper as part of cluster start/stop. Client-side, this list of ensemble members is combined with the hbase.zookeeper.clientPort config and passed into the ZooKeeper constructor as the connectString parameter.

hbase.zookeeper.property.maxClientCnxns
Property from ZooKeeper’s config zoo.cfg. Limit on the number of concurrent connections (at the socket level) that a single client, identified by IP address, may make to a single member of the ZooKeeper ensemble. Set it high to avoid ZooKeeper connection issues when running standalone and pseudo-distributed.

$vi hbase-site.xml

All of the properties below go inside the <configuration> element:

<configuration>

<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:50000/hbase</value>
</property>

<property>
<name>hbase.master.port</name>
<value>60001</value>
</property>

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>

<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>

<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>35</value>
</property>

</configuration>
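Note that the NameNode address and port in hbase.rootdir (hdfs://localhost:50000 above) must match the fs.default.name value in Hadoop’s core-site.xml. Assuming the configuration above, that entry would look like:

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:50000</value>
</property>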

Set the Java path in the HBase environment file:

$vi hbase-env.sh

export JAVA_HOME=/home/bigdata/jdk1.6.0_(JavaVersion)
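If you are unsure of the exact JDK path on your machine, one way to find it (assuming java is on your PATH) is:

$readlink -f $(which java)

The output ends in /bin/java; drop that suffix to get the JAVA_HOME value.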

Change the host entry:

$sudo vi /etc/hosts

Keep the existing 127.0.0.1 localhost line as it is, and change the 127.0.1.1 entry (Ubuntu’s default loopback alias for the hostname, which can confuse HBase and ZooKeeper) to 127.0.0.1:

127.0.1.1 ——change this to—— 127.0.0.1

Start the Hadoop cluster
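With a Hadoop 1.x installation (the generation HBase 0.94 is built against), this is typically done from the Hadoop directory:

$bin/start-all.sh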

Start the HBase cluster

$bin/start-hbase.sh

The jps command below lists the running JVM processes on the machine:
$jps

Now you will find three HBase daemons running on your machine:
HMaster
HRegionServer
HQuorumPeer
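The Hadoop daemons should be running alongside them. A typical jps listing might look like this (PIDs will differ; this sketch assumes a Hadoop 1.x pseudo-distributed setup):

$jps
2101 NameNode
2245 DataNode
2398 SecondaryNameNode
2552 JobTracker
2701 TaskTracker
3010 HQuorumPeer
3120 HMaster
3290 HRegionServer
3405 Jps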

—————————————-
Browser: http://localhost:60010 (the HBase Master web UI)

Start the HBase CLI

$bin/hbase shell

hbase> list

OK
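As a further smoke test, you can create, write to, scan, and drop a table from the shell (the table and column-family names here are arbitrary examples):

hbase> create 'test', 'cf'
hbase> put 'test', 'row1', 'cf:a', 'value1'
hbase> scan 'test'
hbase> disable 'test'
hbase> drop 'test'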
—————————–
Stop the HBase cluster
$bin/stop-hbase.sh
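Afterwards, stop Hadoop as well (again assuming a Hadoop 1.x layout, from the Hadoop directory):

$bin/stop-all.sh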

———————————-

Article written by DataDotz Team

DataDotz is a Chennai-based BigData team primarily focused on consulting and training in technologies such as Apache Hadoop, Apache Spark, NoSQL (HBase, Cassandra, MongoDB), Search, and Cloud Computing.

Note: DataDotz also provides classroom-based Apache Kafka training in Chennai. The course includes Cassandra, MongoDB, Scala, and Apache Spark training. For more details related to Apache Spark training in Chennai, please visit http://datadotz.com/training/