High Level Steps
=================
1. Download and install Apache Hadoop.
2. Download and install HBase.

Note:

* There is no need to install a separate ZooKeeper. HBase can run and manage
  its own ZooKeeper instance when HBASE_MANAGES_ZK is set to true in
  hbase-env.sh.

Installing Hadoop
===================
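Use a core-site.xml like the one below::

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hdfs/hadoop-tmp-dir</value>
      </property>
    </configuration>
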
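Use an hdfs-site.xml like the following::

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-bind-host</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/hdfs/base/dfs/data</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/hdfs/base/dfs/name</value>
      </property>
    </configuration>
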
Note: If fs.defaultFS (formerly fs.default.name) does not specify a port, the
NameNode listens on its default RPC port, typically 8020. Including a port in
the property, such as 9000 above, forces that specific port.
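
With both files in place, a first start of HDFS might look like the sketch
below. It assumes the Hadoop bin and sbin directories are on PATH and that the
commands run as the user owning /home/hdfs. The function name init_hdfs is
hypothetical. Formatting erases NameNode metadata, so the sketch passes
-nonInteractive, which makes the format step abort rather than overwrite an
existing name directory.

```shell
# Hypothetical helper: format the NameNode once, start HDFS, and show its state.
# Assumes Hadoop's bin/ and sbin/ directories are on PATH.
init_hdfs() {
    # Bail out early if the Hadoop commands are not available.
    command -v hdfs >/dev/null 2>&1 || { echo "hdfs not found on PATH" >&2; return 1; }

    # Print the file system URI Hadoop resolved from core-site.xml.
    hdfs getconf -confKey fs.defaultFS || return 1

    # -nonInteractive aborts instead of prompting if the name directory exists.
    hdfs namenode -format -nonInteractive || return 1

    # Start the NameNode and DataNode daemons, then report cluster state.
    start-dfs.sh || return 1
    hdfs dfsadmin -report
}

init_hdfs || echo "HDFS start-up skipped or failed" >&2
```

Because the format step refuses to touch an existing name directory, this is
only a first-run sketch. On later starts, start-dfs.sh alone is enough.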

Installing HBase
================
hbase-env.sh
--------------
Edit conf/hbase-env.sh and uncomment the following line::

    export HBASE_MANAGES_ZK=true

This makes HBase start and manage its own ZooKeeper instance.
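
The edit can also be scripted. The sketch below assumes the stock file carries
the line commented out as "# export HBASE_MANAGES_ZK=true". So that it is safe
to run anywhere, it works on a temporary stand-in file rather than the real
conf/hbase-env.sh; the variable name conf is an assumption of this example.

```shell
# Stand-in for conf/hbase-env.sh so the sketch is safe to run anywhere;
# set conf to the real path to apply the change for real.
conf=$(mktemp)
printf '%s\n' '# some other setting' '# export HBASE_MANAGES_ZK=true' > "$conf"

# Strip the leading "#" (and any spaces after it) from the
# HBASE_MANAGES_ZK line only; other commented lines are left alone.
sed 's/^#[[:space:]]*\(export HBASE_MANAGES_ZK=\)/\1/' "$conf" > "$conf.new" &&
    mv "$conf.new" "$conf"

# Show the resulting line.
grep 'HBASE_MANAGES_ZK' "$conf"
```

Writing to a second file and renaming it avoids the in-place flag of sed, whose
syntax differs between GNU and BSD versions.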
hbase-site.xml
----------------
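Use an hbase-site.xml like the following::

    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:9000/hbase</value>
        <description>
          The host and port must match fs.defaultFS (formerly
          fs.default.name) in the Hadoop core-site.xml.
        </description>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/hbase/hbase-zookeeper-data</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>localhost</value>
        <description>
          See also the HBASE_MANAGES_ZK setting in hbase-env.sh.
        </description>
      </property>
      <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
        <description>The default is 2181.</description>
      </property>
      <property>
        <name>hbase.master</name>
        <value>localhost:60000</value>
        <description>
          The host and port that the HBase master runs at. 60000 was the
          default port before HBase 1.0; later releases default to 16000.
        </description>
      </property>
      <property>
        <name>hbase.tmp.dir</name>
        <value>/home/hbase/hbase-tmp-dir</value>
      </property>
      <property>
        <name>zookeeper.znode.parent</name>
        <value>/hbase</value>
      </property>
      <property>
        <name>hbase.master.ipc.address</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>hbase.regionserver.ipc.address</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>hbase.unsafe.stream.capability.enforce</name>
        <value>false</value>
        <description>
          Set to false for standalone mode. Must be true for HDFS.
          Controls whether HBase will check for stream capabilities
          (hflush/hsync). Disable this if you intend to run on
          LocalFileSystem, denoted by a rootdir with the 'file://' scheme,
          but be mindful of the WARNING below.
          WARNING: Setting this to false blinds you to potential data loss
          and inconsistent system state in the event of process and/or node
          failures. If HBase is complaining of an inability to use hsync or
          hflush, it is most likely not a false positive.
        </description>
      </property>
      <property>
        <name>hbase.table.sanity.checks</name>
        <value>false</value>
      </property>
    </configuration>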
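
Since hbase.rootdir has to point at the same NameNode address as fs.defaultFS,
a small pre-flight check before starting HBase can catch a mismatch early. The
sketch below is one way to do it. The helper names hdfs_authority and
start_hbase are hypothetical, the two URIs are copied by hand from the files
above rather than parsed from XML, and the start step assumes the HBase bin
directory is on PATH with HDFS already running.

```shell
# Hypothetical helper: reduce an hdfs:// URI to its host:port authority.
hdfs_authority() {
    rest=${1#hdfs://}            # drop the scheme
    printf '%s\n' "${rest%%/*}"  # drop the path, if any
}

# Values copied by hand from core-site.xml and hbase-site.xml.
fs_default_fs="hdfs://localhost:9000"
hbase_rootdir="hdfs://localhost:9000/hbase"

if [ "$(hdfs_authority "$fs_default_fs")" = "$(hdfs_authority "$hbase_rootdir")" ]; then
    echo "hbase.rootdir matches fs.defaultFS"
else
    echo "hbase.rootdir does not match fs.defaultFS" >&2
fi

# Hypothetical helper: start HBase and query cluster status.
# Assumes HBase's bin/ directory is on PATH and HDFS is already running.
start_hbase() {
    command -v hbase >/dev/null 2>&1 || { echo "hbase not found on PATH" >&2; return 1; }
    start-hbase.sh || return 1
    # -n runs the HBase shell non-interactively; "status" summarizes the cluster.
    echo "status" | hbase shell -n
}

start_hbase || echo "HBase start-up skipped or failed" >&2
```

With HBASE_MANAGES_ZK=true, start-hbase.sh also brings up the ZooKeeper
instance that HBase manages, so no separate ZooKeeper start step appears here.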