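Summary: These notes cover installing and configuring ZooKeeper (which, in a distributed environment, mainly coordinates the Hadoop NameNode and HBase) and configuring and using HBase. Once that is done, a demo created in Eclipse operates on HBase from Java.
For the Hadoop HDFS and JDK setup, see the earlier notes. HBase is again configured on top of the pseudo-distributed HDFS from those notes.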
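1. Installing ZooKeeper:
ZooKeeper is a tool for managing clusters. Following an IBM article, its main functions are:
- Name service
- Configuration management
- Group membership (cluster management)
- Locks (shared locks)
- Queue management
Only standalone mode is described here, and only to replace the ZooKeeper bundled with HBase, so it adds nothing functionally and is just for familiarization. The functions listed above and the distributed mode will be covered later, when studying highly available distributed Hadoop.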
$ cd /usr/local/
$ ls
bin games include lib64 sbin src
etc hadoop lib libexec share zookeeper-3.4.8.tar.gz
$ sudo tar -zxf zookeeper-3.4.8.tar.gz -C .
# Rename zookeeper-3.4.8 to zookeeper
$ sudo mv zookeeper-3.4.8 zookeeper
$ ls
bin games include lib64 sbin src
etc hadoop lib libexec share zookeeper zookeeper-3.4.8.tar.gz
$ sudo chown -R hadoop:hadoop zookeeper
$ ll
drwxr-xr-x. 2 root root 27 3月 24 11:04 bin
drwxr-xr-x. 2 root root 6 8月 12 2015 etc
drwxr-xr-x. 2 root root 6 8月 12 2015 games
drwxr-xr-x. 13 hadoop hadoop 4096 6月 19 16:15 hadoop
drwxr-xr-x. 3 root root 17 3月 24 11:04 include
drwxr-xr-x. 3 root root 25 3月 24 11:04 lib
drwxr-xr-x. 2 root root 6 8月 12 2015 lib64
drwxr-xr-x. 2 root root 6 8月 12 2015 libexec
drwxr-xr-x. 2 root root 6 8月 12 2015 sbin
drwxr-xr-x. 7 root root 72 3月 24 11:04 share
drwxr-xr-x. 2 root root 6 8月 12 2015 src
drwxr-xr-x. 10 hadoop hadoop 4096 8月 2 01:46 zookeeper
$ cd zookeeper
$ mv conf/zoo_sample.cfg conf/zoo.cfg
$ ./bin/zkServer.sh start
$ jps
3049 Jps
2315 QuorumPeerMain
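The renamed sample file is enough for standalone mode. As a rough sketch of what a minimal standalone zoo.cfg contains, the snippet below writes one into a scratch directory. tickTime, dataDir and clientPort are standard ZooKeeper settings; the values shown and the scratch location are assumptions for illustration, not the contents of this machine's file.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal standalone zoo.cfg into a scratch directory.
# ZK_CONF_DIR is a throwaway location (assumption); a real install would
# use /usr/local/zookeeper/conf instead.
ZK_CONF_DIR="$(mktemp -d)"

cat > "$ZK_CONF_DIR/zoo.cfg" <<'EOF'
# length of one tick in milliseconds
tickTime=2000
# directory where ZooKeeper keeps its snapshots
dataDir=/tmp/zookeeper
# port that clients connect to
clientPort=2181
EOF

# Show only the effective (non-comment) settings
grep -v '^#' "$ZK_CONF_DIR/zoo.cfg"
```

Keeping dataDir under /tmp is fine for a throwaway test, but the data is lost when /tmp is cleaned, so a longer-lived setup would point it somewhere persistent.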
$ ./bin/zkCli.sh -server
Connecting to
2016-08-02 06:47:40,409 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
2016-08-02 06:47:40,412 [myid:] - INFO [main:Environment@100] - Client environment:host.name=master
2016-08-02 06:47:40,412 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_71
2016-08-02 06:47:40,414 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2016-08-02 06:47:40,414 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-
2016-08-02 06:47:40,414 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/usr/local/zookeeper/bin/../build/classes:/usr/local/zookeeper/bin/../build/lib/*.jar:/usr/local/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/usr/local/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/local/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper/bin/../zookeeper-3.4.8.jar:/usr/local/zookeeper/bin/../src/java/lib/*.jar:/usr/local/zookeeper/bin/../conf:.:/usr/lib/jvm/java-1.8.0-openjdk/lib/dt.jar:/usr/lib/jvm/java-1.8.0-openjdk/lib/tools.jar
2016-08-02 06:47:40,414 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-08-02 06:47:40,414 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2016-08-02 06:47:40,414 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2016-08-02 06:47:40,414 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2016-08-02 06:47:40,415 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2016-08-02 06:47:40,415 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.10.0-327.el7.x86_64
2016-08-02 06:47:40,415 [myid:] - INFO [main:Environment@100] - Client environment:user.name=hadoop
2016-08-02 06:47:40,415 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/hadoop
2016-08-02 06:47:40,415 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/usr/local/zookeeper
2016-08-02 06:47:40,416 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString= sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@531d72ca
Welcome to ZooKeeper!
2016-08-02 06:47:40,437 [myid:] - INFO [main-SendThread($SendThread@1032] - Opening socket connection to server Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2016-08-02 06:47:40,527 [myid:] - INFO [main-SendThread($SendThread@876] - Socket connection established to, initiating session
[zk: 0] 2016-08-02 06:47:40,614 [myid:] - INFO [main-SendThread($SendThread@1299] - Session establishment complete on server, sessionid = 0x1564a6ef9ba0000, negotiated timeout = 30000
WatchedEvent state:SyncConnected type:None path:null
[zk: 0]
# Typing -h at the prompt then prints the ZooKeeper command help:
ZooKeeper -server host:port cmd args
stat path [watch]
set path data [version]
ls path [watch]
delquota [-n|-b] path
ls2 path [watch]
setAcl path acl
setquota -n|-b val path
redo cmdno
printwatches on|off
delete path [version]
sync path
listquota path
rmr path
get path [watch]
create [-s] [-e] path data acl
addauth scheme auth
getAcl path
connect host:port
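To get a feel for a few of the commands listed above, here is a small sketch that runs create, get, set and delete one at a time through the "-server host:port cmd args" form shown in the usage line. The address localhost:2181 and the znode /demo are assumptions for illustration; when zkCli.sh is not found at the assumed path, the helper only prints what it would run.

```shell
#!/usr/bin/env bash
# Sketch: run single ZooKeeper commands non-interactively.
ZK_CLI="${ZK_CLI:-/usr/local/zookeeper/bin/zkCli.sh}"
ZK_ADDR="${ZK_ADDR:-localhost:2181}"   # assumed server address

# zk is a hypothetical helper: it forwards its arguments to zkCli.sh,
# or prints the command (dry run) when zkCli.sh is not installed here.
zk() {
  if [ -x "$ZK_CLI" ]; then
    "$ZK_CLI" -server "$ZK_ADDR" "$@"
  else
    echo "would run: zkCli.sh -server $ZK_ADDR $*"
  fi
}

zk create /demo hello   # create a znode holding "hello"
zk get /demo            # read it back
zk set /demo world      # overwrite the data
zk delete /demo         # remove the znode again
```

Each call starts a new client session, which is slow but convenient for scripts; for exploration the interactive prompt is more comfortable.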
$ ./bin/zkServer.sh stop
2. Installing and configuring HBase:
$ cd /usr/local/
$ ls
bin games hbase-1.2.2-bin.tar.gz lib libexec share zookeeper
etc hadoop include lib64 sbin src
$ sudo tar -zxvf hbase-1.2.2-bin.tar.gz
$ sudo mv hbase-1.2.2 hbase
$ ls
bin games hbase include lib64 sbin src
etc hadoop hbase-1.2.2-bin.tar.gz lib libexec share zookeeper
$ sudo chown -R hadoop:hadoop hbase
$ ll
drwxr-xr-x. 2 root root 27 3月 24 11:04 bin
drwxr-xr-x. 2 root root 6 8月 12 2015 etc
drwxr-xr-x. 2 root root 6 8月 12 2015 games
drwxr-xr-x. 13 hadoop hadoop 4096 6月 19 16:15 hadoop
drwxr-xr-x. 7 hadoop hadoop 4096 8月 2 18:06 hbase
-rwxr-xr-x. 1 hadoop hadoop 108478494 8月 2 17:59 hbase-1.2.2-bin.tar.gz
drwxr-xr-x. 3 root root 17 3月 24 11:04 include
drwxr-xr-x. 3 root root 25 3月 24 11:04 lib
drwxr-xr-x. 2 root root 6 8月 12 2015 lib64
drwxr-xr-x. 2 root root 6 8月 12 2015 libexec
drwxr-xr-x. 2 root root 6 8月 12 2015 sbin
drwxr-xr-x. 7 root root 72 3月 24 11:04 share
drwxr-xr-x. 2 root root 6 8月 12 2015 src
drwxr-xr-x. 10 hadoop hadoop 4096 8月 2 01:46 zookeeper
$ cd /usr/local/hbase/
$ vi conf/hbase-env.sh
# 1. Set JAVA_HOME; here I changed it to: export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
# 2. Set HBASE_MANAGES_ZK to: export HBASE_MANAGES_ZK=false, so that HBase does not use its bundled ZooKeeper.
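The two edits to hbase-env.sh can also be scripted instead of made in vi. The sketch below uses a hypothetical helper, set_export, that removes any existing (possibly commented-out) export of a variable and appends the new one, so running it twice does not duplicate lines. It works on a scratch file seeded with two commented-out lines of the kind I would expect in a stock hbase-env.sh; that seed content is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: set "export NAME=VALUE" in a shell config file, idempotently.
set_export() {
  local file="$1" name="$2" value="$3" tmp
  tmp="$(mktemp)"
  # keep every line except an existing (maybe commented-out) export of NAME
  grep -v -E "^[#[:space:]]*export[[:space:]]+${name}=" "$file" > "$tmp" || true
  printf 'export %s=%s\n' "$name" "$value" >> "$tmp"
  cat "$tmp" > "$file"   # rewrite in place, keeping the file's permissions
  rm -f "$tmp"
}

# Scratch stand-in for /usr/local/hbase/conf/hbase-env.sh (assumed content)
ENV_FILE="$(mktemp)"
printf '# export JAVA_HOME=/usr/java/jdk1.6.0/\n# export HBASE_MANAGES_ZK=true\n' > "$ENV_FILE"

set_export "$ENV_FILE" JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk
set_export "$ENV_FILE" HBASE_MANAGES_ZK false
cat "$ENV_FILE"
```

On the real file the same two set_export calls would be pointed at conf/hbase-env.sh, ideally after making a backup copy.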
$ vi ~/.bashrc
export HBASE_HOME=/usr/local/hbase
export PATH=$PATH:$HBASE_HOME/bin
$ source ~/.bashrc
<description>Specifies the mode HBase runs in: false: standalone mode; true: distributed mode (pseudo-distributed or fully distributed)</description>
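The description above belongs to the hbase.cluster.distributed property in conf/hbase-site.xml. As a rough sketch of what such a file might look like for this setup, the snippet below writes one into a scratch directory. The HDFS address hdfs://master:9000 is purely an assumption (use whatever fs.defaultFS was set to in the earlier Hadoop notes), and so is the quorum host master. hbase.cluster.distributed is set to true because HMaster and HRegionServer run as separate processes next to an external ZooKeeper, which is how the HBase reference guide configures pseudo-distributed mode.

```shell
#!/usr/bin/env bash
# Sketch: generate a pseudo-distributed hbase-site.xml in a scratch directory.
SITE_DIR="$(mktemp -d)"                     # stand-in for /usr/local/hbase/conf
HDFS_URI="${HDFS_URI:-hdfs://master:9000}"  # assumed NameNode address
ZK_QUORUM="${ZK_QUORUM:-master}"            # assumed ZooKeeper host

cat > "$SITE_DIR/hbase-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- where HBase stores its data in HDFS -->
    <name>hbase.rootdir</name>
    <value>$HDFS_URI/hbase</value>
  </property>
  <property>
    <!-- true: daemons run as separate processes -->
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <!-- the external ZooKeeper started earlier -->
    <name>hbase.zookeeper.quorum</name>
    <value>$ZK_QUORUM</value>
  </property>
</configuration>
EOF

cat "$SITE_DIR/hbase-site.xml"
```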
$ hbase version
HBase 1.2.2
Source code repository git://asf-dev/home/busbey/projects/hbase revision=3f671c1ead70d249ea4598f1bbcc5151322b3a13
Compiled by busbey on Fri Jul 1 08:28:55 CDT 2016
From source with checksum 7ac43c3d2f62f134b2a6aa1a05ad66ac
# Check the Linux limit settings
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7227
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
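# Two values need to be raised here: open files and max user processes
$ sudo vi /etc/security/limits.conf
# Add the following to /etc/security/limits.conf:
* - nofile 65535
* - nproc 65000
# Notes: * means all users; - sets both the soft and the hard limit; nofile is the number of open files; nproc is the number of processes (threads count toward it)
# For the nproc change to take effect, /etc/security/limits.d/20-nproc.conf must be edited as well; otherwise max user processes keeps its original value for every user except root. Change it as follows:
# change * soft nproc 4096 to * soft nproc 65000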
$ sudo reboot
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7227
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 65000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
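# Both open files and max user processes are now set as intended
Before starting HBase, make sure HDFS, MapReduce, YARN, the HistoryServer and ZooKeeper have all started successfully; see the earlier notes for how to configure and start them. Startup prints a few warnings because Java 8 ignores the PermSize and MaxPermSize options. I ignore them for now; running on JDK 7 avoids them.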
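Rather than reading the whole ulimit -a listing, the two values can be checked directly. The following is a small sketch; check_limit is a hypothetical helper, and the thresholds are simply the values written to limits.conf in these notes.

```shell
#!/usr/bin/env bash
# Sketch: compare a current limit against the value we asked for.
check_limit() {
  local name="$1" current="$2" wanted="$3"
  case "$current" in
    unlimited)
      echo "$name ok (unlimited)" ;;
    ''|*[!0-9]*)
      # not a number: the shell could not report this limit
      echo "$name unknown ($current)"; return 2 ;;
    *)
      if [ "$current" -ge "$wanted" ]; then
        echo "$name ok ($current)"
      else
        echo "$name too low ($current < $wanted)"; return 1
      fi ;;
  esac
}

check_limit "open files" "$(ulimit -n)" 65535 \
  || echo "  -> check nofile in /etc/security/limits.conf"
check_limit "max user processes" "$(ulimit -u)" 65000 \
  || echo "  -> check nproc in limits.conf and limits.d/20-nproc.conf"
```

The check must run as the user that starts HBase (hadoop here), in a fresh login session, since limits are applied at login.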
$ cd /usr/local/hbase
$ ./bin/start-hbase.sh
starting master, logging to /usr/local/hbase/logs/hbase-hadoop-master-master.out
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
starting regionserver, logging to /usr/local/hbase/logs/hbase-hadoop-1-regionserver-master.out
$ jps
3536 JobHistoryServer
3233 NodeManager
2563 NameNode
3059 ResourceManager
2868 SecondaryNameNode
3636 QuorumPeerMain
3943 HRegionServer
4135 Jps
2668 DataNode
3821 HMaster
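# On this single machine, seeing HMaster and HRegionServer in the list means HBase has started.
HBase's status can also be checked in a browser: entering http:// in the address bar brings up the following page: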
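Scanning the jps listing by eye gets tedious, so here is a sketch that checks for the expected daemons automatically. missing_daemons is a hypothetical helper that reads jps-style output on stdin; the sample lines are taken from the listing above, and on the real machine the input would come from jps itself.

```shell
#!/usr/bin/env bash
# Sketch: report which of the named daemons are absent from jps output.
missing_daemons() {
  local listing name missing=""
  listing="$(cat)"   # jps output, one "<pid> <MainClass>" per line
  for name in "$@"; do
    # the class name is the second field of each line
    if ! printf '%s\n' "$listing" \
        | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      missing="$missing $name"
    fi
  done
  [ -z "$missing" ] && return 0
  echo "missing:$missing"
  return 1
}

# Sample input copied from the transcript; real use: jps | missing_daemons ...
sample='3636 QuorumPeerMain
3943 HRegionServer
3821 HMaster'
printf '%s\n' "$sample" \
  | missing_daemons HMaster HRegionServer QuorumPeerMain \
  && echo "all expected daemons are running"
```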
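3. HBase shell operations
Some sites that document the hbase shell commands: the notes in an English tutorial, what I consider the most detailed Chinese version, and a fairly detailed English version.
# Enter the hbase shell console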
$ hbase shell
2016-08-06 07:47:59,323 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.2, r3f671c1ead70d249ea4598f1bbcc5151322b3a13, Fri Jul 1 08:28:55 CDT 2016
hbase(main):001:0> create 'test','cf'
0 row(s) in 2.4900 seconds
=> Hbase::Table - test
hbase(main):003:0> put 'test' , 'row1', 'cf:a' , 'value1'
0 row(s) in 0.2130 seconds
hbase(main):004:0> scan 'test'
row1 column=cf:a, timestamp=1470497082748, value=value1
1 row(s) in 0.0420 seconds
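The same kind of session can be replayed without typing at the prompt, because hbase shell accepts the path of a command file as an argument. The sketch below writes such a file and runs it when hbase is on the PATH; otherwise it only prints the commands. The table name test_sketch is a made-up name, chosen so the test table created above is left alone, and the get, disable and drop commands go slightly beyond what the session above shows.

```shell
#!/usr/bin/env bash
# Sketch: run a batch of hbase shell commands from a file.
CMDS="$(mktemp)"
cat > "$CMDS" <<'EOF'
create 'test_sketch', 'cf'
put 'test_sketch', 'row1', 'cf:a', 'value1'
get 'test_sketch', 'row1'
scan 'test_sketch'
disable 'test_sketch'
drop 'test_sketch'
exit
EOF

if command -v hbase >/dev/null 2>&1; then
  hbase shell "$CMDS"
else
  echo "hbase not on PATH; commands that would be run:"
  cat "$CMDS"
fi
```

A table has to be disabled before it can be dropped, which is why disable precedes drop in the file.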
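4. HBase demo
Here Java code does roughly what the shell commands above did. To connect to HBase remotely from Eclipse, add a mapping ( master) from the server's name to its IP in the local hosts file (master is the name of my virtual machine). The code below is copied from a blog that translates the official documentation. These notes take the simplest, crudest approach; a better one may be shown later, in the part that adds access control to HBase:
maven pom:
<!-- Add the dependency below on top of the earlier setup for connecting Eclipse to HDFS and running MapReduce -->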
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;
public class HBaseClient {
    // Static configuration: points the client at ZooKeeper
    static Configuration configuration = null;
    static Connection connection = null;
    static {
        configuration = HBaseConfiguration.create();
        // ZooKeeper quorum host; the value is left empty here
        configuration.set("hbase.zookeeper.quorum", "");
        try {
            connection = ConnectionFactory.createConnection(configuration);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Create a table.
     * @param tableStr table name
     * @param familyNames column families to create
     */
    public static void createTable(String tableStr, String[] familyNames) {
        System.out.println("start create table ......");
        try (Admin admin = connection.getAdmin()) {
            TableName tableName = TableName.valueOf(tableStr);
            if (admin.tableExists(tableName)) { // if the table already exists, drop it first, then recreate it
                admin.disableTable(tableName);
                admin.deleteTable(tableName);
                System.out.println(tableName + " exists, deleting....");
            }
            HTableDescriptor tableDescriptor = new HTableDescriptor(tableName);
            // add the column families
            if (familyNames != null && familyNames.length > 0) {
                for (String familyName : familyNames) {
                    tableDescriptor.addFamily(new HColumnDescriptor(familyName));
                }
            }
            admin.createTable(tableDescriptor);
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        } catch (ZooKeeperConnectionException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("end create table ......");
    }

    /**
     * Insert one cell into a row.
     * @param tableName table name
     * @throws Exception
     */
    public static void insertData(String tableName, String rowId, String familyName, String qualifier, String value) throws Exception {
        System.out.println("start insert data ......");
        // One Put is one row; create another Put for a second row. Each row has a unique row key, passed to the Put constructor.
        Put put = new Put(rowId.getBytes());
        put.addColumn(familyName.getBytes(), qualifier.getBytes(), value.getBytes()); // first column of this row
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            table.put(put);
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("end insert data ......");
    }

    /**
     * Delete a row.
     * @param tablename table name
     * @param rowkey row key of the row to delete
     */
    public static void deleteRow(String tablename, String rowkey) {
        try (Table table = connection.getTable(TableName.valueOf(tablename))) {
            Delete d1 = new Delete(rowkey.getBytes());
            // to delete less than the whole row, call d1.addColumn(family, qualifier) or d1.addFamily(family) first
            table.delete(d1);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Query all rows.
     * @param tableName table name
     * @throws Exception
     */
    public static void queryAll(String tableName) throws Exception {
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner rs = table.getScanner(new Scan())) {
            for (Result r : rs) {
                System.out.println("rowkey: " + new String(r.getRow()));
                for (Cell keyValue : r.rawCells()) {
                    System.out.println("column: " + new String(CellUtil.cloneFamily(keyValue)) + ":" + new String(CellUtil.cloneQualifier(keyValue)) + " ==== value: " + new String(CellUtil.cloneValue(keyValue)));
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Query by row key.
     * @param tableName table name
     * @throws Exception
     */
    public static void queryByRowId(String tableName, String rowId) throws Exception {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Get get = new Get(rowId.getBytes()); // look the row up by its row key
            Result r = table.get(get);
            System.out.println("rowkey: " + new String(r.getRow()));
            for (Cell keyValue : r.rawCells()) {
                System.out.println("column: " + new String(CellUtil.cloneFamily(keyValue)) + ":" + new String(CellUtil.cloneQualifier(keyValue)) + " ==== value: " + new String(CellUtil.cloneValue(keyValue)));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Query by a single column condition.
     * @param tableName table name
     */
    public static void queryByCondition(String tableName, String familyName, String qualifier, String value) {
        // match rows whose familyName:qualifier column equals value
        Filter filter = new SingleColumnValueFilter(Bytes.toBytes(familyName), Bytes.toBytes(qualifier), CompareOp.EQUAL, Bytes.toBytes(value));
        Scan s = new Scan();
        s.setFilter(filter);
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner rs = table.getScanner(s)) {
            for (Result r : rs) {
                System.out.println("rowkey: " + new String(r.getRow()));
                for (Cell keyValue : r.rawCells()) {
                    System.out.println("column: " + new String(CellUtil.cloneFamily(keyValue)) + ":" + new String(CellUtil.cloneQualifier(keyValue)) + " ==== value: " + new String(CellUtil.cloneValue(keyValue)));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Query by several column conditions; a row must satisfy all of them.
     * @param tableName table name
     */
    public static void queryByConditions(String tableName, String[] familyNames, String[] qualifiers, String[] values) {
        List<Filter> filters = new ArrayList<Filter>();
        if (familyNames != null && familyNames.length > 0) {
            int i = 0;
            for (String familyName : familyNames) {
                Filter filter = new SingleColumnValueFilter(Bytes.toBytes(familyName), Bytes.toBytes(qualifiers[i]), CompareOp.EQUAL, Bytes.toBytes(values[i]));
                filters.add(filter);
                i++;
            }
        }
        FilterList filterList = new FilterList(filters);
        Scan scan = new Scan();
        scan.setFilter(filterList);
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner rs = table.getScanner(scan)) {
            for (Result r : rs) {
                System.out.println("rowkey: " + new String(r.getRow()));
                for (Cell keyValue : r.rawCells()) {
                    System.out.println("column: " + new String(CellUtil.cloneFamily(keyValue)) + ":" + new String(CellUtil.cloneQualifier(keyValue)) + " ==== value: " + new String(CellUtil.cloneValue(keyValue)));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

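    /**
     * Drop a table.
     * @param tableStr table name
     */
    public static void dropTable(String tableStr) {
        try (Admin admin = connection.getAdmin()) {
            TableName tableName = TableName.valueOf(tableStr);
            admin.disableTable(tableName); // a table must be disabled before it can be deleted
            admin.deleteTable(tableName);
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        } catch (ZooKeeperConnectionException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
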
    public static void main(String[] args) throws Exception {
        createTable("t_table", new String[]{"f1", "f2", "f3"});
        insertData("t_table", "row-0001", "f1", "a", "fffaaa");
        insertData("t_table", "row-0001", "f2", "b", "fffbbb");
        insertData("t_table", "row-0001", "f3", "c", "fffccc");
        insertData("t_table", "row-0002", "f1", "a", "eeeeee");
        queryByRowId("t_table", "row-0001");
        queryByCondition("t_table", "f1", "a", "eeeeee");
        queryByConditions("t_table", new String[]{"f1", "f3"}, new String[]{"a", "c"}, new String[]{"fffaaa", "fffccc"});
        deleteRow("t_table", "row-0001");
    }
}
Result of running it in Eclipse (directly with Run As -> Java Application):
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
start create table ......
end create table ......
start insert data ......
end insert data ......
start insert data ......
end insert data ......
start insert data ......
end insert data ......
start insert data ......
end insert data ......
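One prerequisite from the start of this section, the hosts mapping for master on the machine running Eclipse, can be sketched as follows. add_host_entry is a hypothetical helper that appends the mapping only when the hostname is not already mapped. The address 192.0.2.10 is a placeholder from the documentation range, not the real IP of my virtual machine, and the snippet works on a scratch copy rather than the real hosts file.

```shell
#!/usr/bin/env bash
# Sketch: add "IP hostname" to a hosts file unless the hostname is mapped already.
add_host_entry() {
  local file="$1" ip="$2" host="$3"
  # look for the hostname in any non-comment line (fields after the IP)
  if awk -v h="$host" \
      '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
       END { exit !found }' "$file"; then
    echo "$host already present in $file"
  else
    printf '%s %s\n' "$ip" "$host" >> "$file"
    echo "added $ip $host to $file"
  fi
}

# Scratch stand-in for the real hosts file
HOSTS_FILE="$(mktemp)"
printf '127.0.0.1 localhost\n' > "$HOSTS_FILE"

add_host_entry "$HOSTS_FILE" 192.0.2.10 master   # placeholder IP (assumption)
add_host_entry "$HOSTS_FILE" 192.0.2.10 master   # second call changes nothing
cat "$HOSTS_FILE"
```

The mapping matters because the HBase client learns the region server's hostname from ZooKeeper and must be able to resolve it locally.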