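This section covers:
Basic HBase shell commands and their usage
HBase is an open-source, distributed, column-oriented database. It is modeled on the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Fay Chang et al. Just as Bigtable builds on the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop (HDFS). HBase began as a subproject of Apache Hadoop and is now a top-level Apache project. Unlike a typical relational database, HBase is suited to storing unstructured and semi-structured data, and it organizes data by column family rather than by row.
1. Create a table together with a column family
Create a table named student with a single column family, info: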
hbase(main):001:0> create 'student','info'
0 row(s) in 5.7830 seconds
=> Hbase::Table - student
2. Add data to the column family
hbase(main):002:0> put 'student','row1','info:name','jack'
0 row(s) in 0.3890 seconds
3. View the table's contents
hbase(main):003:0> scan 'student'
ROW COLUMN+CELL
row1 column=info:name, timestamp=1593653921966, value=jack
1 row(s) in 0.1670 seconds
4. Get the value of the name column in column family info for the first row
hbase(main):004:0> get 'student','row1','info:name'
COLUMN CELL
info:name timestamp=1593653921966, value=jack
1 row(s) in 0.0290 seconds
5. Insert more columns into the info column family of row1
hbase(main):005:0> put 'student','row1','info:sid','1'
0 row(s) in 0.0200 seconds
hbase(main):006:0> put 'student','row1','info:age','22'
0 row(s) in 0.0310 seconds
hbase(main):007:0> scan 'student'
ROW COLUMN+CELL
row1 column=info:age, timestamp=1593654003632, value=22
row1 column=info:name, timestamp=1593653921966, value=jack
row1 column=info:sid, timestamp=1593653993431, value=1
1 row(s) in 0.0360 seconds
scan returns every row in the table, so to read a single row use get instead:
hbase(main):008:0> get 'student','row1'
COLUMN CELL
info:age timestamp=1593654003632, value=22
info:name timestamp=1593653921966, value=jack
info:sid timestamp=1593653993431, value=1
3 row(s) in 0.0290 seconds
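The put, get, and scan calls above can be pictured as operations on a nested, sorted map: row key, then column (family:qualifier), then value. The following is a minimal Python sketch of that mental model only. MiniTable and its methods are hypothetical names chosen for illustration, not part of any HBase client API, and the sketch ignores timestamps, versions, and regions.

```python
from collections import defaultdict


class MiniTable:
    """Toy in-memory model of an HBase table (illustration only)."""

    def __init__(self, *families):
        self.families = set(families)
        self.rows = defaultdict(dict)  # row key -> {"family:qualifier": value}

    def put(self, row, column, value):
        # A column must belong to a family declared when the table was created.
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][column] = value

    def get(self, row, column=None):
        # Without a column, return every cell of the row, sorted by column name.
        cells = self.rows.get(row, {})
        if column is None:
            return dict(sorted(cells.items()))
        return {column: cells[column]} if column in cells else {}

    def scan(self):
        # Rows come back sorted by row key, columns sorted within each row.
        for row in sorted(self.rows):
            yield row, dict(sorted(self.rows[row].items()))


student = MiniTable("info")
student.put("row1", "info:name", "jack")
student.put("row1", "info:sid", "1")
student.put("row1", "info:age", "22")

print(student.get("row1", "info:name"))
print(student.get("row1"))
```

As in the shell output, a get on the whole row lists the columns in sorted order (age, name, sid), regardless of the order in which they were written.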
6. Add more rows
hbase(main):001:0> put 'student','row2','info:name','jack2'
0 row(s) in 0.6810 seconds
hbase(main):002:0> put 'student','row2','info:sid','12'
0 row(s) in 0.0270 seconds
hbase(main):003:0> put 'student','row2','info:age','222'
0 row(s) in 0.0240 seconds
hbase(main):004:0> put 'student','row3','info:name','jack3'
0 row(s) in 0.0450 seconds
hbase(main):005:0> put 'student','row3','info:sid','13'
0 row(s) in 0.0430 seconds
hbase(main):006:0> put 'student','row3','info:age','223'
0 row(s) in 0.0210 seconds
hbase(main):007:0> scan 'student'
ROW COLUMN+CELL
row1 column=info:age, timestamp=1593654003632, value=22
row1 column=info:name, timestamp=1593653921966, value=jack
row1 column=info:sid, timestamp=1593653993431, value=1
row2 column=info:age, timestamp=1593654251855, value=222
row2 column=info:name, timestamp=1593654242279, value=jack2
row2 column=info:sid, timestamp=1593654247011, value=12
row3 column=info:age, timestamp=1593654264658, value=223
row3 column=info:name, timestamp=1593654255689, value=jack3
row3 column=info:sid, timestamp=1593654260608, value=13
3 row(s) in 0.1420 seconds
7. Using scan
Show only two rows
hbase(main):008:0> scan 'student',{'LIMIT'=>2}
ROW COLUMN+CELL
row1 column=info:age, timestamp=1593654003632, value=22
row1 column=info:name, timestamp=1593653921966, value=jack
row1 column=info:sid, timestamp=1593653993431, value=1
row2 column=info:age, timestamp=1593654251855, value=222
row2 column=info:name, timestamp=1593654242279, value=jack2
row2 column=info:sid, timestamp=1593654247011, value=12
2 row(s) in 0.0390 seconds
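Note that LIMIT counts rows, not cells: the output above contains six cells but only two rows. A small Python sketch of that grouping follows, assuming the cells arrive already sorted by row key as in the shell output. The function name scan_limit is hypothetical and only models the behavior.

```python
from itertools import groupby, islice

# (row key, column, value) cells as listed by `scan 'student'`, sorted by row key.
cells = [
    ("row1", "info:age", "22"),
    ("row1", "info:name", "jack"),
    ("row1", "info:sid", "1"),
    ("row2", "info:age", "222"),
    ("row2", "info:name", "jack2"),
    ("row2", "info:sid", "12"),
    ("row3", "info:age", "223"),
    ("row3", "info:name", "jack3"),
    ("row3", "info:sid", "13"),
]


def scan_limit(cells, limit):
    """Yield (row key, cells of that row) for the first `limit` rows."""
    by_row = groupby(cells, key=lambda cell: cell[0])
    for row, group in islice(by_row, limit):
        yield row, list(group)


result = list(scan_limit(cells, 2))
print([row for row, _ in result])
print(sum(len(group) for _, group in result))
```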
Start at row2 and continue to the end of the table
hbase(main):009:0> scan 'student',{'STARTROW'=>'row2'}
ROW COLUMN+CELL
row2 column=info:age, timestamp=1593654251855, value=222
row2 column=info:name, timestamp=1593654242279, value=jack2
row2 column=info:sid, timestamp=1593654247011, value=12
row3 column=info:age, timestamp=1593654264658, value=223
row3 column=info:name, timestamp=1593654255689, value=jack3
row3 column=info:sid, timestamp=1593654260608, value=13
2 row(s) in 0.0440 seconds
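STARTROW does not support wildcards. Its value is a literal row key used as an inclusive lower bound, so 'row*' below is compared byte by byte; because * sorts before the digits, every row is returned. To match rows by prefix, use a prefix filter instead, for example FILTER => "PrefixFilter('row')".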
hbase(main):010:0> scan 'student',{'STARTROW'=>'row*'}
ROW COLUMN+CELL
row1 column=info:age, timestamp=1593654003632, value=22
row1 column=info:name, timestamp=1593653921966, value=jack
row1 column=info:sid, timestamp=1593653993431, value=1
row2 column=info:age, timestamp=1593654251855, value=222
row2 column=info:name, timestamp=1593654242279, value=jack2
row2 column=info:sid, timestamp=1593654247011, value=12
row3 column=info:age, timestamp=1593654264658, value=223
row3 column=info:name, timestamp=1593654255689, value=jack3
row3 column=info:sid, timestamp=1593654260608, value=13
3 row(s) in 0.0620 seconds
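The scan above returns all three rows because STARTROW is treated as a literal key and compared in byte order, not expanded as a pattern. The sketch below models that comparison in Python with the row keys from this example. scan_from and scan_prefix are hypothetical helper names, and the extra key student9 is made up for the demonstration.

```python
row_keys = [b"row1", b"row2", b"row3"]


def scan_from(keys, start_row):
    """Return keys >= start_row in byte order (inclusive lower bound)."""
    return [key for key in sorted(keys) if key >= start_row]


def scan_prefix(keys, prefix):
    """Return keys that begin with `prefix`, which is what a prefix filter does."""
    return [key for key in sorted(keys) if key.startswith(prefix)]


# '*' is byte 0x2A and '1' is byte 0x31, so b"row*" sorts before b"row1".
print(scan_from(row_keys, b"row*"))
print(scan_from(row_keys, b"row2"))

# A prefix match is a different operation from a lower bound.
print(scan_prefix(row_keys + [b"student9"], b"row"))
print(scan_from(row_keys + [b"student9"], b"row*"))
```

The last two lines differ: the prefix match keeps only the row keys that start with row, while the lower bound also lets student9 through because it sorts after row*.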
Multiple options can be combined
hbase(main):011:0> scan 'student',{'STARTROW'=>'row2','LIMIT'=>1}
ROW COLUMN+CELL
row2 column=info:age, timestamp=1593654251855, value=222
row2 column=info:name, timestamp=1593654242279, value=jack2
row2 column=info:sid, timestamp=1593654247011, value=12
1 row(s) in 0.0300 seconds
Show only specific columns
hbase(main):012:0> scan 'student',{'COLUMNS'=>'info:name','LIMIT'=>1}
ROW COLUMN+CELL
row1 column=info:name, timestamp=1593653921966, value=jack
1 row(s) in 0.0290 seconds
Filter by timestamp (only cells whose timestamp equals the given value are returned)
hbase(main):014:0> scan 'student',{'TIMESTAMP'=>1593653921966}
ROW COLUMN+CELL
row1 column=info:name, timestamp=1593653921966, value=jack
1 row(s) in 0.0250 seconds
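The timestamps in these listings are, by default, milliseconds since the Unix epoch, assigned when the cell is written. A short Python sketch for turning one into a readable UTC time follows; the helper name hbase_ts_to_datetime is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hbase_ts_to_datetime(ts_ms):
    """Convert a millisecond epoch timestamp to an aware UTC datetime."""
    # Integer milliseconds via timedelta avoid float rounding.
    return EPOCH + timedelta(milliseconds=ts_ms)


ts = 1593653921966  # timestamp of row1 / info:name in the output above
print(hbase_ts_to_datetime(ts).isoformat())  # 2020-07-02T01:38:41.966000+00:00
```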
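8. Add a column family
Note: the attribute that sets the number of retained versions is VERSIONS. The command below passes VERSION, which the shell does not recognize; it prints an "Unknown argument ignored" warning and creates major with the default VERSIONS => '1', as the describe output in step 9 shows.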
hbase(main):015:0> alter 'student',{NAME =>'major',VERSION =>2}
Unknown argument ignored for column family major: 1.8.7
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 4.7770 seconds
9. describe
View the table's schema
hbase(main):016:0> describe 'student'
Table student is ENABLED
student
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'major', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0940 seconds
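The VERSIONS attribute in the describe output controls how many timestamped versions of each cell a column family retains. The Python sketch below models that retention rule under a simplifying assumption: it prunes old versions eagerly on every write, whereas HBase applies the limit when reading and during compaction. The function put_version and the sample values are hypothetical.

```python
def put_version(versions, timestamp, value, max_versions=1):
    """Add a (timestamp, value) version and keep only the newest `max_versions`."""
    versions.append((timestamp, value))
    versions.sort(key=lambda tv: tv[0], reverse=True)  # newest first
    del versions[max_versions:]
    return versions


# With VERSIONS => 1 a second write to the same cell replaces the first.
one = []
put_version(one, 100, "jack", max_versions=1)
put_version(one, 200, "jacky", max_versions=1)
print(one)  # [(200, 'jacky')]

# With VERSIONS => 2 both writes are kept, newest first.
two = []
put_version(two, 100, "jack", max_versions=2)
put_version(two, 200, "jacky", max_versions=2)
print(two)  # [(200, 'jacky'), (100, 'jack')]
```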
10. Delete a column family
hbase(main):022:0> alter 'student', {NAME=>'major',METHOD=>'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 3.1150 seconds
hbase(main):023:0> describe 'student'
Table student is ENABLED
student
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0730 seconds
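11. Set a time-to-live (TTL) on a column family
Cells in the column family expire 20 seconds after their timestamp and are no longer returned. Because major was deleted in step 10, this command adds the family back (alter creates a column family that does not exist yet) with TTL set to 20.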
hbase(main):024:0> alter 'student',{NAME =>'major',TTL =>20}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.3490 seconds
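TTL is expressed in seconds, while cell timestamps are in milliseconds, so the expiry check amounts to comparing a cell's age against the TTL. The following Python sketch shows that arithmetic only; is_expired is a hypothetical helper, the exact boundary handling inside HBase may differ, and expired cells are physically removed later during compaction.

```python
def is_expired(cell_ts_ms, now_ms, ttl_seconds):
    """Return True when the cell is older than the column family's TTL."""
    age_ms = now_ms - cell_ts_ms
    return age_ms > ttl_seconds * 1000


written = 1593654003632  # a millisecond timestamp like those in the scan output

print(is_expired(written, written + 5_000, ttl_seconds=20))   # False: 5 s old
print(is_expired(written, written + 25_000, ttl_seconds=20))  # True: 25 s old
```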
12. count: count the number of rows in a table
hbase(main):025:0> count 'student'
3 row(s) in 0.1820 seconds
=> 3
13. List the tables on the server
hbase(main):026:0> list
TABLE
student
1 row(s) in 0.0800 seconds
=> ["student"]
14. Check the cluster status
hbase(main):027:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 3.0000 average load
15. Check the HBase version
hbase(main):028:0> version
1.2.0-cdh5.16.2, rUnknown, Mon Jun 3 03:50:03 PDT 2019
16. Check the current user
hbase(main):029:0> whoami
root (auth:SIMPLE)
groups: root
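17. truncate
truncate disables the table, deletes all of its data, and recreates it with the same schema. The scan and describe output below appears to have been captured after truncation: no rows remain, and the info column family is intact.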
hbase(main):016:0> scan 'student'
ROW COLUMN+CELL
0 row(s) in 0.0730 seconds
hbase(main):017:0> describe 'student'
Table student is ENABLED
student
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0480 seconds
hbase(main):039:0> truncate 'student'
Truncating 'student' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 8.2130 seconds
18. Exit the shell
hbase(main):042:0> quit