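1. HBase PE Parameters
PerformanceEvaluation, abbreviated here as PE (full class name org.apache.hadoop.hbase.PerformanceEvaluation), is the performance testing tool that ships with HBase. It mainly supports latency testing for random and sequential reads and writes. Run bin/hbase pe to use it directly: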
[root@xxx ~]$ hbase pe
Usage: java org.apache.hadoop.hbase.PerformanceEvaluation \
<OPTIONS> [-D<property=value>]* <command> <nclients>
Options:
nomapred Run multiple clients using threads (rather than use mapreduce)
rows Rows each client runs. Default: 1048576
size Total size in GiB. Mutually exclusive with --rows. Default: 1.0.
sampleRate Execute test on a sample of total rows. Only supported by randomRead. Default: 1.0
traceRate Enable HTrace spans. Initiate tracing every N rows. Default: 0
table Alternate table name. Default: 'TestTable'
multiGet If >0, when doing RandomRead, perform multiple gets instead of single gets. Default: 0
compress Compression type to use (GZ, LZO, ...). Default: 'NONE'
flushCommits Used to determine if the test should flush the table. Default: false
writeToWAL Set writeToWAL on puts. Default: True
autoFlush Set autoFlush on htable. Default: False
oneCon all the threads share the same connection. Default: False
presplit Create presplit table. If a table with same name exists, it'll be deleted and recreated (instead of verifying count of its existing regions). Recommended for accurate perf analysis (see guide). Default: disabled
inmemory Tries to keep the HFiles of the CF inmemory as far as possible. Not guaranteed that reads are always served from memory. Default: false
usetags Writes tags along with KVs. Use with HFile V3. Default: false
numoftags Specify the no of tags that would be needed. This works only if usetags is true. Default: 1
filterAll Helps to filter out all the rows on the server side there by not returning any thing back to the client. Helps to check the server side performance. Uses FilterAllFilter internally.
latency Set to report operation latencies. Default: False
bloomFilter Bloom filter type, one of [NONE, ROW, ROWCOL]
blockEncoding Block encoding to use. Value should be one of [NONE, PREFIX, DIFF, FAST_DIFF, PREFIX_TREE]. Default: NONE
valueSize Pass value size to use: Default: 1000
valueRandom Set if we should vary value size between 0 and 'valueSize'; set on read for stats on size: Default: Not set.
valueZipf Set if we should vary value size between 0 and 'valueSize' in zipf form: Default: Not set.
period Report every 'period' rows: Default: opts.perClientRunRows / 10 = 104857
multiGet Batch gets together into groups of N. Only supported by randomRead. Default: disabled
addColumns Adds columns to scans/gets explicitly. Default: true
replicas Enable region replica testing. Defaults: 1.
splitPolicy Specify a custom RegionSplitPolicy for the table.
randomSleep Do a random sleep before each get between 0 and entered value. Defaults: 0
columns Columns to write per row. Default: 1
caching Scan caching to use. Default: 30
Note: -D properties will be applied to the conf used.
For example:
Command:
append Append on each row; clients overlap on keyspace so some concurrent operations
checkAndDelete CheckAndDelete on each row; clients overlap on keyspace so some concurrent operations
checkAndMutate CheckAndMutate on each row; clients overlap on keyspace so some concurrent operations
checkAndPut CheckAndPut on each row; clients overlap on keyspace so some concurrent operations
filterScan Run scan test using a filter to find a specific row based on it's value (make sure to use --rows=20)
increment Increment on each row; clients overlap on keyspace so some concurrent operations
randomRead Run random read test
randomSeekScan Run random seek and scan 100 test
randomWrite Run random write test
scan Run scan test (read every row)
scanRange10 Run random seek scan with both start and stop row (max 10 rows)
scanRange100 Run random seek scan with both start and stop row (max 100 rows)
scanRange1000 Run random seek scan with both start and stop row (max 1000 rows)
scanRange10000 Run random seek scan with both start and stop row (max 10000 rows)
sequentialRead Run sequential read test
sequentialWrite Run sequential write test
Args:
nclients Integer. Required. Total number of clients (and HRegionServers) running. 1 <= value <= 500
Examples:
To run a single client doing the default 1M sequentialWrites:
$ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
To run 10 clients doing increments over ten rows:
$ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=10 --nomapred increment 10
hbase pe <OPTIONS> [-D<property=value>]* <command> <nclients>
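oneCon: whether all threads share a single Connection. The default is false, meaning every thread creates its own HBase Connection, which is wasteful. Setting it to true is recommended; just add --oneCon=true to the command.
The other parameters can usually be left at their defaults or tuned for your own scenario, so they are not covered here. In addition, command is the type of read/write test PE supports, including randomRead, randomWrite, sequentialRead, and sequentialWrite (see the full list above; command names are written exactly as listed there). nclients is the number of client threads to start.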
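When several runs differ only in a few options, it can help to assemble the command line programmatically. The following is a minimal Python sketch and not part of PE itself: `build_pe_args` and `PE_COMMANDS` are hypothetical names introduced here, the command list is copied from the usage text above, and the rendering rule (a bare flag for `None`, `--name=value` otherwise) is an assumption based on how `--nomapred` and `--oneCon=true` are written in this article.

```python
import shlex
from typing import Dict, List, Optional, Union

# Command names copied from the usage text above (case-sensitive).
PE_COMMANDS = {
    "append", "checkAndDelete", "checkAndMutate", "checkAndPut", "filterScan",
    "increment", "randomRead", "randomSeekScan", "randomWrite", "scan",
    "scanRange10", "scanRange100", "scanRange1000", "scanRange10000",
    "sequentialRead", "sequentialWrite",
}

OptionValue = Optional[Union[bool, int, float, str]]


def build_pe_args(command: str, nclients: int,
                  options: Dict[str, OptionValue]) -> List[str]:
    """Build the argv list for one `hbase pe` run (hypothetical helper)."""
    if command not in PE_COMMANDS:
        raise ValueError(f"unknown PE command: {command}")
    if not 1 <= nclients <= 500:
        # Range taken from the nclients description in the usage text.
        raise ValueError("nclients must satisfy 1 <= value <= 500")
    args = ["hbase", "pe"]
    for name, value in options.items():
        if value is None:
            # Assumption: None means a bare flag such as --nomapred.
            args.append(f"--{name}")
        elif isinstance(value, bool):
            # Booleans are rendered lowercase, as in --oneCon=true.
            args.append(f"--{name}={str(value).lower()}")
        else:
            args.append(f"--{name}={value}")
    # The command and the client count always come last.
    return args + [command, str(nclients)]


if __name__ == "__main__":
    argv = build_pe_args("randomRead", 16, {
        "nomapred": None, "oneCon": True, "table": "rw_test_2",
    })
    print(shlex.join(argv))
```

Because command names are validated against the list, a miscapitalized name such as `SequentialRead` is rejected before anything is executed. On a node where `hbase` is on the PATH, the returned list could be handed to `subprocess.run(argv, check=True)`.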
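2. HBase Read/Write Stress Test: P999 Latency
The test cluster consists of 2 HMasters and 8 RegionServer (RS) nodes. Each server has a 24-core CPU, 128 GB of memory, and six 1 TB HDDs. The HBase heap size is set to 16 GB, and the version is 1.2.0-cdh5.11.0. This is therefore a P999 stress test on HBase 1.x; the same approach also applies to HBase 2.x.
The latency of randomWrite, sequentialWrite, randomRead, and sequentialRead was tested in turn. The P99 and P999 latency figures for this environment are given below for reference; all values are in microseconds (us).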
[root@xxx ~]$ hbase pe --nomapred --oneCon=true --table=rw_test_1 --rows=1000000 --valueSize=100 --compress=SNAPPY --presplit=16 --autoFlush=true randomWrite 16
20/02/22 15:06:07 INFO hbase.PerformanceEvaluation: Latency (us) : mean=186.42, min=0.00, max=594880.00, stdDev=6981.60, 50th=1.00, 75th=2.00, 95th=28.00, 99th=1020.00, 99.9th=3941.00, 99.99th=381319.92, 99.999th=503455.66
20/02/22 15:06:07 INFO hbase.PerformanceEvaluation: Num measures (latency) : 1000000
20/02/22 15:06:07 INFO hbase.PerformanceEvaluation: Mean = 186.42
Min = 0.00
Max = 594880.00
StdDev = 6981.60
50th = 1.00
75th = 2.00
95th = 28.00
99th = 1020.00
99.9th = 3941.00
99.99th = 381319.92
99.999th = 503455.66
[root@xxx ~]$ hbase pe --nomapred --oneCon=true --table=rw_test_2 --size=1 --valueSize=100 --compress=SNAPPY --presplit=16 --autoFlush=true sequentialWrite 16
20/02/22 16:24:49 INFO hbase.PerformanceEvaluation: Latency (us) : mean=220.51, min=0.00, max=1440185.00, stdDev=10022.38, 50th=1.00, 75th=2.00, 95th=132.00, 99th=396.00, 99.9th=1152.00, 99.99th=515707.37, 99.999th=917447.01
20/02/22 16:24:49 INFO hbase.PerformanceEvaluation: Num measures (latency) : 1048576
20/02/22 16:24:49 INFO hbase.PerformanceEvaluation: Mean = 220.51
Min = 0.00
Max = 1440185.00
StdDev = 10022.38
50th = 1.00
75th = 2.00
95th = 132.00
99th = 396.00
99.9th = 1152.00
99.99th = 515707.37
99.999th = 917447.01
[root@xxx ~]$ hbase pe --nomapred --oneCon=true --table=rw_test_2 --size=1 --valueSize=100 randomRead 16
20/02/22 16:53:48 INFO hbase.PerformanceEvaluation: Latency (us) : mean=748.70, min=74.00, max=2161876.00, stdDev=5055.01, 50th=289.00, 75th=364.00, 95th=2665.00, 99th=4579.00, 99.9th=78024.00, 99.99th=100495.98, 99.999th=150378.50
20/02/22 16:53:48 INFO hbase.PerformanceEvaluation: Num measures (latency) : 1048576
20/02/22 16:53:48 INFO hbase.PerformanceEvaluation: Mean = 748.70
Min = 74.00
Max = 2161876.00
StdDev = 5055.01
50th = 289.00
75th = 364.00
95th = 2665.00
99th = 4579.00
99.9th = 78024.00
99.99th = 100495.98
99.999th = 150378.50
[root@xxx ~]$ hbase pe --nomapred --oneCon=true --table=rw_test_2 --size=1 --valueSize=100 sequentialRead 16
20/02/22 17:08:41 INFO hbase.PerformanceEvaluation: Latency (us) : mean=593.44, min=86.00, max=183676.00, stdDev=4299.28, 50th=302.00, 75th=398.00, 95th=633.00, 99th=932.00, 99.9th=75718.98, 99.99th=93035.20, 99.999th=135947.24
20/02/22 17:08:41 INFO hbase.PerformanceEvaluation: Num measures (latency) : 1048576
20/02/22 17:08:42 INFO hbase.PerformanceEvaluation: Mean = 593.44
Min = 86.00
Max = 183676.00
StdDev = 4299.28
50th = 302.00
75th = 398.00
95th = 633.00
99th = 932.00
99.9th = 75718.98
99.99th = 93035.20
99.999th = 135947.24
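To compare the four runs side by side, the "Latency (us)" summary lines can be parsed and the P99/P999 values converted to milliseconds. The following is a minimal Python sketch under the assumption that the summary line keeps the `key=value, key=value` layout shown in the logs above; `parse_latency_line`, `summarize`, and `LINES` are hypothetical names introduced here, and the four input lines are copied from the runs in this article.

```python
import re
from typing import Dict, Tuple

# Matches the part after "Latency (us) :" in a PE summary line.
LATENCY_RE = re.compile(r"Latency \(us\) : (.*)$")

PREFIX = "INFO hbase.PerformanceEvaluation: Latency (us) : "
# Summary lines copied from the four runs above.
LINES = {
    "randomWrite": "20/02/22 15:06:07 " + PREFIX +
        "mean=186.42, min=0.00, max=594880.00, stdDev=6981.60, 50th=1.00, "
        "75th=2.00, 95th=28.00, 99th=1020.00, 99.9th=3941.00, "
        "99.99th=381319.92, 99.999th=503455.66",
    "sequentialWrite": "20/02/22 16:24:49 " + PREFIX +
        "mean=220.51, min=0.00, max=1440185.00, stdDev=10022.38, 50th=1.00, "
        "75th=2.00, 95th=132.00, 99th=396.00, 99.9th=1152.00, "
        "99.99th=515707.37, 99.999th=917447.01",
    "randomRead": "20/02/22 16:53:48 " + PREFIX +
        "mean=748.70, min=74.00, max=2161876.00, stdDev=5055.01, 50th=289.00, "
        "75th=364.00, 95th=2665.00, 99th=4579.00, 99.9th=78024.00, "
        "99.99th=100495.98, 99.999th=150378.50",
    "sequentialRead": "20/02/22 17:08:41 " + PREFIX +
        "mean=593.44, min=86.00, max=183676.00, stdDev=4299.28, 50th=302.00, "
        "75th=398.00, 95th=633.00, 99th=932.00, 99.9th=75718.98, "
        "99.99th=93035.20, 99.999th=135947.24",
}


def parse_latency_line(line: str) -> Dict[str, float]:
    """Turn one 'Latency (us)' summary line into {statistic: microseconds}."""
    match = LATENCY_RE.search(line)
    if match is None:
        raise ValueError("not a PE latency summary line")
    stats = {}
    for field in match.group(1).split(","):
        key, _, value = field.strip().partition("=")
        stats[key] = float(value)
    return stats


def summarize(lines: Dict[str, str]) -> Dict[str, Tuple[float, float]]:
    """Return {test name: (P99 in ms, P999 in ms)}."""
    result = {}
    for name, line in lines.items():
        stats = parse_latency_line(line)
        # PE reports microseconds; divide by 1000 for milliseconds.
        result[name] = (stats["99th"] / 1000.0, stats["99.9th"] / 1000.0)
    return result


if __name__ == "__main__":
    print(f"{'test':<16}{'P99 (ms)':>10}{'P999 (ms)':>11}")
    for name, (p99, p999) in summarize(LINES).items():
        print(f"{name:<16}{p99:>10.3f}{p999:>11.3f}")
```

In these runs the write P999 stays below 4 ms, while the read P999 is roughly 76 to 78 ms. The figures come from a single run per test on the cluster described above, so they are best read as a reference point rather than a general benchmark.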