1. compact
Manual compaction
rocksdb provides DB::CompactRange and DB::CompactFiles, which can be used to trigger a compaction manually. pika wraps DB::CompactRange in nemo/nemo_admin.cc: Status Nemo::Compact(DBType type, const std::string &key, bool sync) to trigger manual compaction.
Compaction is triggered in the following cases:
- Scheduled compaction: the compact-cron/compact-interval settings periodically trigger AutoCompactRange. This scheduled run is a manual compaction without a start/end key, so it compacts all SST files. Handled by a background thread.
- The compact command: running compact db triggers a manual compaction; the rocksdb behavior is the same as above. Handled by a background thread.
- Operations on list, hash, set and zset that delete fields, such as HDEL, LPOP, SINTERSTORE and so on. Once more than 1000 such deletions accumulate in a single DB, a compaction is triggered and handled by a background thread. For the underlying reason, see "秒删大量的key" (instantly deleting a large number of keys).
Related rocksdb log (manual compactions are marked as "Manual"):
2593 2020/03/12-19:55:22.125484 7fb0f5682700 [default] New memtable created with log file: #267. Immutable memtables: 0.
2594 2020/03/12-19:55:22.125584 7fb0f6684700 [JOB 287] Syncing log #265
2595 2020/03/12-19:55:22.125728 7fb0f6684700 (Original Log Time 2020/03/12-19:55:22.125562) Calling FlushMemTableToOutputFile with column family [default], flush slots available 2, compaction slots allowed 1, compaction slots scheduled 1
2596 2020/03/12-19:55:22.125742 7fb0f6684700 [default] [JOB 287] Flushing memtable with next log file: 267
2597 2020/03/12-19:55:22.125777 7fb0f6684700 EVENT_LOG_v1 {"time_micros": 1584014122125763, "job": 287, "event": "flush_started", "num_memtables": 1, "num_entries": 1007551, "num_deletes": 0, "memory_usage": 454070736}
2598 2020/03/12-19:55:22.125783 7fb0f6684700 [default] [JOB 287] Level-0 flush table #268: started
2599 2020/03/12-19:55:23.141038 7fb0f6684700 EVENT_LOG_v1 {"time_micros": 1584014123140998, "cf_name": "default", "job": 287, "event": "table_file_creation", "file_number": 268, "file_size": 34741512, "table_properties": {"data_size": 31417025, "index_size": 4122204, "filter_size": 1338842, "raw_key_size": 24181224, "raw_average_key_size": 24, "raw_value_size": 411080808, "raw_average_value_size": 408, "num_data_blocks": 111951, "num_entries": 1007551, "filter_policy_name": "rocksdb.BuiltinBloomFilter", "kDeletedKeys": "0", "kMergeOperands": "0"}}
2600 2020/03/12-19:55:23.141060 7fb0f6684700 [default] [JOB 287] Level-0 flush table #268: 34741512 bytes OK
2601 2020/03/12-19:55:23.141710 7fb0f6684700 (Original Log Time 2020/03/12-19:55:23.141071) [default] Level-0 commit table #268 started
2602 2020/03/12-19:55:23.141720 7fb0f6684700 (Original Log Time 2020/03/12-19:55:23.141455) [default] Level-0 commit table #268: memtable #1 done
2603 2020/03/12-19:55:23.141724 7fb0f6684700 (Original Log Time 2020/03/12-19:55:23.141466) EVENT_LOG_v1 {"time_micros": 1584014123141461, "job": 287, "event": "flush_finished", "lsm_state": [4, 12, 75, 12, 0, 0, 0], "immutable_memtables": 0}
2604 2020/03/12-19:55:23.141728 7fb0f6684700 (Original Log Time 2020/03/12-19:55:23.141512) [default] Level summary: base level 1 max bytes base 268435456 files[4 12 75 12 0 0 0] max score 1.00
This log line shows the number of files at each level.
2605 2020/03/12-19:55:23.141824 7fb0f6684700 [JOB 287] Try to delete WAL files size 443322440, prev total WAL file size 443322440, number of live WAL files 2.
2638 2020/03/12-19:55:23.265450 7fb0f5682700 [default] Manual compaction starting [this marks a manual compaction]
2639 2020/03/12-19:55:24.154029 7fb0f7e87700 [default] [JOB 288] Generated table #269: 672589 keys, 23191015 bytes
2640 2020/03/12-19:55:24.154092 7fb0f7e87700 EVENT_LOG_v1 {"time_micros": 1584014124154058, "cf_name": "default", "job": 288, "event": "table_file_creation", "file_number": 269, "file_size": 23191015, "table_properties": {"data_size": 20971672, "index_size": 2749277, "filter_size": 893750, "raw_key_size": 16142136, "raw_average_key_size": 24, "raw_value_size": 274416312, "raw_average_value_size": 408, "num_data_blocks": 74733, "num_entries": 672589, "filter_policy_name": "rocksdb.BuiltinBloomFilter", "kDeletedKeys": "0", "kMergeOperands": "0"}}
2641 2020/03/12-19:55:25.245905 7fb0f7e87700 [default] [JOB 288] Generated table #270: 638203 keys, 22005331 bytes
2642 2020/03/12-19:55:25.245985 7fb0f7e87700 EVENT_LOG_v1 {"time_micros": 1584014125245954, "cf_name": "default", "job": 288, "event": "table_file_creation", "file_number": 270, "file_size": 22005331, "table_properties": {"data_size": 19899394, "index_size": 2608325, "filter_size": 848049, "raw_key_size": 15316872, "raw_average_key_size": 24, "raw_value_size": 260386824, "raw_average_value_size": 408, "num_data_blocks": 70912, "num_entries": 638203, "filter_policy_name": "rocksdb.BuiltinBloomFilter", "kDeletedKeys": "0", "kMergeOperands": "0"}}
2643 2020/03/12-19:55:25.246073 7fb0f7e87700 [default] [JOB 288] Compacted 1@0 + 2@1 files to L1 => 45196346 bytes
2644 2020/03/12-19:55:25.246657 7fb0f7e87700 (Original Log Time 2020/03/12-19:55:25.246523) [default] compacted to: base level 1 max bytes base 268435456 files[3 12 75 12 0 0 0] max score 0.99, MB/sec: 33.6 rd, 21.5 wr, level 1, files in( 1, 2) out(2) MB in(37.9, 29.5) out(43.1), read-write-amplify(2.9) write-amplify(1.1) OK, records in: 2051981, records dropped: 741189
Besides the number of files at each level, this log line also shows the bytes read and written: the total bytes written equal the total size of the output files; write-amplify = total bytes written / bytes read from the non-output (start) level; read-write-amplify = (total bytes read + total bytes written) / bytes read from the non-output level. It also reports the number of input records and the number of records dropped during the compaction. For the line above: write-amplify = 43.1 / 37.9 ≈ 1.1 and read-write-amplify = (37.9 + 29.5 + 43.1) / 37.9 ≈ 2.9, matching the reported values.
Automatic compaction
rocksdb's compaction strategy is a trade-off among write amplification, read amplification and space amplification. The default manual compaction behavior is:
a) when a manual compaction is issued, it waits for all running automatic compaction tasks to finish before it starts;
b) while the manual compaction is running, automatic compactions cannot run.
When a manual compaction runs for a long time and automatic compactions cannot run, newly written data can only pile up: it first lands in memtables and, once flushed, in L0 files. When the number of L0 files exceeds level0_slowdown_writes_trigger (default 20), write requests are slowed down (slept); worse, once it exceeds level0_stop_writes_trigger (default 24), writes are stopped completely.
To avoid this, the compaction options are adjusted so that automatic compaction always takes priority, avoiding write stalls:
In nemo-rocksdb:
rocksdb::CompactRangeOptions ops;
// do not block automatic compactions while the manual compaction runs
ops.exclusive_manual_compaction = false;
// the bottommost level files will be compacted as well
ops.bottommost_level_compaction = rocksdb::BottommostLevelCompaction::kForce;
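For illustration only (this is not pika's exact code path; db is assumed to be an already-opened rocksdb::DB*), the options above are then passed to a full-range CompactRange call:
// nullptr begin/end means the whole key range, like pika's compact command without a range
rocksdb::Status s = db->CompactRange(ops, nullptr, nullptr);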
rocksdb supports the following compaction styles:
- kCompactionStyleLevel: rocksdb's default compaction style. Within each level (except L0) the files are sorted by key and do not overlap, but keys may overlap between levels. When level Ln meets the compaction condition, a file file1 is picked from Ln according to the CompactionPri policy and merged with the files in Ln+1 whose key ranges overlap file1; the merge output (file2) is placed into Ln+1. This style has the smallest space amplification but brings large write and read amplification.
- kCompactionStyleUniversal: this style organizes the on-disk data purely along the write timeline and only compacts data written in adjacent time ranges, much like merging non-overlapping time ranges. It has low write amplification but relatively high read and space amplification.
- kCompactionStyleFIFO: typically used for data that is almost never modified, such as message queues or time-series databases.
- kCompactionStyleNone: disables automatic compaction; compactions are only triggered manually.
pika uses the default style, kCompactionStyleLevel, i.e. the smallest space amplification at the cost of large read and write amplification.
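As a sketch only (not taken from pika's source; the database path is made up), the compaction style is selected through rocksdb::Options before opening the DB:
rocksdb::Options options;
options.create_if_missing = true;
options.compaction_style = rocksdb::kCompactionStyleLevel;      // the default, used by pika
// alternatives: rocksdb::kCompactionStyleUniversal, rocksdb::kCompactionStyleFIFO,
//               rocksdb::kCompactionStyleNone (automatic compaction disabled)
rocksdb::DB* db = nullptr;
rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/pika_style_demo", &db);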
For other automatic compaction strategies and more details, see: https://rocksdb.org.cn/doc/Compaction.html
Related rocksdb log:
189 2020/03/04-01:23:23.936058 7fb0f0e79700 [default] New memtable created with log file: #9. Immutable memtables: 0.
190 2020/03/04-01:23:23.936217 7fb0f6684700 [JOB 2] Syncing log #3
191 2020/03/04-01:23:28.081903 7fb0f6684700 (Original Log Time 2020/03/04-01:23:23.936206) Calling FlushMemTableToOutputFile with column family [default], flush slots available 2, compaction slots allowed 1, compaction slots scheduled 1
192 2020/03/04-01:23:28.081923 7fb0f6684700 [default] [JOB 2] Flushing memtable with next log file: 9
193 2020/03/04-01:23:28.081995 7fb0f6684700 EVENT_LOG_v1 {"time_micros": 1583256208081980, "job": 2, "event": "flush_started", "num_memtables": 1, "num_entries": 1154043, "num_deletes": 0, "memory_usage": 520096264}
194 2020/03/04-01:23:28.082004 7fb0f6684700 [default] [JOB 2] Level-0 flush table #10: started
195 2020/03/04-01:23:31.370890 7fb0f6684700 EVENT_LOG_v1 {"time_micros": 1583256211370847, "cf_name": "default", "job": 2, "event": "table_file_creation", "file_number": 10, "file_size": 39787407, "table_properties": {"data_size": 35980258, "index_size": 4722592, "filter_size": 1533484, "raw_key_size": 27697032, "raw_average_key_size": 24, "raw_value_size": 470849544, "raw_average_value_size": 408, "num_data_blocks": 128227, "num_entries": 1154043, "filter_policy_name": "rocksdb.BuiltinBloomFilter", "kDeletedKeys": "0", "kMergeOperands": "0"}}
196 2020/03/04-01:23:31.370915 7fb0f6684700 [default] [JOB 2] Level-0 flush table #10: 39787407 bytes OK
197 2020/03/04-01:23:31.607721 7fb0f6684700 (Original Log Time 2020/03/04-01:23:31.370927) [default] Level-0 commit table #10 started
198 2020/03/04-01:23:31.607729 7fb0f6684700 (Original Log Time 2020/03/04-01:23:31.607617) [default] Level-0 commit table #10: memtable #1 done
199 2020/03/04-01:23:31.607740 7fb0f6684700 (Original Log Time 2020/03/04-01:23:31.607649) EVENT_LOG_v1 {"time_micros": 1583256211607638, "job": 2, "event": "flush_finished", "lsm_state": [1, 0, 0, 0, 0, 0, 0], "immutable_memtables": 0}
200 2020/03/04-01:23:31.607745 7fb0f6684700 (Original Log Time 2020/03/04-01:23:31.607670) [default] Level summary: base level 1 max bytes base 268435456 files[1 0 0 0 0 0 0] max score 0.25
201 2020/03/04-01:23:31.607754 7fb0f6684700 [JOB 2] Try to delete WAL files size 507778920, prev total WAL file size 547313800, number of live WAL files 2.
475 Interval stall: 00:00:0.000 H:M:S, 0.0 percent
476 2020/03/12-11:22:06.064440 7fb0f8688700 [default] [JOB 27] Compacting 4@0 + 6@1 files to L1, score 1.00
477 2020/03/12-11:22:06.064451 7fb0f8688700 [default] Compaction start summary: Base version 27 Base level 0, inputs: [34(37MB) 32(37MB) 12(37MB) 10(37MB)], [16(37MB) 18(37MB) 20(37MB) 24(37MB) 28(37MB) 30(37MB)]
478 2020/03/12-11:22:06.064479 7fb0f8688700 EVENT_LOG_v1 {"time_micros": 1583983326064464, "job": 27, "event": "compaction_started", "files_L0": [34, 32, 12, 10], "files_L1": [16, 18, 20, 24, 28, 30], "score": 1, "input_data_size": 397926803}
479 2020/03/12-11:22:07.448832 7fb0f8688700 [default] [JOB 27] Generated table #35: 672544 keys, 23190962 bytes
480 2020/03/12-11:22:07.448876 7fb0f8688700 EVENT_LOG_v1 {"time_micros": 1583983327448858, "cf_name": "default", "job": 27, "event": "table_file_creation", "file_number": 35, "file_size": 23190962, "table_properties": {"data_size": 20971640, "index_size": 2749097, "filter_size": 893695, "raw_key_size": 16141056, "raw_average_key_size": 24, "raw_value_size": 274397952, "raw_average_value_size": 408, "num_data_blocks": 74728, "num_entries": 672544, "filter_policy_name": "rocksdb.BuiltinBloomFilter", "kDeletedKeys": "0", "kMergeOperands": "0"}}
Differences between manual and automatic compaction statistics
Stats before the M-th automatic compaction
2174 Level Files Size(MB) Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
2175 ----------------------------------------------------------------------------------------------------------------------------------------------------------
2176 L0 4/0 151.79 1.0 0.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 26.3 117 81 1.443 0 0
2177 L1 12/0 221.16 0.9 1.1 0.6 0.6 0.9 0.3 2.3 1.6 23.0 18.4 51 12 4.240 35M 7054K
2178 L2 69/0 2305.55 0.9 0.4 0.2 0.2 0.2 0.1 2.2 1.1 22.4 13.6 18 4 4.604 12M 4428K
2179 Sum 85/0 2678.50 0.0 1.5 0.8 0.8 4.2 3.4 4.5 1.4 8.5 22.9 186 97 1.919 48M 11M
2180 Int 0/0 0.00 0.0 0.0 0.0 0.0 0.3 0.3 0.7 1.0 0.0 26.1 13 9 1.453 0 0
Stats after the M-th automatic compaction
2381 ** Compaction Stats [default] **
2382 Level Files Size(MB) Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
2383 ----------------------------------------------------------------------------------------------------------------------------------------------------------
2384 L0 4/0 151.79 1.0 0.0 0.0 0.0 3.3 3.3 0.0 0.0 0.0 26.1 131 90 1.452 0 0
2385 L1 12/0 221.16 0.9 1.1 0.6 0.6 0.9 0.3 2.6 1.6 23.0 18.4 51 12 4.240 35M 7054K
2386 L2 75/0 2533.24 1.0 0.4 0.2 0.2 0.2 0.1 2.5 1.1 22.4 13.6 18 4 4.604 12M 4428K
2387 L3 3/0 113.85 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0 0 0.000 0 0
2388 Sum 94/0 3020.04 0.0 1.5 0.8 0.8 4.5 3.7 5.3 1.3 7.9 23.0 200 106 1.887 48M 11M
2389 Int 0/0 0.00 0.0 0.0 0.0 0.0 0.3 0.3 0.8 1.0 0.0 24.7 14 9 1.533 0 0
Stats at the time of the first manual compaction (with no range set)
2608 ** Compaction Stats [default] **
2609 Level Files Size(MB) Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
2610 ----------------------------------------------------------------------------------------------------------------------------------------------------------
2611 L0 4/0 146.97 1.0 0.0 0.0 0.0 3.7 3.7 0.0 0.0 0.0 26.2 143 99 1.444 0 0
2612 L1 12/0 221.16 0.9 1.1 0.6 0.6 0.9 0.3 3.0 1.6 23.0 18.4 51 12 4.240 35M 7054K
2613 L2 75/0 2533.23 1.0 0.4 0.2 0.2 0.2 0.1 2.9 1.1 22.4 13.6 18 4 4.604 12M 4428K
2614 L3 12/0 455.40 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0.0 0.0 0.0 0 0 0.000 0 0
2615 Sum 103/0 3356.76 0.0 1.5 0.8 0.8 4.8 4.1 6.3 1.3 7.5 23.3 212 115 1.846 48M 11M
2616 Int 0/0 0.00 0.0 0.0 0.0 0.0 0.3 0.3 1.0 1.0 0.0 27.3 12 9 1.370 0 0
Stats after the manual compaction
** Compaction Stats [default] **
Level Files Size(MB) Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
L3 141/23 3050.66 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0.000 0 0
Sum 141/23 3050.66 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0.000 0 0
Int 0/0 0.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0.000 0 0
As the tables show, pika's manual compaction (with no range set) compacts every level regardless of the Score values and ends up pushing all data into the last level, whereas automatic compaction is mostly a local operation that only runs on a level whose Score exceeds 1.0.
2. bgsave
[bai.xin@bjdx-platform-storage003.test.bjdx.momo.com ~]$ redis-cli -p 6379
127.0.0.1:9559> bgsave
20200313124701 : 1551: 37924233
Pika's snapshot-style backup design and workflow: bookstack.cn/read/Pika-zh/23.md
Related bgsave logs:
pika log:
I0313 12:45:51.038194 184140 pika_server.cc:1596] Delete dir: /tmp/pika9559/db/zset_deleting start
I0313 12:45:51.119170 184140 pika_server.cc:1598] Delete dir: /tmp/pika9559/db/zset_deleting done
I0313 12:47:01.013185 74484 pika_server.cc:1000] after prepare bgsave
I0313 12:47:01.013231 74484 pika_server.cc:1003] bgsave_info: path=/tmp/pika9559/dump/20200313, filenum=1551, offset=37924233
I0313 12:47:01.017238 74484 pika_server.cc:1009] Create new backup finished.
rocksdb log:
2020/03/13-12:47:01.008921 7fb0e4a61700 File Deletions Disabled
2020/03/13-12:47:01.013393 7fb0e3a5f700 Started the snapshot process -- creating snapshot in directory /tmp/pika9559/dump/20200313/kv
2020/03/13-12:47:01.013469 7fb0e3a5f700 Hard Linking /000545.sst
2020/03/13-12:47:01.013506 7fb0e3a5f700 Hard Linking /000546.sst
2020/03/13-12:47:01.013524 7fb0e3a5f700 Hard Linking /000547.sst
2020/03/13-12:47:01.013544 7fb0e3a5f700 Hard Linking /000548.sst
2020/03/13-12:47:01.013564 7fb0e3a5f700 Hard Linking /000549.sst
2020/03/13-12:47:01.013582 7fb0e3a5f700 Hard Linking /000550.sst
2020/03/13-12:47:01.013600 7fb0e3a5f700 Hard Linking /000551.sst
2020/03/13-12:47:01.013617 7fb0e3a5f700 Hard Linking /000552.sst
2020/03/13-12:47:01.013653 7fb0e3a5f700 Hard Linking /000553.sst
2020/03/13-12:47:01.013673 7fb0e3a5f700 Hard Linking /000554.sst
2020/03/13-12:47:01.016101 7fb0e3a5f700 Hard Linking /000683.sst
2020/03/13-12:47:01.016115 7fb0e3a5f700 Hard Linking /000684.sst
2020/03/13-12:47:01.016128 7fb0e3a5f700 Hard Linking /000685.sst
…...
2020/03/13-12:47:01.016157 7fb0e3a5f700 Copying /MANIFEST-000687
2020/03/13-12:47:01.016253 7fb0e3a5f700 Copying /OPTIONS-000692
2020/03/13-12:47:01.016354 7fb0e3a5f700 Number of log files 0
2020/03/13-12:47:01.016358 7fb0e3a5f700 File Deletions Enabled
A dump/<backup-date> directory is created:
[root@bjdx-platform-storage003.test.bjdx.momo.com kv]# pwd
/tmp/pika9559/dump/20200313/kv
-rw-r--r-- 2 root root 23M 3月 12 20:02 000545.sst
......
-rw-r--r-- 2 root root 23M 3月 12 20:02 000682.sst
-rw-r--r-- 2 root root 23M 3月 12 20:02 000683.sst
-rw-r--r-- 2 root root 23M 3月 12 20:02 000684.sst
-rw-r--r-- 2 root root 20M 3月 12 20:02 000685.sst
Only the following files are actual copies; all SST files are hard links, which keeps the backup fast.
-rw-r--r-- 1 root root 4.1K 3月 13 12:47 OPTIONS-000692
-rw-r--r-- 1 root root 8.4K 3月 13 12:47 MANIFEST-000687
-rw-r--r-- 1 root root 16 3月 13 12:47 CURRENT
When a machine hosts many instances, bgsave can take a long time (possibly tens of seconds), because bgsave itself has to flush the memtable data to disk and gather some metadata statistics; this is something to keep in mind during operations.
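The hard-link mechanism shown above is also exposed directly by rocksdb through its Checkpoint utility; the sketch below only illustrates that mechanism (pika/nemo drives its own backup engine rather than this exact API, and the target directory here is just an example):
#include "rocksdb/utilities/checkpoint.h"
// db is an already-opened rocksdb::DB*
rocksdb::Checkpoint* checkpoint = nullptr;
rocksdb::Status s = rocksdb::Checkpoint::Create(db, &checkpoint);
if (s.ok()) {
  // SST files are hard-linked into the target directory; MANIFEST/CURRENT/OPTIONS are copied
  s = checkpoint->CreateCheckpoint("/tmp/pika9559/dump/20200313/kv");
}
delete checkpoint;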
3. Parsing the rocksdb LOG

Compaction Stats [default]

Compaction stats are the statistics of compactions between level N and N + 1, reported against the output level N + 1.
* Level: the LSM level. Sum is the total over all levels, and Int is the data accumulated since the last stats dump.
* Files: two values a/b, where a is the number of files currently at this level and b is the number of files at this level currently being compacted.
* Size(MB): total size of this level, in MB.
* Score: for every level except level 0, the score is (current level size) / (max level target size); values between 0 and 1 are normal, and a value greater than 1 means this level needs compaction. For level 0 the score is (current number of files) / (file number compaction trigger).
* Read(GB): total bytes read while compacting between level N and N + 1.
* Rn(GB): bytes read from level N while compacting between level N and N + 1.
* Rnp1(GB): bytes read from level N + 1 while compacting between level N and N + 1.
* Write(GB): total bytes written while compacting between level N and N + 1.
* Wnew(GB): new bytes written to level N + 1, computed as (total bytes written to N+1) - (bytes read from N+1 during compaction with level N).
* Moved(GB): bytes moved directly to level N + 1. Only the manifest is updated, with no other IO; it simply records that a file previously at level X is now at level Y.
* W-Amp: write amplification from level N to N + 1, computed as (total bytes written to level N+1) / (total bytes read from level N).
* Rd(MB/s): read rate while compacting between level N and N + 1, computed as Read(GB) * 1024 / duration, where duration is the time spent on those compactions.
* Wr(MB/s): write rate while compacting between level N and N + 1, analogous to Rd(MB/s).
* Comp(sec): total time spent compacting between level N and N + 1.
* Comp(cnt): total number of compactions between level N and N + 1.
* Avg(sec): average time per compaction between level N and N + 1.
* KeyIn: number of records compared during compactions between level N and N + 1.
* KeyDrop: number of records dropped (not written out) during compactions between level N and N + 1.
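The same statistics block can also be pulled from a running instance instead of grepping the LOG; a minimal sketch using rocksdb's built-in "rocksdb.stats" property (db is assumed to be an already-opened rocksdb::DB*):
std::string stats;
if (db->GetProperty("rocksdb.stats", &stats)) {
  // prints the same Compaction Stats / DB Stats block that rocksdb periodically writes to its LOG
  printf("%s\n", stats.c_str());
}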
Other compaction stats
Uptime(secs): 727897.9 total, 727897.9 interval
Flush(GB): cumulative 0.778 (total data flushed since startup), interval 0.296 (data flushed in this interval)
AddFile(GB): cumulative 0.000 (total data ingested via fastload/AddFile), interval 0.000 (data ingested in this interval)
AddFile(Total Files): cumulative 0 (total number of files ingested via fastload), interval 0 (files ingested in this interval)
AddFile(L0 Files): cumulative 0 (total number of L0 files ingested via fastload), interval 0 (L0 files ingested in this interval)
AddFile(Keys): cumulative 0 (total number of keys ingested via fastload), interval 0 (keys ingested in this interval)
Cumulative compaction: 1.42 GB write (bytes written by compaction, including memtable flushes to L0), 0.00 MB/s write, 0.81 GB read (bytes read by compaction; a memtable flush to L0 reads nothing), 0.00 MB/s read, 69.6 seconds
Interval compaction: 0.94 GB write, 0.00 MB/s write, 0.81 GB read, 0.00 MB/s read, 45.7 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_co
DB Stats
Uptime(secs): 727897.9 total (seconds since stats collection started), 600.1 interval (seconds since the previous stats dump)
The following three lines are cumulative values:
Cumulative writes: 24M writes (total write batches), 24M keys (total keys written), 24M commit groups (number of group commits), 1.0 writes per commit group (average batches per commit group), ingest: 9.94 GB (total bytes written), 0.01 MB/s (average write throughput)
Cumulative WAL: 24M writes (writes to the WAL), 0 syncs (number of synced writes), 24253628.00 writes per sync (writes per sync), written: 9.94 GB (total bytes written to WAL files), 0.01 MB/s (average WAL write throughput)
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent (cumulative time writes were stalled)
Interval writes: 9224K writes, 9224K keys, 9224K commit groups, 1.0 writes per commit group, ingest: 3870.93 MB, 6.45 MB/s
Interval WAL: 9224K writes, 0 syncs, 9224928.00 writes per sync, written: 3.78 MB, 6.45 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
Read performance statistics
The weak point of an LSM-tree is reads: even with all kinds of indexes (manifest, block index, table cache, block cache, bloom filter, and so on), read latency cannot be fully avoided, especially when reading data from the higher levels.
The statistics below show that the higher the level, the higher the read latency, and the wider its distribution.
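These per-level read latency histograms are part of rocksdb's periodic stats dump; as a sketch (and an assumption about how this particular instance was configured, since pika sets this through its own option handling), collecting them requires the statistics object to be enabled:
rocksdb::Options options;
options.statistics = rocksdb::CreateDBStatistics();  // collect detailed counters and histograms (adds some overhead)
options.stats_dump_period_sec = 600;                 // dump the stats block to the LOG every 10 minutes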
level 0
** Level 0 read latency histogram (micros):
Count: 28642373 Average: 1.5582 StdDev: 28.81
Min: 0 Median: 0.8165 Max: 64775
Percentiles: P50: 0.82 P75: 1.45 P99: 4.34 P99.9: 8.60 P99.99: 37.27
------------------------------------------------------
[ 0, 1 ) 17540308 61.239% 61.239% ############
[ 1, 2 ) 8788112 30.682% 91.921% ######
[ 2, 3 ) 1583660 5.529% 97.450% #
level 1
** Level 1 read latency histogram (micros):
Count: 37969465 Average: 1.8297 StdDev: 72.73
Min: 0 Median: 0.9116 Max: 72825
Percentiles: P50: 0.91 P75: 1.64 P99: 4.04 P99.9: 10.51 P99.99: 139.41
------------------------------------------------------
[ 0, 1 ) 20825651 54.848% 54.848% ###########
[ 1, 2 ) 11887833 31.309% 86.157% ######
[ 2, 3 ) 3928792 10.347% 96.505% ##
[ 7, 8 ) 15932 0.042% 99.847%
[ 8, 9 ) 9510 0.025% 99.872%
[ 9, 10 ) 7222 0.019% 99.891%
[ 10, 12 ) 13102 0.035% 99.926%
[ 12, 14 ) 7419 0.020% 99.945%
[ 3500, 4000 ) 4 0.000% 99.999%
[ 4000, 4500 ) 4 0.000% 99.999%
[ 4500, 5000 ) 2 0.000% 99.999%
[ 5000, 6000 ) 4 0.000% 99.999%
[ 6000, 7000 ) 6 0.000% 99.999%
[ 30000, 35000 ) 5 0.000% 100.000%
[ 35000, 40000 ) 9 0.000% 100.000%
[ 40000, 45000 ) 8 0.000% 100.000%
[ 45000, 50000 ) 11 0.000% 100.000%
[ 50000, 60000 ) 16 0.000% 100.000%
[ 60000, 70000 ) 7 0.000% 100.000%
[ 70000, 80000 ) 2 0.000% 100.000%
level 2
** Level 2 read latency histogram (micros):
Count: 225317843 Average: 1.4581 StdDev: 15.80
Min: 0 Median: 0.7603 Max: 80588
Percentiles: P50: 0.76 P75: 1.33 P99: 3.25 P99.9: 8.89 P99.99: 62.49
-------------------------------------------------------
[ 12000, 14000 ) 48 0.000% 100.000%
[ 14000, 16000 ) 24 0.000% 100.000%
[ 16000, 18000 ) 30 0.000% 100.000%
[ 18000, 20000 ) 8 0.000% 100.000%
[ 20000, 25000 ) 38 0.000% 100.000%
[ 25000, 30000 ) 10 0.000% 100.000%
[ 30000, 35000 ) 10 0.000% 100.000%
[ 35000, 40000 ) 8 0.000% 100.000%
[ 40000, 45000 ) 4 0.000% 100.000%
[ 50000, 60000 ) 4 0.000% 100.000%
[ 60000, 70000 ) 3 0.000% 100.000%
[ 70000, 80000 ) 3 0.000% 100.000%
[ 80000, 90000 ) 1 0.000% 100.000%
level 3
** Level 3 read latency histogram (micros):
Count: 54961965 Average: 1.6417 StdDev: 54.41
Min: 0 Median: 0.6418 Max: 105284
Percentiles: P50: 0.64 P75: 0.96 P99: 2.98 P99.9: 54.35 P99.99: 246.44
------------------------------------------------------
[ 40000, 45000 ) 4 0.000% 100.000%
[ 45000, 50000 ) 1 0.000% 100.000%
[ 50000, 60000 ) 1 0.000% 100.000%
[ 90000, 100000 ) 1 0.000% 100.000%
[ 100000, 120000 ) 2 0.000% 100.000%
level 4
** Level 4 read latency histogram (micros):
Count: 235086 Average: 103.2528 StdDev: 1140.13
Min: 1 Median: 4.0320 Max: 87686
Percentiles: P50: 4.03 P75: 5.94 P99: 444.43 P99.9: 16239.25 P99.99: 54327.60
------------------------------------------------------
[ 60000, 70000 ) 7 0.003% 99.997%
[ 70000, 80000 ) 3 0.001% 99.998%
[ 80000, 90000 ) 5 0.002% 100.000%