Universal Compaction 是 RocksDB 支持的另一种 compaction 方式,特点是降低写放大,牺牲读放大和空间放大。使用 Universal Compaction 的 RocksDB 实例,可以看作是在时间上将数据划分到不同的 sorted run,每个 sorted run 在时间上互不交叠。compaction 仅仅针对在时间上相邻的 sorted run 进行,其输出的时间范围是原输入的时间范围的组合。
Compaction Picking
R1, R2, R3, ..., Rn
为了便于说明,以如上的包括 n 个 sorted run 的数据集为例,sorted run 按照数据时间顺序排序。由于 compaction 输出与输入的时间范围相同,compaction 操作不会改变 sorted run 的排序。
Precondition
Compaction 触发前置条件:n >= options.level0_file_num_compaction_trigger,即 sorted run 的数量达到阈值。所有的 compaction 须在满足前置条件的前提下触发。
Compaction Triggered by Space Amplification
size amplification ratio = (size(R1) + size(R2) + ... size(Rn-1)) / size(Rn)
估算 size amplification ratio,超过 options.compaction_options_universal.max_size_amplification_percent / 100 后,所有的文件都会执行 compaction,得到位于最底层的新的 sorted run。
1
1 1 => 2
1 2 => 3
1 3 => 4
1 4
1 1 4 => 6
1 6
1 1 6 => 8
1 8
1 1 8
1 1 1 8 => 11
1 11
1 1 11
1 1 1 11 => 14
1 14
1 1 14
1 1 1 14
1 1 1 1 14 => 18
以 options.compaction_options_universal.max_size_amplification_percent = 25,options.level0_file_num_compaction_trigger = 1 的配置参数为例,compaction 过程如上所示。
Compaction* UniversalCompactionBuilder::PickCompactionToReduceSizeAmp() {
// percentage flexibility while reducing size amplification
uint64_t ratio = mutable_cf_options_.compaction_options_universal
.max_size_amplification_percent;
...
// keep adding up all the remaining files
for (size_t loop = start_index; loop + 1 < sorted_runs_.size(); loop++) {
sr = &sorted_runs_[loop];
if (sr->being_compacted) {
char file_num_buf[kFormatFileNumberBufSize];
sr->Dump(file_num_buf, sizeof(file_num_buf), true);
ROCKS_LOG_BUFFER(
log_buffer_, "[%s] Universal: Possible candidate %s[%d] %s",
cf_name_.c_str(), file_num_buf, start_index,
" is already being compacted. No size amp reduction possible.\n");
return nullptr;
}
// 计算活跃数据集大小(除最大的一个 sorted run 之外的数据集大小)
candidate_size += sr->compensated_file_size;
candidate_count++;
}
// size of earliest file
uint64_t earliest_file_size = sorted_runs_.back().size;
// 计算活跃数据集和稳定数据集大小之比
if (candidate_size * 100 < ratio * earliest_file_size) {
ROCKS_LOG_BUFFER(
log_buffer_,
"[%s] Universal: size amp not needed. newer-files-total-size %" PRIu64
" earliest-file-size %" PRIu64,
cf_name_.c_str(), candidate_size, earliest_file_size);
return nullptr;
} else {
ROCKS_LOG_BUFFER(
log_buffer_,
"[%s] Universal: size amp needed. newer-files-total-size %" PRIu64
" earliest-file-size %" PRIu64,
cf_name_.c_str(), candidate_size, earliest_file_size);
}
...
}
Compaction Triggered by Individual Size Ratio
size_ratio_trigger = (100 + options.compaction_options_universal.size_ratio) / 100
计算 R(i) / sum(R(i-1)),若结果小于 size_ratio_trigger,则执行以序号不大于 i 的所有 sorted run 为输入的 compaction。
1 1 1 1 1 => 5
1 5
1 1 5
1 1 1 5
1 1 1 1 5 => 4 5
1 4 5
1 1 4 5
1 1 1 4 5 => 3 4 5
1 3 4 5
1 1 3 4 5 => 2 3 4 5
以 options.compaction_options_universal.size_ratio = 0,options.level0_file_num_compaction_trigger = 5 参数配置为例,compaction 过程如上所示。
Compaction* UniversalCompactionBuilder::PickCompactionToReduceSortedRuns(
...
// Check if the succeeding files need compaction.
for (size_t i = loop + 1;
candidate_count < max_files_to_compact && i < sorted_runs_.size();
i++) {
const SortedRun* succeeding_sr = &sorted_runs_[i];
if (succeeding_sr->being_compacted) {
break;
}
// Pick files if the total/last candidate file size (increased by the
// specified ratio) is still larger than the next candidate file.
// candidate_size is the total size of files picked so far with the
// default kCompactionStopStyleTotalSize; with
// kCompactionStopStyleSimilarSize, it's simply the size of the last
// picked file.
double sz = candidate_size * (100.0 + ratio) / 100.0;
if (sz < static_cast<double>(succeeding_sr->size)) {
break;
}
if (mutable_cf_options_.compaction_options_universal.stop_style ==
kCompactionStopStyleSimilarSize) {
// Similar-size stopping rule: also check the last picked file isn't
// far larger than the next candidate file.
//计算 R(i) / sum(R(i-1))
sz = (succeeding_sr->size * (100.0 + ratio)) / 100.0;
if (sz < static_cast<double>(candidate_size)) {
// If the small file we've encountered begins a run of similar-size
// files, we'll pick them up on a future iteration of the outer
// loop. If it's some lonely straggler, it'll eventually get picked
// by the last-resort read amp strategy which disregards size ratios.
break;
}
candidate_size = succeeding_sr->compensated_file_size;
} else { // default kCompactionStopStyleTotalSize
candidate_size += succeeding_sr->compensated_file_size;
}
candidate_count++;
}
}
Compaction Triggered by Number of Sorted runs
在每次尝试进行 compaction,sorted run 数量超过 options.level0_file_num_compaction_trigger,但由于不满足上述两种 compaction 的条件而无法进行之前所述的 compaction 时,尝试进行 sorted run 数量不超过 options.compaction_options_universal.max_merge_width 的 compaction,以使得 sorted run 数量少于 options.level0_file_num_compaction_trigger。
网友评论