HBase Table Design - Salted Table

Author: 诺之林 | Published: 2020-12-28 16:31

    Outline of this article: Concept => Partitioning => Principle => Advantages => Disadvantages => Tablestore

    The examples in this article are built on Phoenix

    Concept

    • Sequential writes in HBase may suffer from region server hotspotting if the row key is monotonically increasing

    • Salting the row key provides a way to mitigate the problem

    • Phoenix provides a way to transparently salt the row key with a salting byte for a particular table. You specify this at table creation time by setting the table property “SALT_BUCKETS” to a value from 1 to 256

    Partitioning

    CREATE TABLE IF NOT EXISTS t_normal (
        id VARCHAR PRIMARY KEY,
        name VARCHAR,
        age INTEGER,
        address VARCHAR
    );
    
    UPSERT INTO t_normal VALUES('id1', 'XiaoWang', 22, 'London');
    
    UPSERT INTO t_normal VALUES('id2', 'XiaoWeng', 18, 'New York');
    
    ./hbase-2.0.0/bin/hbase shell
    
    scan 'T_NORMAL'
    
    ROW         COLUMN+CELL
     id1        column=0:\x00\x00\x00\x00, timestamp=1609143024854, value=x
     id1        column=0:\x80\x0B, timestamp=1609143024854, value=XiaoWang
     id1        column=0:\x80\x0C, timestamp=1609143024854, value=\x80\x00\x00\x16
     id1        column=0:\x80\x0D, timestamp=1609143024854, value=London
     id2        column=0:\x00\x00\x00\x00, timestamp=1609143028213, value=x
     id2        column=0:\x80\x0B, timestamp=1609143028213, value=XiaoWeng
     id2        column=0:\x80\x0C, timestamp=1609143028213, value=\x80\x00\x00\x12
     id2        column=0:\x80\x0D, timestamp=1609143028213, value=New York
    2 row(s)
    Took 0.1361 seconds
    
    CREATE TABLE IF NOT EXISTS t_salt (
        id VARCHAR PRIMARY KEY,
        name VARCHAR,
        age INTEGER,
        address VARCHAR
    ) SALT_BUCKETS = 5;
    
    UPSERT INTO t_salt VALUES('id1', 'XiaoWang', 22, 'London');
    
    UPSERT INTO t_salt VALUES('id2', 'XiaoWeng', 18, 'New York');
    
    ./hbase-2.0.0/bin/hbase shell
    
    scan 'T_SALT'
    
    ROW         COLUMN+CELL
     \x00id1    column=0:\x00\x00\x00\x00, timestamp=1609143085641, value=x
     \x00id1    column=0:\x80\x0B, timestamp=1609143085641, value=XiaoWang
     \x00id1    column=0:\x80\x0C, timestamp=1609143085641, value=\x80\x00\x00\x16
     \x00id1    column=0:\x80\x0D, timestamp=1609143085641, value=London
     \x01id2    column=0:\x00\x00\x00\x00, timestamp=1609143089175, value=x
     \x01id2    column=0:\x80\x0B, timestamp=1609143089175, value=XiaoWeng
     \x01id2    column=0:\x80\x0C, timestamp=1609143089175, value=\x80\x00\x00\x12
     \x01id2    column=0:\x80\x0D, timestamp=1609143089175, value=New York
    2 row(s)
    Took 0.5924 seconds
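
A note for reading the scan output above: the age column shows up as \x80\x00\x00\x16 rather than 22. To my understanding, Phoenix serializes INTEGER as a 4-byte big-endian value with the sign bit flipped, so that byte order matches numeric order. A minimal sketch under that assumption (the helper name `decode_phoenix_integer` is hypothetical, not a Phoenix API):

```python
def decode_phoenix_integer(raw: bytes) -> int:
    """Decode a 4-byte value, assuming big-endian encoding with the sign bit flipped."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    # Flipping the sign bit is the same as adding 2**31 to the signed value,
    # so subtracting 2**31 from the unsigned reading recovers the original integer.
    return int.from_bytes(raw, "big") - (1 << 31)


if __name__ == "__main__":
    print(decode_phoenix_integer(b"\x80\x00\x00\x16"))  # age stored for id1
    print(decode_phoenix_integer(b"\x80\x00\x00\x12"))  # age stored for id2
```

Under this assumption the two values decode to the ages that were upserted (22 and 18).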
    

    Principle

    new_row_key = (++index % BUCKETS_NUMBER) + original_row_key
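
The formula above is the conceptual form. The prefixes in the scan output (\x00 for id1, \x01 for id2) are consistent with the salt byte being derived from a hash of the row key modulo SALT_BUCKETS, which keeps the prefix deterministic so that a point lookup can recompute it. A minimal sketch, assuming a 31-multiplier rolling hash over the key bytes with Java-style 32-bit overflow (this is my reading of how Phoenix behaves, not a verified copy of its code; all function names are hypothetical):

```python
def rolling_hash_31(key: bytes) -> int:
    """31-multiplier rolling hash over the key bytes, wrapped to a signed 32-bit int."""
    h = 1
    for b in key:
        # Java bytes are signed, so map 0..255 onto -128..127 first.
        signed = b - 256 if b > 127 else b
        # Keep only the low 32 bits to imitate Java int overflow.
        h = (31 * h + signed) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int.
    return h - (1 << 32) if h >= (1 << 31) else h


def salt_byte(key: bytes, buckets: int) -> int:
    """Bucket index in [0, buckets), assumed to be |hash| modulo the bucket count."""
    return abs(rolling_hash_31(key)) % buckets


def salted_row_key(key: bytes, buckets: int) -> bytes:
    """Prepend the one-byte salt to the original row key."""
    return bytes([salt_byte(key, buckets)]) + key


if __name__ == "__main__":
    for row_key in (b"id1", b"id2"):
        print(salted_row_key(row_key, 5))
```

With SALT_BUCKETS = 5 this sketch yields \x00id1 and \x01id2, the same row keys as in the T_SALT scan.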
    

    Advantages

    • Using a salted table with pre-splitting helps distribute the write workload uniformly across all region servers, which improves write performance

    • Reads from a salted table can also benefit from the more uniform distribution of data

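    Disadvantages

    • A salted table does not store rows in their natural key order, so a strictly sequential scan does not return data sorted by the original row key. To return ordered results, Phoenix runs a parallel scan across all region servers and performs a merge sort on the client side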
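
Each bucket is sorted internally, but the buckets interleave the original key order, so an ordered read has to merge the per-bucket results. A minimal sketch of that client-side merge, assuming each bucket scan yields salted keys in sorted order (the sample data and function names are made up for illustration, this is not Phoenix code):

```python
import heapq
from typing import Iterable, Iterator, List


def strip_salt(rows: Iterable[bytes]) -> Iterator[bytes]:
    """Drop the leading one-byte salt from every row key of one bucket."""
    for key in rows:
        yield key[1:]


def merged_scan(bucket_results: List[List[bytes]]) -> List[bytes]:
    """Merge per-bucket sorted results into one list ordered by original row key."""
    streams = [strip_salt(rows) for rows in bucket_results]
    # heapq.merge lazily merges inputs that are already sorted.
    return list(heapq.merge(*streams))


if __name__ == "__main__":
    buckets = [
        [b"\x00id1", b"\x00id6"],
        [b"\x01id2"],
        [b"\x02id3", b"\x02id8"],
    ]
    print(merged_scan(buckets))
```

The cost is that even a narrow range scan has to touch every bucket, since the matching rows may sit in any of them.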

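    Tablestore

    • The primary key uniquely identifies each row in a data table and consists of 1 to 4 primary key columns

    • The first primary key column is also called the partition key

    Tablestore automatically assigns each row to a partition and a machine according to the range that the row's partition key value falls into, in order to balance load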
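
Range-based assignment by partition key can be sketched as a binary search over split points. This is only an illustration: the split points and the function name `partition_for` are hypothetical, and the real service picks and adjusts its ranges on its own.

```python
from bisect import bisect_right
from collections import Counter
from typing import Sequence


def partition_for(partition_key: str, split_points: Sequence[str]) -> int:
    """Index of the range a key falls into; split_points must be sorted.

    With n split points there are n + 1 partitions, and partition i covers
    keys in [split_points[i - 1], split_points[i]).
    """
    return bisect_right(split_points, partition_key)


if __name__ == "__main__":
    splits = ["g", "n", "t"]  # hypothetical boundaries, giving 4 partitions
    keys = [f"id{i}" for i in range(1, 9)]
    # Keys sharing a prefix all fall into the same range.
    print(Counter(partition_for(k, splits) for k in keys))
```

Keys that share a prefix or increase monotonically land in a single range, which is the same hotspot that salting is meant to avoid in HBase.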
    

    Article link: https://www.haomeiwen.com/subject/yevjnktx.html