美文网首页
HBase表设计 - 二级索引Secondary Indexin

HBase表设计 - 二级索引Secondary Indexin

作者: 诺之林 | 来源:发表于2020-12-22 11:49 被阅读0次

    本文的主线 概念 => 主键 => 配置 => 索引

    本文基于HBase表设计 - 生存时间Time To Live

    概念

    • Secondary indexes are an orthogonal way to access data from its primary access path.

    • In HBase, you have a single index that is lexicographically sorted on the primary row key. Access to records in any way other than through the primary row requires scanning over potentially all the rows in the table to test them against your filter.

    • With secondary indexing, the columns or expressions you index form an alternate row key to allow point lookups and range scans along this new axis.

    主键

    • 单一主键
    CREATE TABLE IF NOT EXISTS t_gps (
        row_key VARCHAR NOT NULL PRIMARY KEY,
        id BIGINT,
        city VARCHAR,
        created_ts BIGINT
    );
    
    UPSERT INTO t_gps VALUES('3001-1608605947',3001,'beijing',1608605947);
    
    UPSERT INTO t_gps VALUES('3001-1608615947',3001,'shanghai',1608615947);
    
    EXPLAIN SELECT * FROM t_gps WHERE row_key > '3001-1608605947';
    
    +--------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    |                                            PLAN                                            | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
    +--------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER T_GPS ['3001-1608605947'] - [*]  | null            | null           | null         |
    +--------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    1 row selected (0.018 seconds)
    
    • 复合主键
    DROP TABLE t_gps;
    
    CREATE TABLE IF NOT EXISTS t_gps (
        id BIGINT NOT NULL,
        city VARCHAR,
        created_ts BIGINT NOT NULL,
        CONSTRAINT pk PRIMARY KEY (id,created_ts)
    );
    
    UPSERT INTO t_gps VALUES(3001,'beijing',1608605947);
    
    UPSERT INTO t_gps VALUES(3001,'shanghai',1608615947);
    
    EXPLAIN SELECT * FROM t_gps WHERE id = 3001 AND created_ts > 1608605947;
    
    +-----------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    |                                             PLAN                                              | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
    +-----------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER T_GPS [3001,1608605948] - [3001,*]  | null            | null           | null         |
    +-----------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    1 row selected (0.017 seconds)
    

    配置

    vim ~/hbase-2.0.0/conf/hbase-site.xml
    
    <configuration>
        <property>
            <name>hbase.regionserver.wal.codec</name>
            <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
        </property>
    </configuration>
    
    ~/hbase-2.0.0/bin/stop-hbase.sh
    
    ~/hbase-2.0.0/bin/start-hbase.sh
    

    索引

    EXPLAIN SELECT * FROM t_gps WHERE created_ts > 1608605947;
    
    +-----------------------------------------------------------------+-----------------+----------------+--------------+
    |                              PLAN                               | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
    +-----------------------------------------------------------------+-----------------+----------------+--------------+
    | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER T_GPS  | null            | null           | null         |
    |     SERVER FILTER BY CREATED_TS > 1608605947                    | null            | null           | null         |
    +-----------------------------------------------------------------+-----------------+----------------+--------------+
    2 rows selected (0.017 seconds)
    
    CREATE INDEX IF NOT EXISTS ts_idx ON t_gps (created_ts DESC);
    
    EXPLAIN SELECT * FROM t_gps WHERE created_ts > 1608605947;
    
    +-----------------------------------------------------------------+-----------------+----------------+--------------+
    |                              PLAN                               | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
    +-----------------------------------------------------------------+-----------------+----------------+--------------+
    | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER T_GPS  | null            | null           | null         |
    |     SERVER FILTER BY CREATED_TS > 1608605947                    | null            | null           | null         |
    +-----------------------------------------------------------------+-----------------+----------------+--------------+
    2 rows selected (0.018 seconds)
    
    DROP INDEX ts_idx ON t_gps;
    
    CREATE INDEX IF NOT EXISTS ts_idx ON t_gps (created_ts DESC) INCLUDE (id, city);
    
    EXPLAIN SELECT * FROM t_gps WHERE created_ts > 1608605947;
    
    +---------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    |                                         PLAN                                          | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
    +---------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TS_IDX [*] - [~1608605947]  | null            | null           | null         |
    +---------------------------------------------------------------------------------------+-----------------+----------------+--------------+
    1 row selected (0.023 seconds)
    

    参考

    相关文章

      网友评论

          本文标题:HBase表设计 - 二级索引Secondary Indexin

          本文链接:https://www.haomeiwen.com/subject/hyvjnktx.html