美文网首页
greenplum tpch测试记录

greenplum tpch测试记录

作者: nagioswork | 来源:发表于2016-08-19 17:44 被阅读1479次

    1、到tpc组织下载dbgen
    http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp
    需要注册一下,tpc会发生一封邮件给出一个下载地址
    2、编译tpch和生成测试数据
    <pre>
    [gpadmin@gp-node0 dbgen]$ unzip c1ac817c-d2f5-4a24-b7bd-c1f84e75949e-tpc-h-tool.zip
    [gpadmin@gp-node0 tpch]$ cd tpch_2_17_0/
    [gpadmin@gp-node0 tpch_2_17_0]$ ll
    total 8
    drwxrwxr-x. 8 gpadmin gpadmin 4096 May 5 2014 dbgen
    drwxrwxr-x. 2 gpadmin gpadmin 33 May 5 2014 dev-tools
    drwxrwxr-x. 10 gpadmin gpadmin 4096 May 5 2014 ref_data
    [gpadmin@gp-node0 tpch_2_17_0]$ cd dbgen/
    [gpadmin@gp-node0 dbgen]$ ll

    [gpadmin@gp-node0 dbgen]$cp makefile.suite Makefile

    [gpadmin@gp-node0 dbgen]$vi Makefile
    第109行,设置
    CC = gcc

    DATABASE= ORACLE
    MACHINE = LINUX
    WORKLOAD = TPCH

    [gpadmin@gp-node0 dbgen]$ make
    生成sf=100的数据
    [gpadmin@gp-node0 dbgen]$ ./dbgen -s 100 -f
    TPC-H Population Generator (Version 2.17.0)
    Copyright Transaction Processing Performance Council 1994 - 2010

    </pre>
    3、将测试数据转换为GP识别的格式,删除末尾的分隔符|。
    <pre>
    $for i in ls *.tbl; do sed 's/|$//' $i > ${i/tbl/csv}; done
    </pre>
    4、下载tpch,采用digoal修改过适应GP的脚本
    <pre>
    git clone https://github.com/digoal/pg_tpch.git
    cd pg_tpch
    </pre>
    修改tpch-load.sql中指向的文件路径
    采用vi查找替换命令
    :%s+/tmp/dss-data+/data/tpch/csv100+
    <pre>
    [gpadmin@gp-node0 dss]$ more tpch-load.sql
    BEGIN;

        CREATE TABLE PART (
                P_PARTKEY               SERIAL8,
                P_NAME                  VARCHAR(55),
                P_MFGR                  CHAR(25),
                P_BRAND                 CHAR(10),
                P_TYPE                  VARCHAR(25),
                P_SIZE                  INTEGER,
                P_CONTAINER             CHAR(10),
                P_RETAILPRICE   DECIMAL,
                P_COMMENT               VARCHAR(23)
        ) with (APPENDONLY=true,BLOCKSIZE=2097152,ORIENTATION=COLUMN,CHECKSUM=true,OIDS=false) DISTRIBUTED BY (p_partkey);
    
        COPY part FROM '/data/tpch/csv100/part.csv' WITH csv DELIMITER '|';
    

    COMMIT;
    ......
    </pre>
    将pg_tpch的文件都拷贝到dbgen所在的目录:
    <pre>
    $cp -r pg_tpch/* tpch/tpch_2_17_0/dbgen/
    $cd tpch/tpch_2_17_0/dbgen
    </pre>
    创建一个queries目录,用于存放转换后的tpc-h 测试SQL。
    <pre>
    [gpadmin@gp-node0 dbgen]$ SF=100
    [gpadmin@gp-node0 dbgen]$ mkdir dss/queries
    mkdir: cannot create directory ‘dss/queries’: File exists
    [gpadmin@gp-node0 dbgen]$ for q in seq 1 22

    do
    DSS_QUERY=dss/templates ./qgen -s $SF $q > dss/queries/$q.sql
    sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql
    done
    [gpadmin@gp-node0 dbgen]$
    </pre>
    5、创建数据库用户
    <pre>
    [gpadmin@gp-node0 pg_tpch]$ psql tpch
    psql (8.2.15)
    Type "help" for help.

    tpch=# CREATE ROLE "tpchuser" login PASSWORD 'tpchuser' SUPERUSER NOINHERIT;
    </pre>
    设置几个参数:
    <pre>
    [gpadmin@gp-node0 pg_tpch]$ gpconfig -c enable_nestloop -v off
    20160819:17:13:44:021871 gpconfig:gp-node0:gpadmin-[INFO]:-completed successfully
    [gpadmin@gp-node0 pg_tpch]$ gpconfig -c work_mem -v 256MB
    20160819:17:13:57:022043 gpconfig:gp-node0:gpadmin-[INFO]:-completed successfully
    [gpadmin@gp-node0 pg_tpch]$ gpstop -u
    20160819:17:14:00:022187 gpstop:gp-node0:gpadmin-[INFO]:-Starting gpstop with args: -u
    20160819:17:14:00:022187 gpstop:gp-node0:gpadmin-[INFO]:-Gathering information and validating the environment...
    20160819:17:14:00:022187 gpstop:gp-node0:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
    20160819:17:14:00:022187 gpstop:gp-node0:gpadmin-[INFO]:-Obtaining Segment details from master...
    20160819:17:14:02:022187 gpstop:gp-node0:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 4.3.9.0 build 1'
    20160819:17:14:02:022187 gpstop:gp-node0:gpadmin-[INFO]:-Signalling all postmaster processes to reload
    ..
    [gpadmin@gp-node0 pg_tpch]$
    </pre>
    6、执行测试
    <pre>
    [gpadmin@gp-node0 dbgen]$ ./tpch.sh ./results tpch tpchtest
    vmstat: /usr/local/greenplum-db/./lib/libz.so.1: no version information available (required by /lib64/libdw.so.1)

    ./tpch.sh: line 166: kill: (30294) - No such process
    ./tpch.sh: line 189: 24564 Terminated vmstat $DELAY >> $RESULTS/vmstat.log
    </pre>
    7、使用以下方法生成CSV报告。
    <pre>
    [gpadmin@gp-node0 dbgen]$ php process.php ./results output.csv
    input directory: ./results
    output file: output.csv
    [root@gp-node0 dbgen]# vi output.csv
    tpch_load;tpch_pkeys;tpch_fkeys;tpch_indexes;tpch_analyze;tpch_total;query_1;query_2;query_3;query_4;query_5;query_6;query_7;query_8;query_9;query_10;query_11;query_12;query_13;query_14;query_15;query_16;query_17;query_18;query_19;query_20;query_21;query_2
    2;query_1_hash;query_2_hash;query_3_hash;query_4_hash;query_5_hash;query_6_hash;query_7_hash;query_8_hash;query_9_hash;query_10_hash;query_11_hash;query_12_hash;query_13_hash;query_14_hash;query_15_hash;query_16_hash;query_17_hash;query_18_hash;query_19_ha
    sh;query_20_hash;query_21_hash;query_22_hash;db_cache_hit_ratio
    4.00;0.00;1.00;0.00;127.00;763.00;40.55;10.04;34.40;2.32;3.09;0.02;89.88;27.56;100.19;3.64;5.09;1.33;16.33;1.02;1.04;8.66;87.57;131.51;38.19;5.01;128.12;11.84;afc07662ad7b9f84497d46ec3e2982ca;58c9a2cadeac374f8d4f140f99edb6aa;2fc3f1557173580fea23723865a4b68
    e;cd1b53d233e808cdcdedbcaaa1878596;65363ff4ff7557e250b34f7600b28c03;064cb1b93292905a03b1ae773255c930;9af0a7cf3c4836428ea37bc04f73f9a5;51c9d36bdece8ebf910c55547c07a6cf;fc1f609fda8b3c83013ebc33e7daeb68;1d90d4fdce96f68f67ef2ca88376f969;c2994eb3044e9ad1668608a
    ef389f655;70eee67cbec1a9e64f3b55902c498dfb;509f9f92f0c59402422c8c6f82195134;fabf21ebb069cabf9d211643fe29bedf;734d0ea2c20114d51cc7b2f27c96b687;fe286ea7839d129719aff3afd4bb94b4;90dd1b04700d9154a2d741e090c4c709;146b5bc30d8f7ba184b7a81c360676b1;660bd16ff9a6f85
    1d853e1d5530cd61f;12558d5227c3d83a0043255f46b351a2;6ec86cb3d0faa18fbae06cc8dcd5a245;7a6756e65d19c578f988901faa5ef3b4;0.00;
    </pre>

    8、php安装
    <pre>
    [root@gp-node0 dbgen]# yum -y install php
    </prte>
    如果“ImportError: No module named site”的错误,是因为greenplum修改了Python的PATH
    设置一下环境变量:
    export PYTHONPATH=/usr/lib64/python2.7:$PYTHONPATH

    相关文章

      网友评论

          本文标题:greenplum tpch测试记录

          本文链接:https://www.haomeiwen.com/subject/ynvbsttx.html