首先咱们找到的 Star Schema Benchmark 的描述文档(点击可下载),咱们最开始找着的 Starrock 官网的一份关于 Star Schema Benchmark 的测试报告。
主要参考这两篇文档,修改部分 SQL 关键字来建数据库的的模式(Schema)和表(Table):
CREATE USER ssb100 IDENTIFIED BY ssb1;
CREATE SCHEMA ssb100;
GRANT ALL ON ssb100.* TO ssb100;
CREATE DIMENSION TABLE CUSTOMER(
C_CUSTKEY INTEGER PRIMARY KEY,
C_NAME VARCHAR(25),
C_ADDRESS VARCHAR(25),
C_CITY VARCHAR(10),
C_NATION VARCHAR(15),
C_REGION VARCHAR(12),
C_PHONE VARCHAR(15),
C_MKTSEGMENT VARCHAR(10)
)
CREATE DIMENSION TABLE DATES(
D_DATEKEY INTEGER PRIMARY KEY,
D_DATE VARCHAR(18),
D_DAYOFWEEK VARCHAR(18),
D_MONTH VARCHAR(9),
D_YEAR INTEGER,
D_YEARMONTHNUM INTEGER,
D_YEARMONTH VARCHAR(7),
D_DAYNUMINWEEK INTEGER,
D_DAYNUMINMONTH INTEGER,
D_DAYNUMINYEAR INTEGER,
D_MONTHNUMINYEAR INTEGER,
D_WEEKNUMINYEAR INTEGER,
D_SELLINGSEASON VARCHAR(12),
D_LASTDAYINWEEKFL INTEGER,
D_LASTDAYINMONTHFL INTEGER,
D_HOLIDAYFL INTEGER,
D_WEEKDAYFL INTEGER
)
CREATE DIMENSION TABLE PART(
P_PARTKEY INTEGER PRIMARY KEY,
P_NAME VARCHAR(22) ,
P_MFGR VARCHAR(6),
P_CATEGORY VARCHAR(7),
P_BRAND VARCHAR(9),
P_COLOR VARCHAR(11),
P_TYPE VARCHAR(25),
P_SIZE TINYINT,
P_CONTAINER VARCHAR(10)
)
CREATE DIMENSION TABLE SUPPLIER(
S_SUPPKEY INTEGER PRIMARY KEY,
S_NAME VARCHAR(25),
S_ADDRESS VARCHAR(25),
S_CITY VARCHAR(10),
S_NATION VARCHAR(15),
S_REGION VARCHAR(12),
S_PHONE VARCHAR(15)
)
CREATE TABLE LINEORDER(
LO_ORDERKEY INTEGER,
LO_LINENUMBER TINYINT,
LO_CUSTKEY INTEGER REFERENCES CUSTOMER,
LO_PARTKEY INTEGER REFERENCES PART,
LO_SUPPKEY INTEGER REFERENCES SUPPLIER,
LO_ORDERDATE INTEGER REFERENCES DATES,
LO_ORDERPRIOTITY VARCHAR(15),
LO_SHIPPRIOTITY TINYINT,
LO_QUANTITY TINYINT,
LO_EXTENDEDPRICE INTEGER,
LO_ORDTOTALPRICE INTEGER,
LO_DISCOUNT TINYINT,
LO_REVENUE INTEGER,
LO_SUPPLYCOST INTEGER,
LO_TAX TINYINT,
LO_COMMITDATE INTEGER REFERENCES DATES,
LO_SHIPMODE VARCHAR(10)
)
已修改内容如下:
- PART.P_SIZE 从
INTEGER
改成了TINYINT
- CUSTOMER.C_ADDRESS 从
VARCHAR(40)
改成了VARCHAR(25)
(描述文档中的该字段最大长度是25) - LINEORDER 所有的BIGINT 类型结合 starrocks的建表语句和starschema文档改成了
INTEGER
和TINYINT
(是有合适的数据类型) - LINEORDER 新增外键关联(描述文档中,是有外键关联的,得加上)
- 外键关联只关联表不写具体的字段(数据库特性,单一主键,无需指定列名)
- NOT NULL和多余的空格都去了(简化 SQL 文本)
零优化之前,某硬件配置下(硬件资源很低,暂不公布)所有SQL的查询时间总计是2800s
网友评论