Official Calcite introduction:
https://www.slideshare.net/julianhyde/streaming-sql-63554778
1. Parse the SQL and generate the logical plan
Take a simple SQL query as an example:
create table ds1 (c1 BIGINT ,c2 VARCHAR )
insert into ds1(c1,c2) values(1,'a1')
insert into ds1(c1,c2) values(2,'a2')
select c1,c2 from ds1 where c1>1
The generated logical plan:
LogicalProject(C1=[$0], C2=[$1])
  LogicalFilter(condition=[>($0, 1)])
    EnumerableTableScan(table=[[DEFAULT_SCH, DS1]])
Key call flow:
Connection.parseQuery=>SqlNode
SqlToRelConverter.convertQuery(SqlNode)=>RelRoot(RelNode)
2. Optimize the logical plan into the corresponding physical plan
Key call flow:
Prepare.optimize(RelRoot)=>RelRoot
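Calcite's built-in implementation predefines a set of optimization Programs; see Programs.standard() and program.run().
The programs run two kinds of planner, HepPlanner and VolcanoPlanner. The former optimizes by matching and applying rules;
the latter optimizes by cost (cost-based optimization, CBO).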
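To make the difference between the two styles concrete, here is a minimal sketch in plain Java. It is a toy model, not Calcite code: the Node class, the mergeProjectFilter rule, and the cost numbers are all hypothetical and exist only for illustration. The rule-based part keeps applying a rewrite until the plan stops changing; the cost-based part picks the cheapest of several equivalent plans.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these classes are Calcite types.
final class PlannerSketch {

    // Hypothetical plan node with an operator name and at most one input.
    static final class Node {
        final String op;
        final Node input;

        Node(String op, Node input) {
            this.op = op;
            this.input = input;
        }

        @Override
        public String toString() {
            return input == null ? op : op + "(" + input + ")";
        }
    }

    // Hypothetical rewrite rule: a Project directly over a Filter becomes one Calc.
    static Node mergeProjectFilter(Node node) {
        if (node == null) {
            return null;
        }
        if (node.op.equals("Project") && node.input != null && node.input.op.equals("Filter")) {
            return new Node("Calc", mergeProjectFilter(node.input.input));
        }
        return new Node(node.op, mergeProjectFilter(node.input));
    }

    // Rule-based style: re-apply the rule until the plan no longer changes.
    static Node optimizeByRules(Node plan) {
        Node current = plan;
        while (true) {
            Node next = mergeProjectFilter(current);
            if (next.toString().equals(current.toString())) {
                return current;
            }
            current = next;
        }
    }

    // Cost-based style: among equivalent alternatives, keep the cheapest one.
    static String cheapest(Map<String, Double> costByPlan) {
        String best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> entry : costByPlan.entrySet()) {
            if (entry.getValue() < bestCost) {
                best = entry.getKey();
                bestCost = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Node logical = new Node("Project", new Node("Filter", new Node("Scan", null)));
        System.out.println(optimizeByRules(logical)); // prints Calc(Scan)

        // Made-up costs for two equivalent plans.
        Map<String, Double> alternatives = new LinkedHashMap<>();
        alternatives.put("Project(Filter(Scan))", 150.0);
        alternatives.put("Calc(Scan)", 120.0);
        System.out.println(cheapest(alternatives)); // prints Calc(Scan)
    }
}
```

The real planners are far richer (rule operands, trait sets, metadata-driven cost), so this only shows the shape of the two strategies.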
The resulting physical plan, with cost estimates:
EnumerableCalc(expr#0..1=[{inputs}], expr#2=[1], expr#3=[>($t0, $t2)], proj#0..1=[{exprs}], $condition=[$t3]): rowcount = 25.0, cumulative cost = {125.0 rows, 801.0 cpu, 0.0 io}, id = 61
  EnumerableTableScan(table=[[DEFAULT_SCH, DS1]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 36
Tip: use RelOptUtil.toString(rel) to inspect the details of a RelNode tree, for example:
RelOptUtil.toString(rel, SqlExplainLevel.NO_ATTRIBUTES)
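The printed plans are trees in which each input is indented under the operator that consumes it. The sketch below reproduces that layout in plain Java to help read such output. It is not how RelOptUtil is implemented; the Node class and the explain method are hypothetical names used only here.

```java
// Toy model only: Node and explain are not Calcite APIs.
final class ExplainSketch {

    // Hypothetical plan node: a one-line description plus zero or more inputs.
    static final class Node {
        final String description;
        final Node[] inputs;

        Node(String description, Node... inputs) {
            this.description = description;
            this.inputs = inputs;
        }
    }

    // Render the tree top-down, indenting each level by two spaces.
    static String explain(Node root) {
        StringBuilder out = new StringBuilder();
        append(out, root, 0);
        return out.toString();
    }

    private static void append(StringBuilder out, Node node, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.description).append('\n');
        for (Node input : node.inputs) {
            append(out, input, depth + 1);
        }
    }

    public static void main(String[] args) {
        Node plan = new Node("LogicalProject",
                new Node("LogicalFilter",
                        new Node("EnumerableTableScan")));
        System.out.print(explain(plan));
    }
}
```

Reading from the bottom up gives the data flow: the scan feeds the filter, which feeds the projection.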
3. Generate executable code from the physical plan
Key call flow:
EnumerableInterpretable.toBindable(rel)
The generated execution code is built on linq4j, another project by Calcite's author.
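Conceptually, the code generated for the example query is a pipeline of scan, filter, and projection over rows. The sketch below shows that shape using java.util.stream as a stand-in for linq4j, so it is an assumption-laden illustration and not the code Calcite actually emits. The ExecSketch class and its scan and run methods are hypothetical; the two rows are the ones inserted into ds1 above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Toy model only: java.util.stream stands in for linq4j here.
final class ExecSketch {

    // Stand-in for the table scan: the two rows inserted into ds1.
    static List<Object[]> scan() {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1L, "a1"});
        rows.add(new Object[] {2L, "a2"});
        return rows;
    }

    // select c1, c2 from ds1 where c1 > 1
    static List<Object[]> run() {
        return scan().stream()
                .filter(row -> (Long) row[0] > 1L)          // where c1 > 1
                .map(row -> new Object[] {row[0], row[1]})  // select c1, c2
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Object[] row : run()) {
            System.out.println(Arrays.toString(row)); // prints [2, a2]
        }
    }
}
```

The filter and map steps together correspond to the single EnumerableCalc node in the physical plan, which evaluates the condition and the projection in one operator.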