PostgreSQL Source Code Analysis (25) - Query Statements #10 (Query Opt

Author: EthanHe | Published: 2018-08-22 19:51

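    This section gives a brief overview of the functions and data structures of the optimizer, the part of PG that plans a query statement before it is executed. Query optimization consists of logical optimization and physical optimization. Logical optimization rewrites the SQL statement using the equivalence rules of relational algebra, with techniques such as selection pushdown, projection pushdown, and join commutation. Physical optimization uses the cost-based optimizer (CBO) to evaluate the available physical data access methods and choose the execution plan with the lowest estimated cost.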
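To make the idea of logical optimization concrete, the sketch below applies selection pushdown to two toy in-memory relations. It is an illustration in Python, not PostgreSQL code: the tables, the column names, and the helper `nested_loop_join` are all made up for this example. It only demonstrates that filtering before the join returns the same rows while comparing fewer row pairs.

```python
# Toy relations as lists of dicts (hypothetical data, not PostgreSQL structures).
employees = [
    {"emp_id": 1, "dept_id": 10, "salary": 5000},
    {"emp_id": 2, "dept_id": 20, "salary": 7000},
    {"emp_id": 3, "dept_id": 10, "salary": 9000},
]
departments = [
    {"dept_id": 10, "name": "dev"},
    {"dept_id": 20, "name": "ops"},
]


def nested_loop_join(outer, inner, key):
    """Equi-join two relations on `key`, counting the row pairs compared."""
    rows, comparisons = [], 0
    for o in outer:
        for i in inner:
            comparisons += 1
            if o[key] == i[key]:
                rows.append({**o, **i})
    return rows, comparisons


def filter_after_join():
    """Join first, then apply the selection salary > 6000."""
    joined, comparisons = nested_loop_join(employees, departments, "dept_id")
    return [r for r in joined if r["salary"] > 6000], comparisons


def filter_before_join():
    """Push the selection below the join: filter employees first."""
    well_paid = [e for e in employees if e["salary"] > 6000]
    return nested_loop_join(well_paid, departments, "dept_id")


late_rows, late_cost = filter_after_join()
early_rows, early_cost = filter_before_join()
print(late_rows == early_rows, late_cost, early_cost)
```

With this data the late filter compares 6 row pairs and the early filter compares 4, and both return the same two rows.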

    I. Overview

    The following overview of the optimizer's functions and data structures is taken from the README file in the PG source directory (/src/backend/optimizer):

    Optimizer Functions
    -------------------
    
    The primary entry point is planner().
    
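    planner()//main entry point of the optimizer
    set up for recursive handling of subqueries//prepare to process subqueries recursively
    -subquery_planner()//plans a query or subquery
     pull up sublinks and subqueries from rangetable, if possible//pull up sublinks and subqueries where possible
     canonicalize qual//normalize the qualification expressions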
         Attempt to simplify WHERE clause to the most useful form; this includes
         flattening nested AND/ORs and detecting clauses that are duplicated in
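         different branches of an OR.//simplify the WHERE clause
     simplify constant expressions//fold constant expressions
     process sublinks//handle sublinks (subqueries that appear inside expressions)
     convert Vars of outer query levels into Params//replace Vars that reference outer query levels with Params
    --grouping_planner()
      preprocess target list for non-SELECT queries//preprocess the target list (output columns) of non-SELECT statements
      handle UNION/INTERSECT/EXCEPT, GROUP BY, HAVING, aggregates,//handle set operations, aggregates, sorting, etc.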
        ORDER BY, DISTINCT, LIMIT
    --query_planner()
       make list of base relations used in query//build the list of base relations in the query
       split up the qual into restrictions (a=1) and joins (b=c)//split the qual into restriction clauses and join clauses
       find qual clauses that enable merge and hash joins//find the clauses usable for merge joins and hash joins
    ----make_one_rel()
         set_base_rel_pathlists()//build the path list of each base relation
          find seqscan and all index paths for each base relation//for each base relation, find the sequential scan path and every usable index scan path
          find selectivity of columns used in joins//estimate the selectivity of the columns used in joins
         make_rel_from_joinlist()//build the relation from the join list
          hand off join subproblems to a plugin, GEQO, or standard_join_search()
    -----standard_join_search()//the standard join search routine
          call join_search_one_level() for each level of join tree needed//call join_search_one_level once per level of the join tree
          join_search_one_level():
            For each joinrel of the prior level, do make_rels_by_clause_joins()//run for each joinrel of the previous level
            if it has join clauses, or make_rels_by_clauseless_joins() if not.
            Also generate "bushy plan" joins between joinrels of lower levels.
          Back at standard_join_search(), generate gather paths if needed for//back in standard_join_search: add Gather paths (parallel query) where needed, then call set_cheapest to pick the cheapest path
          each newly constructed joinrel, then apply set_cheapest() to extract
          the cheapest path for it.
          Loop back if this was not the top join level.//repeat until the top join level is reached
      Back at grouping_planner://back in grouping_planner
      do grouping (GROUP BY) and aggregation//grouping and aggregation
      do window functions//window functions
      make unique (DISTINCT)//duplicate elimination
      do sorting (ORDER BY)//sorting
      do limit (LIMIT/OFFSET)//row limiting
    Back at planner()://back in planner
    convert finished Path tree into a Plan tree//convert the final Path tree into a Plan tree
    do final cleanup after planning//final cleanup
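The level-by-level search described above for standard_join_search() can be sketched as a small dynamic program. This is a toy model in Python under loudly stated assumptions: the name `toy_join_search`, the fixed join selectivity, and the nested-loop style cost formula are invented for illustration and are not PostgreSQL's cost model. GEQO, plugins, join-clause checks, and Gather paths are left out, and keeping only the cheapest entry per set of relations stands in for set_cheapest().

```python
def toy_join_search(base_rows, selectivity=0.1):
    """Build joinrels level by level, keeping the cheapest plan per relation set.

    base_rows maps a relation name to its row count.
    Returns {level: {frozenset(names): (cost, rows, plan)}}.
    The cost formula is a hypothetical stand-in, not PostgreSQL's.
    """
    # Level 1: one entry per base relation; scanning costs one unit per row.
    levels = {
        1: {frozenset([name]): (float(rows), float(rows), name)
            for name, rows in base_rows.items()}
    }
    for level in range(2, len(base_rows) + 1):
        current = {}
        # Combine joinrels of level k with joinrels of level (level - k).
        # k == 1 adds one base relation; k >= 2 yields "bushy" joins.
        for k in range(1, level // 2 + 1):
            for left_set, (l_cost, l_rows, l_plan) in levels[k].items():
                for right_set, (r_cost, r_rows, r_plan) in levels[level - k].items():
                    if left_set & right_set:
                        continue  # the two sides must not share a relation
                    rows = l_rows * r_rows * selectivity
                    cost = l_cost + r_cost + l_rows * r_rows  # assumed formula
                    key = left_set | right_set
                    # Keep only the cheapest way to build this joinrel.
                    if key not in current or cost < current[key][0]:
                        current[key] = (cost, rows, (l_plan, r_plan))
        levels[level] = current
    return levels


levels = toy_join_search({"a": 1000, "b": 10, "c": 100})
top_cost, top_rows, top_plan = levels[3][frozenset("abc")]
print(top_plan, top_cost)
```

In this run the cheapest top-level plan joins the two small relations `b` and `c` first and only then brings in `a`, because under the assumed cost formula the size of the intermediate result dominates the total.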
    
    
    Optimizer Data Structures
    -------------------------
    
    PlannerGlobal   - global information for a single planner invocation//global planning state
    
    PlannerInfo     - information for planning a particular Query (we make//planning state for one Query or sub-Query
                      a separate PlannerInfo node for each sub-Query)
    
    RelOptInfo      - a relation or joined relations//planning information for one relation, base or joined
    
     RestrictInfo   - WHERE clauses, like "x = 3" or "y = z"//restriction and join clauses
                      (note the same structure is used for restriction and
                       join clauses)
    
     Path           - every way to generate a RelOptInfo(sequential,index,joins)//a way to produce the relation (note: an intermediate join result is also a relation)
      SeqScan       - represents a sequential scan plan
      IndexPath     - index scan
      BitmapHeapPath - top of a bitmapped index scan
      TidPath       - scan by CTID
      SubqueryScanPath - scan a subquery-in-FROM
      ForeignPath   - scan a foreign table, foreign join or foreign upper-relation
      CustomPath    - for custom scan providers
      AppendPath    - append multiple subpaths together
      MergeAppendPath - merge multiple subpaths, preserving their common sort order
      ResultPath    - a childless Result plan node (used for FROM-less SELECT)
      MaterialPath  - a Material plan node
      UniquePath    - remove duplicate rows (either by hashing or sorting)
      GatherPath    - collect the results of parallel workers
      GatherMergePath - collect parallel results, preserving their common sort order
      ProjectionPath - a Result plan node with child (used for projection)
      ProjectSetPath - a ProjectSet plan node applied to some sub-path
      SortPath      - a Sort plan node applied to some sub-path
      GroupPath     - a Group plan node applied to some sub-path
      UpperUniquePath - a Unique plan node applied to some sub-path
      AggPath       - an Agg plan node applied to some sub-path
      GroupingSetsPath - an Agg plan node used to implement GROUPING SETS
      MinMaxAggPath - a Result plan node with subplans performing MIN/MAX
      WindowAggPath - a WindowAgg plan node applied to some sub-path
      SetOpPath     - a SetOp plan node applied to some sub-path
      RecursiveUnionPath - a RecursiveUnion plan node applied to two sub-paths
      LockRowsPath  - a LockRows plan node applied to some sub-path
      ModifyTablePath - a ModifyTable plan node applied to some sub-path(s)
      LimitPath     - a Limit plan node applied to some sub-path
      NestPath      - nested-loop joins
      MergePath     - merge joins
      HashPath      - hash joins
    
     EquivalenceClass - a data structure representing a set of values known equal//equivalence class
    
     PathKey        - a data structure representing the sort ordering of a path//sort key
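How a RelOptInfo collects candidate Paths and ends up with a cheapest one can be illustrated with a much simplified model. The classes `ToyPath` and `ToyRelOptInfo` and the functions `toy_add_path` and `toy_set_cheapest` below are hypothetical Python stand-ins, not the PostgreSQL structs or functions. The dominance rule (no more expensive, and at least as well sorted) is a deliberate simplification that ignores startup cost, parameterization, row counts, and parallel safety.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyPath:
    kind: str                        # e.g. "SeqScan", "IndexPath"
    total_cost: float
    pathkeys: Tuple[str, ...] = ()   # sort order delivered; () means unsorted


@dataclass
class ToyRelOptInfo:
    name: str
    pathlist: List[ToyPath] = field(default_factory=list)
    cheapest_total_path: Optional[ToyPath] = None


def dominates(a: ToyPath, b: ToyPath) -> bool:
    """a is no more expensive than b and b's sort order is a prefix of a's."""
    return (a.total_cost <= b.total_cost
            and a.pathkeys[:len(b.pathkeys)] == b.pathkeys)


def toy_add_path(rel: ToyRelOptInfo, new_path: ToyPath) -> None:
    """Add new_path unless an existing path dominates it; evict paths it dominates."""
    if any(dominates(old, new_path) for old in rel.pathlist):
        return
    rel.pathlist = [p for p in rel.pathlist if not dominates(new_path, p)]
    rel.pathlist.append(new_path)


def toy_set_cheapest(rel: ToyRelOptInfo) -> None:
    """Record the surviving path with the lowest total cost."""
    rel.cheapest_total_path = min(rel.pathlist, key=lambda p: p.total_cost)


rel = ToyRelOptInfo("orders")
toy_add_path(rel, ToyPath("SeqScan", 100.0))
toy_add_path(rel, ToyPath("IndexPath", 150.0, ("order_id",)))  # dearer but sorted: kept
toy_add_path(rel, ToyPath("BitmapHeapPath", 90.0))             # cheaper, unsorted: evicts SeqScan
toy_set_cheapest(rel)
print([p.kind for p in rel.pathlist], rel.cheapest_total_path.kind)
```

The more expensive index path survives because it delivers a sort order that a later step such as ORDER BY or a merge join might use. This is the role PathKey plays in the list above.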
    

    Starting with the next section, the functions listed in this overview will be analyzed one by one.

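    II. Summary

    1. Overview of the optimizer functions: an outline of how the optimizer's functions call one another;
    2. Data structures: the structures the optimizer works with, such as PlannerInfo.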


        Article link: https://www.haomeiwen.com/subject/hhmeiftx.html