PostgreSQL 源码解读（25）- 查询语句#10（查询优化概览）-白红宇

PostgreSQL 源码解读（25）- 查询语句#10（查询优化概览）

阅读量：2498 次

发布时间：2019-05-11

本文共 5478 字，大约阅读时间需要 18 分钟。

本节简单介绍了PG执行查询语句中优化器部分（Optimizer）的相关函数和数据结构总体说明。查询优化包括查询逻辑优化和查询物理优化，查询逻辑优化是指使用关系代数中的等价规则，通过选择下推、投影下推、连接交换等方法对SQL语句进行优化;查询物理优化是指通过CBO对各种物理访问数据的方法进行评估，得出最优的执行计划。

一、总体说明

下面是PG源码目录(/src/backend/optimizer)中的README文件对优化器相关函数和数据结构的总体说明:

Optimizer Functions-------------------The primary entry point is planner().planner()//优化器主入口函数set up for recursive handling of subqueries//为子查询配置处理器(递归方式)-subquery_planner()//调用(子)查询优化函数 pull up sublinks and subqueries from rangetable, if possible//可以的话,上拉子链接和子查询 canonicalize qual//表达式规范化     Attempt to simplify WHERE clause to the most useful form; this includes     flattening nested AND/ORs and detecting clauses that are duplicated in     different branches of an OR.//简化WHERE语句 simplify constant expressions//简化常量表达式 process sublinks//处理子链接 convert Vars of outer query levels into Params//转换外查询的Vars变量到Params中--grouping_planner()//  preprocess target list for non-SELECT queries//预处理非SELECT语句的投影列  handle UNION/INTERSECT/EXCEPT, GROUP BY, HAVING, aggregates,//处理集合操作/聚集函数/排序等    ORDER BY, DISTINCT, LIMIT--query_planner()//   make list of base relations used in query//构造查询中的基表链表   split up the qual into restrictions (a=1) and joins (b=c)//拆分表达式为限制条件和连接   find qual clauses that enable merge and hash joins//查找可以让Merge和Hash连接生效的表达式----make_one_rel()//     set_base_rel_pathlists()//设置基表路径链表      find seqscan and all index paths for each base relation//遍历每个基表,寻找顺序扫描和所有可能的索引扫描路径      find selectivity of columns used in joins//查找连接中使用的列的选择性     make_rel_from_joinlist()//通过join链表构造Relation      hand off join subproblems to a plugin, GEQO, or standard_join_search()//-----standard_join_search()//标准的连接搜索函数      call join_search_one_level() for each level of join tree needed//每一个join tree调用join_search_one_level      join_search_one_level():        For each joinrel of the prior level, do make_rels_by_clause_joins()//对于上一层的每一个joinrel,执行make_rels_by_clause_joins        if it has join clauses, or make_rels_by_clauseless_joins() if not.        Also generate "bushy plan" joins between joinrels of lower levels.      Back at standard_join_search(), generate gather paths if needed for//回到standard_join_search函数,需要的话,收集相关的路径并应用set_cheapest函数获取代价最小的路径      each newly constructed joinrel, then apply set_cheapest() to extract      the cheapest path for it.      Loop back if this was not the top join level.//如果不是最顶层连接,循环  Back at grouping_planner://回到grouping_planner函数  do grouping (GROUP BY) and aggregation//处理分组和聚集  do window functions//处理窗口函数  make unique (DISTINCT)//处理唯一性  do sorting (ORDER BY)//处理排序  do limit (LIMIT/OFFSET)//处理LimitBack at planner()://回到planner函数convert finished Path tree into a Plan tree//转换最终的路径树到计划树do final cleanup after planning//收尾工作Optimizer Data Structures-------------------------PlannerGlobal   - global information for a single planner invocation//全局优化信息PlannerInfo     - information for planning a particular Query (we make//某个Planner的优化信息                  a separate PlannerInfo node for each sub-Query)RelOptInfo      - a relation or joined relations//某个Relation(包括连接)的优化信息 RestrictInfo   - WHERE clauses, like "x = 3" or "y = z"//限制条件                  (note the same structure is used for restriction and                   join clauses) Path           - every way to generate a RelOptInfo(sequential,index,joins)//构造该关系(注意:中间结果也是关系的一种)的路径  SeqScan       - represents a sequential scan plan  IndexPath     - index scan  BitmapHeapPath - top of a bitmapped index scan  TidPath       - scan by CTID  SubqueryScanPath - scan a subquery-in-FROM  ForeignPath   - scan a foreign table, foreign join or foreign upper-relation  CustomPath    - for custom scan providers  AppendPath    - append multiple subpaths together  MergeAppendPath - merge multiple subpaths, preserving their common sort order  ResultPath    - a childless Result plan node (used for FROM-less SELECT)  MaterialPath  - a Material plan node  UniquePath    - remove duplicate rows (either by hashing or sorting)  GatherPath    - collect the results of parallel workers  GatherMergePath - collect parallel results, preserving their common sort order  ProjectionPath - a Result plan node with child (used for projection)  ProjectSetPath - a ProjectSet plan node applied to some sub-path  SortPath      - a Sort plan node applied to some sub-path  GroupPath     - a Group plan node applied to some sub-path  UpperUniquePath - a Unique plan node applied to some sub-path  AggPath       - an Agg plan node applied to some sub-path  GroupingSetsPath - an Agg plan node used to implement GROUPING SETS  MinMaxAggPath - a Result plan node with subplans performing MIN/MAX  WindowAggPath - a WindowAgg plan node applied to some sub-path  SetOpPath     - a SetOp plan node applied to some sub-path  RecursiveUnionPath - a RecursiveUnion plan node applied to two sub-paths  LockRowsPath  - a LockRows plan node applied to some sub-path  ModifyTablePath - a ModifyTable plan node applied to some sub-path(s)  LimitPath     - a Limit plan node applied to some sub-path  NestPath      - nested-loop joins  MergePath     - merge joins  HashPath      - hash joins EquivalenceClass - a data structure representing a set of values known equal//等价类 PathKey        - a data structure representing the sort ordering of a path//排序键

下一节开始将根据总体说明中的函数逐个进行分析解读.

二、小结

1、优化器函数总览：大体介绍了优化器函数的调用过程等信息；

2、数据结构：优化器相关的数据结构，如PlannerInfo等。

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/6906/viewspace-2374891/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/6906/viewspace-2374891/

你可能感兴趣的文章