PostgreSQL索引(4) – B-tree

2020-10-01 孟庆钟

本文翻译自Egor Rogov的英文博客，且已征得Egor Rogov的同意。译者做了少量修改。

审校：纪昀红，中国人民大学信息学院在读博士生。

我们已经介绍了PG的通用索引引擎、接口、Hash索引。本文将介绍B-tree，它是最传统、使用最广泛的索引类型。

1. 结构

在PG中，B-tree索引被实现为B-tree AM，它适合可以被排序的数据。也就是说，必须给相应的数据类型定义下面5个操作符：大于、大于等于、等于、小于等于、小于。

B-tree索引的行数据被组织成页面。叶子页面中的行数据，存放的是<key, TID>键值对。非叶子页面中存放的是<minvalue, childptr>键值对，childptr指向本页面的孩子页面（可以是叶子页面，也可是其它非叶子页面），minvalue是子页面中的最小值。

下文也将页面称为节点。

B-tree具有几个重要的特点：

它是一颗平衡树。根节点到所有叶子节点的距离都相等，即从根节点到任何叶子节点所经过的非叶节点数量都相等。因此，查找任意的值，需要的时间都相同。
它是一颗多叉树。每个节点（通常为8KB的页面）都包含大量（几百个）TID。因此B-tree的高度通常比较小，对一张很大的表来说，高度实际只有4-5层。
索引的行数据在索引中以非降序的顺序排列（无论页面内部，还是页面之间），同层的页面之间使用一个双向链表链接起来。因此，我们可以通过直接遍历链表得到有序的数据，而不必每次都访问根页面。

下面是一个索引的示例，这是一个整数列上的索引。

索引文件中的第0页是元页面，它指向根页面。非叶节点在根节点之下，叶节点在最底层。图中向下的箭头表示叶节点中指向基表的TID。

1.1 等值查询

下面介绍使用indexed-field = expression作为条件搜索B-tree的过程。假设被查找的key是49。

查询从根节点开始，然后需要确定向下查找哪个子节点。由根节点中的key可知，应该查询第2个子节点，因为32 ≤ 49 < 64。使用类似的方式，可以找到叶节点，进而找到所需的TID。

实际上，一些特殊情况使这个过程变得很复杂。例如，索引中可能有很多重复值，并且这些重复值不能被容纳到同一个页面中。为了不漏掉某些值，我们必须下降到严格小于49的key所指的孩子节点。即，从小于49的key中，选取最大的一个，下降到这个key指向的孩子节点，遍历这个孩子节点。

查找过程中，索引页面可能会发生分裂，这也使查找过程变得复杂。所有的算法都致力于在这种场景下提高并发。本文不讨论这些内容。

1.2 非等值查询

当使用indexed-field ≤ expression或indexed-field ≥ expression作为条件查询时，我们先找到一个边界值，使它满足indexed-field = expression，然后按照合适的顺序遍历叶子页面即可。

下面的示例描述了查找n ≤ 35的过程：

大与和小于的查找过程和上面的过程类似，只需要丢弃边界值即可。

1.3 范围查询

当使用expression1 ≤ indexed-field ≤ expression2作为条件查询时，我们可以先找到expression1，然后按顺序遍历叶子节点，直到indexed-field ≤ expression2不再满足。同理，也可以先找到expression2，然后反向扫描。

下面的示例展示了使用23 ≤ n ≤ 64作为条件的查找过程：

2. 示例

我们看一下查询计划的模样。本文仍然使用上一篇提到的demo数据库，这次给aircraft表建立索引。这个表中只有9行数据，优化器默认不选索引扫描，因为所有的数据都在一个页面中，使用顺序扫描效率最高。但是这张表非常适合用来演示。

select * from aircrafts;

 aircraft_code |        model        | range 
---------------+---------------------+-------
 773           | Boeing 777-300      | 11100
 763           | Boeing 767-300      |  7900
 SU9           | Sukhoi SuperJet-100 |  3000
 320           | Airbus A320-200     |  5700
 321           | Airbus A321-200     |  5600
 319           | Airbus A319-100     |  6700
 733           | Boeing 737-300      |  4200
 CN1           | Cessna 208 Caravan  |  1200
 CR2           | Bombardier CRJ-200  |  2700
(9 rows)

create index on aircrafts(range);
set enable_seqscan = off;

也可以显式指定索引类型：

create index on aircrafts using btree(range);

等值查询：

explain(costs off) select * from aircrafts where range = 3000;

                    QUERY PLAN                     
---------------------------------------------------
 Index Scan using aircrafts_range_idx on aircrafts
   Index Cond: (range = 3000)
(2 rows)

非等值查询：

explain(costs off) select * from aircrafts where range < 3000;

                     QUERY PLAN                    
---------------------------------------------------
 Index Scan using aircrafts_range_idx on aircrafts
   Index Cond: (range < 3000) 
(2 rows)

范围查询：

explain(costs off) select * from aircrafts
where range between 3000 and 5000;

                     QUERY PLAN                      
-----------------------------------------------------
 Index Scan using aircrafts_range_idx on aircrafts
   Index Cond: ((range >= 3000) AND (range <= 5000))
(2 rows)

3. 排序

使用B-tree索引进行Index Scan、Index Only Scan时，可以按顺序返回数据。所以，如果表在排序条件上有索引，优化器有两种选择：1)直接扫描索引，按顺序返回结果；2)顺序扫描，然后对结果进行排序。

3.1 索引列值排序的顺序

在创建索引时，我们可以显式指定排序的顺序，比如：

create index on aircrafts(range desc);

这种情况下，较大的值在树的左边，较小的值在右边。

上文提到，B-tree可以支持正向扫描和反向扫描，显式指定排序方式有什么用呢？

这对多列索引可能有用。

我们首先建立一个视图，它将飞机按续航里程划分为短程、中程、远程三种类型。下文的查询将使用这个视图，以便简化SQL语句。

create view aircrafts_v as
select model,
       case
           when range < 4000 then 1
           when range < 10000 then 2
           else 3
       end as class
from aircrafts;

select * from aircrafts_v;

        model        | class
---------------------+-------
 Boeing 777-300      |     3
 Boeing 767-300      |     2
 Sukhoi SuperJet-100 |     1
 Airbus A320-200     |     2
 Airbus A321-200     |     2
 Airbus A319-100     |     2
 Boeing 737-300      |     2
 Cessna 208 Caravan  |     1
 Bombardier CRJ-200  |     1
(9 rows)

因为视图中使用了表达式，因此我们建立一个表达式索引：

create index on aircrafts(
  (case when range < 4000 then 1
        when range < 10000 then 2
        else 3 end),
  model);

下面对视图中的两列按递增顺序排序，可以看出，这个查询可以使用索引：

select class, model from aircrafts_v order by class, model;

 class |        model        
-------+---------------------
     1 | Bombardier CRJ-200
     1 | Cessna 208 Caravan
     1 | Sukhoi SuperJet-100
     2 | Airbus A319-100
     2 | Airbus A320-200
     2 | Airbus A321-200
     2 | Boeing 737-300
     2 | Boeing 767-300
     3 | Boeing 777-300
(9 rows)

explain(costs off)
select class, model from aircrafts_v order by class, model;

                       QUERY PLAN                       
--------------------------------------------------------
 Index Scan using aircrafts_case_model_idx on aircrafts
(1 row)

同理，对这两列按递减顺序排序，也可以使用索引：

select class, model from aircrafts_v order by class desc, model desc;

 class |        model        
-------+---------------------
     3 | Boeing 777-300
     2 | Boeing 767-300
     2 | Boeing 737-300
     2 | Airbus A321-200
     2 | Airbus A320-200
     2 | Airbus A319-100
     1 | Sukhoi SuperJet-100
     1 | Cessna 208 Caravan
     1 | Bombardier CRJ-200
(9 rows)

explain(costs off)
select class, model from aircrafts_v order by class desc, model desc;

                           QUERY PLAN                            
-----------------------------------------------------------------
 Index Scan BACKWARD using aircrafts_case_model_idx on aircrafts
(1 row)

但是，如果对一列按递增顺序排序，对另一列按递减顺序排序，则无法使用这个索引：

explain(costs off)
select class, model from aircrafts_v order by class ASC, model DESC;

                   QUERY PLAN                    
-------------------------------------------------
 Sort
   Sort Key: (CASE ... END), aircrafts.model DESC
   ->  Seq Scan on aircrafts
(3 rows)

可以看出，虽然禁用了顺序扫描，但是优化器还是选择了顺序扫描。这是因为，所谓的禁用顺序扫描，PG内部仅仅将顺序扫描的代价调高，使得优化器尽量不选顺序扫描，而不是不生成顺序扫描的计划。当没有更优的计划时，即使禁用顺序扫描，优化器仍然会选择顺序扫描。

为了使用索引，必须让索引列按照指定的顺序排序：

create index aircrafts_case_asc_model_desc_idx on aircrafts(
 (case
    when range < 4000 then 1
    when range < 10000 then 2
    else 3
  end) ASC,
  model DESC);

explain(costs off)
select class, model from aircrafts_v order by class ASC, model DESC;

                           QUERY PLAN                            
-----------------------------------------------------------------
 Index Scan using aircrafts_case_asc_model_desc_idx on aircrafts
(1 row)

3.2 索引定义中列的顺序

多列索引中的值根据下面的方式排序：先比较第一个字段；第一个字段相同时，再比较第二个字段；以此类推。就像对字符串比较大小一样。

在上面的索引中，使用class = 3（仅比较第一个字段）或class = 3 and model = ‘Boeing 777-300’查询索引都会非常高效。

但是当仅使用model = ‘Boeing 777-300’一个条件查询时就会非常低效，优化器在这种情况下，也不会选择索引扫描。因为第一个索引列没有被使用。这并不难理解，在此不再赘述。

如果想让model = ‘Boeing 777-300’条件使用索引，必须在建立索引时调换索引列的顺序：

create index on aircrafts(
  model,
  (case when range < 4000 then 1 when range < 10000 then 2 else 3 end));

这时，使用model = ‘Boeing 777-300’查询时，效率就会很高。但是仅使用class = 3一个条件查询时效率就低了。

3.3 NULL值

B-tree索引会索引NULL值，并且支持IS NULL和IS NOT NULL作为条件的查询。

例如：

create index on flights(actual_arrival);
explain(costs off) select * from flights where actual_arrival is null;

                      QUERY PLAN                       
-------------------------------------------------------
 Bitmap Heap Scan on flights
   Recheck Cond: (actual_arrival IS NULL)
   ->  Bitmap Index Scan on flights_actual_arrival_idx
         Index Cond: (actual_arrival IS NULL)
(4 rows)

所有的NULL值都在叶子节点的最左端或最右端，这取决于索引是如何创建的（NULLS FIRST还是NULLS LAST）。当查询包含排序时，这很重要：如果查询指定的NULL值的排序方式与索引中NULL的存放方式（NULLS FIRST还是NULLS LAST）一致，则可以使用这个索引。如下所示：

explain(costs off)
select * from flights order by actual_arrival NULLS LAST;

                       QUERY PLAN                      
--------------------------------------------------------
 Index Scan using flights_actual_arrival_idx on flights
(1 row)

否则，无法使用索引，如下所示：

explain(costs off)
select * from flights order by actual_arrival NULLS FIRST;

               QUERY PLAN              
----------------------------------------
 Sort
   Sort Key: actual_arrival NULLS FIRST
   ->  Seq Scan on flights
(3 rows)

如果想使用索引，必须在建立索引时，将NULL值放在索引的开头：

create index flights_nulls_first_idx on flights(actual_arrival NULLS FIRST);
explain(costs off)
select * from flights order by actual_arrival NULLS FIRST;

                     QUERY PLAN                      
-----------------------------------------------------
 Index Scan using flights_nulls_first_idx on flights
(1 row)

这一切都是由于NULL值不能参与排序导致的，也就是说，NULL与任何值比较，结果都是未定义的：

\pset null NULL
select null < 42;

 ?column?
----------
 NULL
(1 row)

这违背了B-tree的概念，也不符合通用的规则。但是，NULL值在数据库中扮演着一个非常重要的角色，因此我们经常对它特殊对待。

因为NULL会被索引，所以，表中所有的行在索引中都存在。因此，即使没有任何条件，也可以直接扫描索引。当查询所需的顺序（order by）与索引存放数据的顺序相同时，优化器可能会直接使用索引扫描，而避免额外的排序过程。

4. 属性

我们看一下B-tree索引的属性，查询语句见第二篇文章。

 amname |     name      | pg_indexam_has_property
--------+---------------+-------------------------
 B-tree  | can_order     | t
 B-tree  | can_unique    | t
 B-tree  | can_multi_col | t
 B-tree  | can_exclude   | t

可以看出，B-tree索引支持排序，并且支持唯一性。它是唯一提供这两个属性的AM。它也支持多列索引（还有其它AM支持多列索引，但不是所有AM都支持）。我们下次再讨论排它约束。

     name      | pg_index_has_property
---------------+-----------------------
 clusterable   | t
 index_scan    | t
 bitmap_scan   | t
 backward_scan | t

B-tree索引既支持索引扫描，也支持位图扫描，而且可以支持正向和反向扫描。

        name        | pg_index_column_has_property 
--------------------+------------------------------
 asc                | t
 desc               | f
 nulls_first        | f
 nulls_last         | t
 orderable          | t
 distance_orderable | f
 returnable         | t
 search_array       | t
 search_nulls       | t

前4个属性与某列的排序方式有关。本例中，值以升序（非递减）排序，NULL值在最后。也可以采用其它组合。

search_array属性表示它支持下面这种表达式（注意，数组直接被放在索引扫描的扫描条件中）：

explain(costs off)
select * from aircrafts where aircraft_code in ('733','763','773');

                           QUERY PLAN                            
-----------------------------------------------------------------
 Index Scan using aircrafts_pkey on aircrafts
   Index Cond: (aircraft_code = ANY ('{733,763,773}'::bpchar[]))
(2 rows)

returnable属性表示它可以被用作Index Only Scan。因为它直接存储了列的值（不像Hash索引一样，存Hash值），而且它还存了NULL值（表中每行数据在索引中都存在）。下面再详细讨论一下covering index的Index Only Scan。

带附属列的唯一索引

像我们之前讨论的一样，一个covering索引存储了查询所需的所有列值，几乎无需扫描基表。一个唯一索引，也可能是一个covering索引。

假设，某个查询需要给一个唯一索引加一列之后，这个索引才能成为这个查询的coving索引。但是如果直接加一列，就破坏了索引在之前列上的唯一性。这时，就需要两个索引，一个索引用来保证唯一性，另外一个索引用来当做covering索引。这当然非常低效。

PG 11支持了一个新特性，索引的INCLUDE特性，它允许在唯一索引上加一些附属的列，这些列不参与唯一性判断，仅仅被Index only scan使用。

示例如下：

\d bookings

              Table "bookings.bookings"
    Column    |           Type           | Modifiers
--------------+--------------------------+-----------
 book_ref     | character(6)             | not null
 book_date    | timestamp with time zone | not null
 total_amount | numeric(10,2)            | not null
Indexes:
    "bookings_pkey" PRIMARY KEY, btree (book_ref)
Referenced by:
    TABLE "tickets" CONSTRAINT "tickets_book_ref_fkey" FOREIGN KEY (book_ref) REFERENCES bookings(book_ref)

这张表的主键（book_ref, booking code）通过一个常规B-tree实现。我们再创建一个带附属列的唯一索引：

create unique index bookings_pkey2 on bookings(book_ref) INCLUDE (book_date);

然后用新索引代替旧索引（把这些操作放在一个事务中，使得它们同时生效）：

begin;
alter table bookings drop constraint bookings_pkey cascade;
alter table bookings add primary key using index bookings_pkey2;
alter table tickets add foreign key (book_ref) references bookings (book_ref);
 commit;

结果如下：

\d bookings

              Table "bookings.bookings"
    Column    |           Type           | Modifiers
--------------+--------------------------+-----------
 book_ref     | character(6)             | not null
 book_date    | timestamp with time zone | not null
 total_amount | numeric(10,2)            | not null
Indexes:
    "bookings_pkey2" PRIMARY KEY, btree (book_ref) INCLUDE (book_date)
Referenced by:
    TABLE "tickets" CONSTRAINT "tickets_book_ref_fkey" FOREIGN KEY (book_ref) REFERENCES bookings(book_ref)

这样，这个唯一索引就可以成为下面查询的covering索引了：

demo=# explain(costs off)
select book_ref, book_date from bookings where book_ref = '059FC4';
                    QUERY PLAN                    
--------------------------------------------------
 Index Only Scan using bookings_pkey2 on bookings
   Index Cond: (book_ref = '059FC4'::bpchar)
(2 rows)

5. 索引的创建

我们知道，对特别大的表来说，最好在建立索引之前，先把数据导入，然后再创建索引。这样，创建索引的速度更快，索引占用的空间更少。

这是因为它使用了一种比单行插入索引更高效的方式。简单来说，就是先把所有数据排序，然后把所有的数据填充到叶子节点，然后再依次建立上层节点，直到根节点。

这个过程的速度受限于内存的大小（maintenance_work_mem参数）。所以，调大这个参数有利于提高建立索引的速度。对唯一索引来说，还要额外多分配work_mem大小的内存。

5.1 比较的语义

上一篇提到，PG需要知道每种类型的数据需要关联那种hash函数。同理，PG需要知道如何对数据进行排序。这对这些操作都有用：order by、group by、meige join等。PG并不依赖操作符的名字（例如>、<、=），因为用户可以定义自己的数据类型，并且可以指定不同的名字。PG使用一个操作符族定义操作符的名字：

例如，这些操作符被bool_ops族使用：

select   amop.amopopr::regoperator as opfamily_operator,
         amop.amopstrategy
from     pg_am am,
         pg_opfamily opf,
         pg_amop amop
where    opf.opfmethod = am.oid
and      amop.amopfamily = opf.oid
and      am.amname = 'B-tree'
and      opf.opfname = 'bool_ops'
order by amopstrategy;

  opfamily_operator  | amopstrategy
---------------------+-------------- 
 <(boolean,boolean)  |            1
 <=(boolean,boolean) |            2
 =(boolean,boolean)  |            3
 >=(boolean,boolean) |            4
 >(boolean,boolean)  |            5
(5 rows)

我们看到，有5个操作符，再次强调，不要依赖它们的名字。为了说明每个操作符做何种比较操作，PG引入了策略的概念。5种策略分别描述上面5个操作符的语义：

1 — less
2 — less or equal
3 — equal
4 — greater or equal
5 — greater

某些操作符族包含很多实现同一种策略的操作符。例如，integer_ops族包含了这么多实现策略1的操作符：

select   amop.amopopr::regoperator as opfamily_operator
from     pg_am am,
         pg_opfamily opf,
         pg_amop amop
where    opf.opfmethod = am.oid
and      amop.amopfamily = opf.oid
and      am.amname = 'btree'
and      opf.opfname = 'integer_ops'
and      amop.amopstrategy = 1
order by opfamily_operator;

  opfamily_operator  
---------------------- 
 <(integer,bigint)
 <(smallint,smallint)
 <(integer,integer)
 <(bigint,bigint)
 <(bigint,integer)
 <(smallint,integer)
 <(integer,smallint)
 <(smallint,bigint)
 <(bigint,smallint)
(9 rows)

由于有这些操作符，PG可以在比较不同数据类型时避免类型转换。

5.2 支持新的数据类型

PG官方文档中提供了一个创建新的数据类型，并且创建新操作符类让它支持排序的例子。这个例子用C语义完成，这在性能关键的场景下非常有用。但这并不阻碍我们使用纯sql语言完成同样的实验，以便更好的理解比较的语义。

首先创建一个复数类型，它包含两个字段：实部和虚部。

create type complex as (re float, im float);

使用这个新的数据类型创建一张表，并且插入一些数据：

create table numbers(x complex);
insert into numbers values ((0.0, 10.0)), ((1.0, 3.0)), ((1.0, 1.0));

问题来了，如果没有数学定义的话，如何对这些复数进行排序呢？
其实，PG已经为我们定义了默认的比较操作符：

select * from numbers order by x;

   x    
--------
 (0,10)
 (1,1)
 (1,3)
(3 rows)

默认情况下，对一个复杂数据类型排序时：首先比较第一个字段，然后第二个字段，以此类推，就像字符串比较一样。但是，我们也可以定义不同的排序方式。例如，复数可以被看成是向量，我们可以按照向量的模（长度）排序。首先，根据勾股定理，定义下面的辅助函数，它可以计算出复数的模：

create function modulus(a complex) returns float as $$
    select sqrt(a.re*a.re + a.im*a.im);
$$ immutable language sql;

现在我们使用辅助函数，依次定义5个函数。

create function complex_lt(a complex, b complex) returns boolean as $$
    select modulus(a) < modulus(b);
$$ immutable language sql;

create function complex_le(a complex, b complex) returns boolean as $$
    select modulus(a) <= modulus(b);
$$ immutable language sql;

create function complex_eq(a complex, b complex) returns boolean as $$
    select modulus(a) = modulus(b);
$$ immutable language sql;

create function complex_ge(a complex, b complex) returns boolean as $$
    select modulus(a) >= modulus(b);
$$ immutable language sql;

create function complex_gt(a complex, b complex) returns boolean as $$
    select modulus(a) > modulus(b);
$$ immutable language sql;

然后，定义5个操作符。为了演示，我们故意不把它们叫做”<“或”>”，我们随意给他们起一些名字：

create operator #<#(leftarg=complex, rightarg=complex, procedure=complex_lt);
create operator #<=#(leftarg=complex, rightarg=complex, procedure=complex_le);
create operator #=#(leftarg=complex, rightarg=complex, procedure=complex_eq);
create operator #>=#(leftarg=complex, rightarg=complex, procedure=complex_ge);
create operator #>#(leftarg=complex, rightarg=complex, procedure=complex_gt);

这时，我们就可以用这些操作符比较复数了：

select (1.0,1.0)::complex #<# (1.0,3.0)::complex;

?column?
----------
 t
(1 row)

除了5个操作符之外，B-tree AM还需要另外一个函数：如果第一个参数 < 第二个参数，它必须返回-1；如果第一个参数 = 第二个参数，它必须返回0；如果第一个参数 > 第二个参数，它必须返回1。这个函数被称作support函数。其它AM可能需要定义其它support函数。

create function complex_cmp(a complex, b complex) returns integer as $$
    select case when modulus(a) < modulus(b) then -1
                when modulus(a) > modulus(b) then 1
                else 0
           end;
$$ language sql;

现在，我们可以定义一个操作符类（同名的操作符族会被自动创建）：

create operator class complex_ops
default for type complex
using B-tree as
    operator 1 #<#,
    operator 2 #<=#,
    operator 3 #=#,
    operator 4 #>=#,
    operator 5 #>#,
    function 1 complex_cmp(complex,complex);

现在，查询默认就用这个操作符类进行排序了：

select * from numbers order by x;

   x    
--------
 (1,1)
 (1,3)
 (0,10)
(3 rows)

这些函数也支持B-tree。

我们可以使用这个语句查询support函数：

select amp.amprocnum,
       amp.amproc,
       amp.amproclefttype::regtype,
       amp.amprocrighttype::regtype
from   pg_opfamily opf,
       pg_am am,
       pg_amproc amp
where  opf.opfname = 'complex_ops'
and    opf.opfmethod = am.oid
and    am.amname = 'B-tree'
and    amp.amprocfamily = opf.oid;

 amprocnum |   amproc    | amproclefttype | amprocrighttype
-----------+-------------+----------------+-----------------
         1 | complex_cmp | complex        | complex
(1 row)

6. 索引页面的物理构造

我们可以使用pageinspect插件查看B-tree页面的内容。

create extension pageinspect;

元页面：

select * from bt_metap('ticket_flights_pkey');

 magic  | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
 340322 |       2 |  164 |     2 |      164 |         2
(1 row)

请注意level列：在一百万行的表上，对两个列建立所以，索引的高度只有2（不包含根页）。

第164页（root）上的统计信息：

select type, live_items, dead_items, avg_item_size, page_size, free_size
from bt_page_stats('ticket_flights_pkey',164);

 type | live_items | dead_items | avg_item_size | page_size | free_size
------+------------+------------+---------------+-----------+-----------
 r    |         33 |          0 |            31 |      8192 |      6984
(1 row)

块中的数据：(data列, 受限于屏幕大小，将其截断，它包含了索引的数据，以16进制显示):

select itemoffset, ctid, itemlen, left(data,56) as data
from bt_page_items('ticket_flights_pkey',164) limit 5;

 itemoffset |  ctid   | itemlen |                           data                           
------------+---------+---------+----------------------------------------------------------
          1 | (3,1)   |       8 |
          2 | (163,1) |      32 | 1d 30 30 30 35 34 33 32 33 30 35 37 37 31 00 00 ff 5f 00
          3 | (323,1) |      32 | 1d 30 30 30 35 34 33 32 34 32 33 36 36 32 00 00 4f 78 00
          4 | (482,1) |      32 | 1d 30 30 30 35 34 33 32 35 33 30 38 39 33 00 00 4d 1e 00
          5 | (641,1) |      32 | 1d 30 30 30 35 34 33 32 36 35 35 37 38 35 00 00 2b 09 00
(5 rows)

第一条元组中data为空，表示负无穷大（参见PG文档）。这个页面的孩子页面分别为block 3、163、323、482、641。

可以查阅PG的文档、README和源代码，了解更多内容。

PG 10 中另外一个有用的插件是amcheck。它可以检查B-tree索引中的逻辑一致性，使得我们可以事前检测到错误。

数据库内核研究

MengQingzhong.com