What's REDO point?
REDO point is the point PostgreSQL start to recover from;that is, the location to write the XLOG record at the moment when the latest checkpoint is started.
REDO point是PostgreSQL启动恢复的起始点,也就是说REDO point是最后一次checkpoint启动时写XLOG Record的位置.
一、REDO point的存储
checkpointer进程启动时,从pg_control文件中获取Redo point并存储在内存中,在执行checkpoint时更新Redo point为当前即将写XLOG Record的位置,checkpoint执行成功后把Redo point更新到pg_control文件中.
类似的,数据库启动需要执行恢复时,从pg_control文件中获取REDO point进行恢复.
pg_control文件位于$PGDATA/global目录中,可通过命令pg_controldata查看文件中的内容.
[xdb@localhost pg111db]$ find ./ -name pg_control
./global/pg_control
[xdb@localhost pg111db]$ pg_controldata
pg_control version number: 1100
Catalog version number: 201809051
Database system identifier: 6624362124887945794
Database cluster state: in production
pg_control last modified: Thu 20 Dec 2018 03:34:05 PM CST
Latest checkpoint location: 1/48447DF0
Latest checkpoint's REDO location: 1/48447DF0
Latest checkpoint's REDO WAL file: 000000010000000100000048
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:1979
Latest checkpoint's NextOID: 25238
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 561
Latest checkpoint's oldestXID's DB: 16402
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 16402
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Thu 20 Dec 2018 03:34:05 PM CST
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: minimal
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: 90bf37566703859a557b7f20688eb944b6335b5c3d36f5530941ebf1dfa777c1
[xdb@localhost pg111db]$
其中
Latest checkpoint's REDO location: 1/48447DF0
记录的信息即为REDO point.
二、REDO point的变更
1.查看当前的REDO point
[xdb@localhost pg111db]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location: 1/48448150
REDO point为1/48448150.
2.执行DML操作
插入3条记录
testdb=# insert into cp values(7);
INSERT 0 1
testdb=# insert into cp values(8);
INSERT 0 1
testdb=# insert into cp values(9);
INSERT 0 1
查看1/48448150后的XLOG Record
[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/48448150
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 1/48448150, prev 1/484480E0, desc: CHECKPOINT_ONLINE redo 1/48448150; tli 1; prev tli 1; fpw true; xid 0:1982; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
rmgr: Heap len (rec/tot): 54/ 474, tx: 1982, lsn: 1/484481C0, prev 1/48448150, desc: INSERT off 11, blkref #0: rel 1663/16402/17046 blk 0 FPW
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1982, lsn: 1/484483A0, prev 1/484481C0, desc: COMMIT 2018-12-20 16:10:17.471639 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1983, lsn: 1/484483C8, prev 1/484483A0, desc: INSERT off 12, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1983, lsn: 1/48448408, prev 1/484483C8, desc: COMMIT 2018-12-20 16:10:20.170594 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1984, lsn: 1/48448430, prev 1/48448408, desc: INSERT off 13, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1984, lsn: 1/48448470, prev 1/48448430, desc: COMMIT 2018-12-20 16:10:22.268365 CST
pg_waldump: FATAL: error in WAL record at 1/48448470: invalid record length at 1/48448498: wanted 24, got 0
最后一个XLOG Record记录的位置为1/48448470,加上记录大小34(十六进制为0x22),位置为1/48448492,按理论上来说,如果现在执行checkpoint,该位置为REDO point.
3.执行checkpoint
testdb=# checkpoint;
CHECKPOINT
查看pg_control文件内容
[xdb@localhost pg111db]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location: 1/48448498
再次查看XLOG Record记录
[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/48448150
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 1/48448150, prev 1/484480E0, desc: CHECKPOINT_ONLINE redo 1/48448150; tli 1; prev tli 1; fpw true; xid 0:1982; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
rmgr: Heap len (rec/tot): 54/ 474, tx: 1982, lsn: 1/484481C0, prev 1/48448150, desc: INSERT off 11, blkref #0: rel 1663/16402/17046 blk 0 FPW
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1982, lsn: 1/484483A0, prev 1/484481C0, desc: COMMIT 2018-12-20 16:10:17.471639 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1983, lsn: 1/484483C8, prev 1/484483A0, desc: INSERT off 12, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1983, lsn: 1/48448408, prev 1/484483C8, desc: COMMIT 2018-12-20 16:10:20.170594 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1984, lsn: 1/48448430, prev 1/48448408, desc: INSERT off 13, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1984, lsn: 1/48448470, prev 1/48448430, desc: COMMIT 2018-12-20 16:10:22.268365 CST
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 1/48448498, prev 1/48448470, desc: CHECKPOINT_ONLINE redo 1/48448498; tli 1; prev tli 1; fpw true; xid 0:1985; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
pg_waldump: FATAL: error in WAL record at 1/48448498: invalid record length at 1/48448508: wanted 24, got 0
[xdb@localhost pg_wal]$
可以看到1/48448498是最后一条checkpoint记录的起始写入位置,REDO point为1/48448498,比预想的要多出6个字节(这6个字节都是0x00,用于补齐?).
4.dump WAL segment file
再次使用hexdump工具查看WAL segment file文件.
1/48448470开始的"COMMIT"记录
[xdb@localhost pg_wal]$ echo "obase=10;ibase=16;448470"|bc
4490352 --> 文件位置偏移
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490352 -n 34
00448470 22 00 00 00 c0 07 00 00 30 84 44 48 01 00 00 00 |".......0.DH....|
00448480 00 01 00 00 83 5e 7a d4 ff 08 cd 2f 4a 6e 6e 20 |.....^z..../Jnn |
00448490 02 00 |..|
00448492
XLOG Record的头部首先是XLogRecord结构体,第一个域是uint32的record的大小,即0x00000022,十进制为34(大小无异).
后续的6个字节
均为0x00
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490386 -n 6
00448492 00 00 00 00 00 00 |......|
00448498
XLOG Record for checkpoint
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490392 -n 106
00448498 6a 00 00 00 00 00 00 00 70 84 44 48 01 00 00 00 |j.......p.DH....|
004484a8 10 00 00 00 ac d2 8b 95 ff 50 98 84 44 48 01 00 |.........P..DH..|
004484b8 00 00 01 00 00 00 01 00 00 00 01 00 00 00 00 00 |................|
004484c8 00 00 c1 07 00 00 96 62 00 00 01 00 00 00 00 00 |.......b........|
004484d8 00 00 31 02 00 00 12 40 00 00 01 00 00 00 12 40 |..1....@.......@|
004484e8 00 00 ba 4e 1b 5c 00 00 00 00 00 00 00 00 00 00 |...N.\..........|
004484f8 00 00 00 00 00 00 00 00 00 00 |..........|
00448502
大小为0x0000006A,即106B(头部XLogRecord 24B + XLogRecordDataHeaderShort 2B + Checkpoint 80B ).
从FF 50为XLogRecordDataHeaderShort结构体的内容,0xFF为标志位,0x50为Data的大小(即80B).
checkpoint记录的内容详见Checkpoint结构体,该结构体第一个域字段为8个字节的LSN-->0x00000001 48448498,
即REDO point:1/48448498.
三、参考资料
PostgreSQL 源码解读(109)- WAL#5(相关数据结构)
PostgreSQL DBA(15) - WAL文件结构
PostgreSQL DBA(16) - WAL segment file内部结构
PostgreSQL DBA(17) - XLOG Record data内部结构
PostgreSQL DBA(18) - pg_waldump工具简介
网友评论