本文為墨天輪數據庫管理服務團隊第137期技術分享,內容原創,作者為技術顧問羅海鷗,如需轉載請聯繫小墨(VX:modb666)並註明來源。如需查看更多文章可關注【墨天輪】公眾號。
適用範圍
PG:ALL
方案概述
PG數據庫發生誤操作,造成數據丟失後如何恢復數據。
實施步驟
一、實驗環境
| column1 | 源庫 | 目標庫 |
|---|---|---|
| IP地址 | 192.168.65.128 | 192.168.65.129 |
| 主機名 | luo | pg2 |
| 版本 | 17 | 17 |
源庫需要提前打開如下參數:
db5=# select name,setting from pg_settings where name in ('wal_level','archive_mode','archive_command','restore_command');
name | setting
-----------------+-------------------------
archive_command | cp %p /opt/pgwal/arc/%f
archive_mode | on
restore_command | cp /opt/pgwal/arc/%f %p
wal_level | replica
(4 rows)
二、在目標庫服務器上對源庫做一個備份。
[postgres@pg2 opt]$ pg_basebackup -Fp -Xs -P -v -D /opt/pgdata -h luo
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2C000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_13727"
55973/55973 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2C000158
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
三、源庫模擬生產,寫入數據
postgres=# create database db5; --新建一個數據庫db5
CREATE DATABASE
postgres=#
postgres=#
postgres=#
postgres=# \c db5
You are now connected to database "db5" as user "postgres".
db5=# create table t1 as select * from pg_class;
SELECT 415
db5=#
db5=#
db5=# create table t2 (id int);
CREATE TABLE
db5=# insert into t2 values(1);
INSERT 0 1
db5=# insert into t2 values(2);
INSERT 0 1
db5=#
db5=#
db5=#
db5=# select * from t2;
id
----
1
2
(2 rows)
db5=# delete from t2 where id=2; --模擬誤操作,刪除t2表上一行數據
DELETE 1
db5=# select * from t2;
id
----
1
(1 row)
db5=# create table t3(id int); --新建t3表
CREATE TABLE
db5=# checkpoint ;
CHECKPOINT
db5=# select pg_switch_wal(); --切換WAL日誌
pg_switch_wal
---------------
0/2D46A8A0
(1 row)
db5=#
db5=# select pg_relation_filepath('t2'); --查看t2表文件名,文件名是16485
pg_relation_filepath
----------------------
base/16479/16485
(1 row)
db5=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | t1 | table | postgres
public | t2 | table | postgres
public | t3 | table | postgres
(3 rows)
db5=# select * from t2;
id
----
1
(1 row)
四、源庫上pg_waldump分析wal日誌,定位刪除的具體lsn
[postgres@luo arc]$
[postgres@luo arc]$ pwd --歸檔目錄
/opt/pgwal/arc
[postgres@luo arc]$ ls -lrt --根據時間列出WAL日誌
總用量 65540
-rw------- 1 postgres dba 16777216 8月 27 16:08 00000001000000000000002A
-rw------- 1 postgres dba 16777216 8月 27 16:10 00000001000000000000002B
-rw------- 1 postgres dba 16777216 8月 27 16:10 00000001000000000000002C
-rw------- 1 postgres dba 341 8月 27 16:10 00000001000000000000002C.00000028.backup
-rw------- 1 postgres dba 16777216 8月 27 16:13 00000001000000000000002D
[postgres@luo arc]$
WAL日誌定位到如下關鍵信息:誤操作是DELETE,文件名是16485,XID=833,lsn是0/2D469640,上一個lsn是0/2D469608。
[postgres@luo arc]$ pg_waldump 00000001000000000000002D|grep -i delete|grep 16485
rmgr: Heap len (rec/tot): 54/ 54, tx: 833, lsn: 0/2D469640, prev 0/2D469608, desc: DELETE xmax: 833, off: 2, infobits: [KEYS_UPDATED], flags: 0x00, blkref
#0
: rel 1663/16479/16485 blk 0
[postgres@luo arc]$
五、目標庫開始恢復數據
1.配置recovery\_target\_lsn參數
vi postgresql.conf
recovery_target_lsn = '0/2D469608' --lsn為上一個lsn
2.創建standby.signal文件
[postgres@pg2 pgdata]$ cd /opt/pgdata/
[postgres@pg2 pgdata]$ touch standby.signal
[postgres@pg2 pgdata]$
[postgres@pg2 pgdata]$ ll
總用量 388
-rw------- 1 postgres dba 227 8月 27 16:10 backup_label.old
-rw------- 1 postgres dba 318484 8月 27 16:10 backup_manifest
drwx------ 6 postgres dba 46 8月 27 16:18 base
-rw------- 1 postgres dba 47 8月 27 16:17 current_logfiles
drwx------ 2 postgres dba 4096 8月 27 16:18 global
drwx------ 2 postgres dba 6 8月 27 16:10 pg_commit_ts
drwx------ 2 postgres dba 6 8月 27 16:10 pg_dynshmem
-rw------- 1 postgres dba 5867 8月 27 16:10 pg_hba.conf
-rw------- 1 postgres dba 2640 8月 27 16:10 pg_ident.conf
drwx------ 2 postgres dba 4096 8月 27 16:17 pg_log
drwx------ 4 postgres dba 68 8月 27 16:22 pg_logical
drwx------ 4 postgres dba 36 8月 27 16:10 pg_multixact
drwx------ 2 postgres dba 6 8月 27 16:10 pg_notify
drwx------ 2 postgres dba 6 8月 27 16:10 pg_replslot
drwx------ 2 postgres dba 6 8月 27 16:10 pg_serial
drwx------ 2 postgres dba 6 8月 27 16:10 pg_snapshots
drwx------ 2 postgres dba 6 8月 27 16:10 pg_stat
drwx------ 2 postgres dba 6 8月 27 16:10 pg_stat_tmp
drwx------ 2 postgres dba 18 8月 27 16:22 pg_subtrans
drwx------ 2 postgres dba 6 8月 27 16:10 pg_tblspc
drwx------ 2 postgres dba 6 8月 27 16:10 pg_twophase
-rw------- 1 postgres dba 3 8月 27 16:10 PG_VERSION
drwx------ 4 postgres dba 77 8月 27 16:24 pg_wal
drwx------ 2 postgres dba 18 8月 27 16:10 pg_xact
-rw------- 1 postgres dba 88 8月 27 16:10 postgresql.auto.conf
-rw------- 1 postgres dba 30969 8月 27 16:17 postgresql.conf
-rw------- 1 postgres dba 23 8月 27 16:17 postmaster.opts
-rw------- 1 postgres dba 70 8月 27 16:17 postmaster.pid
-rw-r--r-- 1 postgres dba 0 8月 27 16:11 standby.signal
[postgres@pg2 pgdata]$
打開目標庫
[postgres@pg2 pgdata]$ pg_ctl start
waiting for server to start....2025-08-27 16:17:42.609 CST [14902] :[ [txid=0] user= ,db=,app=,client=LOG: redirecting log output to logging collector process
2025-08-27 16:17:42.609 CST [14902] :[ [txid=0] user= ,db=,app=,client=HINT: Future log output will appear in directory "pg_log".
done
server started
目標庫缺少最新WAL日誌。
2025-08-27 17:17:40.417 CST [15577] :[ [txid=0] user= ,db=,app=,client=LOG: database system is ready to accept read-only connections
cp: 無法獲取"/opt/pgwal/arc/00000001000000000000002D" 的文件狀態(stat): 沒有那個文件或目錄
從源庫傳輸過去
[postgres@luo arc]$ scp 00000001000000000000002D pg2:/opt/pgwal/arc
postgres@pg2's password:
00000001000000000000002D 100% 16MB 13.9MB/s 00:01
[postgres@luo arc]$
目標庫不再報錯了,recovery stopping after WAL location (LSN) “0/2D469608”,説明數據庫恢復到我們指定的lsn了,pausing at the end of recovery,數據庫處於暫停恢復狀態。
cp: 無法獲取"/opt/pgwal/arc/00000002.history" 的文件狀態(stat): 沒有那個文件或目錄
2025-08-27 17:20:00.706 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG: waiting for WAL to become available at 0/2D000018
2025-08-27 17:20:05.694 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG: restored log file "00000001000000000000002D" from archive
cp: 無法獲取"/opt/pgwal/arc/00000001000000000000002E" 的文件狀態(stat): 沒有那個文件或目錄
2025-08-27 17:20:05.740 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG: recovery stopping after WAL location (LSN) "0/2D469608"
2025-08-27 17:20:05.741 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG: pausing at the end of recovery
2025-08-27 17:20:05.741 CST [15581] :[ [txid=0] user= ,db=,app=,client=HINT: Execute pg_wal_replay_resume() to promote.
六、登錄數據庫,檢查恢復是否符合預期。
源庫
[postgres@luo ~]$ psql db5
psql (17.0)
Type "help" for help.
db5=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | t1 | table | postgres
public | t2 | table | postgres
public | t3 | table | postgres
(3 rows)
db5=# select * from t2;
id
----
1
(1 row)
db5=#
目標庫
[postgres@pg2 pgdata]$ psql db5
psql (17.0)
Type "help" for help.
db5=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | t1 | table | postgres
public | t2 | table | postgres
(2 rows)
db5=# select * from t2;
id
----
1
2
(2 rows)
db5=#
目標庫基於備份和WAL日誌恢復已完成。
墨天輪從樂知樂享的數據庫技術社區蓄勢出發,全面升級,提供多類型數據庫管理服務。墨天輪數據庫管理服務旨在為用户構建信賴可託付的數據庫環境,併為數據庫廠商提供中立的生態支持。
墨天輪數據庫服務官網:https://www.modb.pro/service