pg_rman usage

Introduction
pg_rman usage
pg_rman prerequisites

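pg_rman is an online backup and recovery tool designed specifically for PostgreSQL. It supports online (hot) backups and point-in-time recovery.
pg_rman is not part of the PostgreSQL core distribution. Some PostgreSQL 10+ installations already include it; if yours does not, install the pg_rman build that matches your PostgreSQL major version.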

There is no need to memorize this part. Come back to it when a parameter used later in the article is unclear.

1. Create the backup directory

2. Set the environment variables

3. Edit the postgresql.conf configuration file

4. Initialize with pg_rman init
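
The commands for the four preparation steps above did not survive in this copy. A minimal sketch of what they might look like follows. The directories under `DEMO_ROOT` are hypothetical, the three `postgresql.conf` settings are the ones used in the test log later in this article, and `pg_rman init` only runs when the tool and `PGDATA` are actually available.

```shell
#!/bin/sh
# Sketch only: DEMO_ROOT and everything beneath it are hypothetical paths.
DEMO_ROOT="${DEMO_ROOT:-${TMPDIR:-/tmp}/pg_rman_demo}"

# 1. Create the backup catalog directory and the WAL archive directory.
BACKUP_PATH="$DEMO_ROOT/backups"
ARCLOG_PATH="$DEMO_ROOT/arclog"
mkdir -p "$BACKUP_PATH" "$ARCLOG_PATH"

# 2. Export the variable pg_rman reads when -B is not given.
export BACKUP_PATH

# 3. Write the archive settings to a review file instead of editing
#    postgresql.conf directly; merge them by hand, then restart the server.
CONF_SNIPPET="$DEMO_ROOT/postgresql.conf.snippet"
cat > "$CONF_SNIPPET" <<EOF
wal_level = replica
archive_mode = on
archive_command = 'test ! -f $ARCLOG_PATH/%f && cp %p $ARCLOG_PATH/%f'
EOF

# 4. Initialize the catalog, but only if pg_rman and PGDATA are present.
if command -v pg_rman >/dev/null 2>&1 && [ -n "${PGDATA:-}" ]; then
    pg_rman init -B "$BACKUP_PATH" -D "$PGDATA"
else
    echo "pg_rman or PGDATA not available; skipping 'pg_rman init'"
fi
```

Writing the settings to a separate snippet file is a deliberate choice for the sketch: it keeps the example from touching a live `postgresql.conf`.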

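1. Take a backup

2. Validate the backup set
Important: every pg_rman backup must be validated; an unvalidated backup cannot be used for restore or as the base of an incremental backup.

3. List the backup sets with pg_rman

Check the directory where the backup files were generated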
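
Steps 1 to 3 can be chained so that a backup is never left unvalidated. The sketch below is one possible wrapper, not part of pg_rman itself: `backup_and_validate` and the `PG_RMAN` override are hypothetical names, and `BACKUP_PATH` and `PGDATA` are assumed to be exported already.

```shell
#!/bin/sh
# PG_RMAN can be pointed at another binary (or a stub for dry runs).
PG_RMAN="${PG_RMAN:-pg_rman}"

# backup_and_validate MODE
# MODE is "full" or "incremental". Returns non-zero as soon as a step fails,
# so an unvalidated backup is reported instead of silently kept.
backup_and_validate() {
    mode="$1"
    case "$mode" in
        full|incremental) ;;
        *) echo "usage: backup_and_validate full|incremental" >&2; return 2 ;;
    esac
    "$PG_RMAN" backup --backup-mode="$mode" --progress || return 1
    "$PG_RMAN" validate || return 1
    "$PG_RMAN" show
}
```

A typical call would be `backup_and_validate full`, run as the postgres user.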

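Important:
Incremental backups are determined by the file system's update-time timeline.
Prerequisites for an incremental backup:
    - A corresponding full backup must exist.
    - The full backup must have been validated.
1. Validate the backup set
We already have a full backup from the steps above, so we only need to start from validation.

2. Take the backup

3. Validate the backup set again

4. List the backup sets with pg_rman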
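
The two prerequisites can be checked mechanically before an incremental backup is attempted. The helper below is a sketch under one assumption: that `pg_rman show` prints rows like the ones in the test log later in this article, with the mode (`FULL`, `INCR`) as a separate column and the status in the last column. `has_valid_full` is a hypothetical name.

```shell
#!/bin/sh
# has_valid_full: reads `pg_rman show` output on stdin and succeeds only if
# at least one row is a FULL backup whose last column (Status) is OK.
has_valid_full() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "FULL" && $NF == "OK") found = 1
    } END { exit (found ? 0 : 1) }'
}

# Demonstration with a row copied from the log in this article.
sample='2017-03-06 16:43:39  FULL        0m    58MB     1  OK'
printf '%s\n' "$sample" | has_valid_full && echo "incremental backup may proceed"
```

In real use the input would come from the tool itself, for example `pg_rman show | has_valid_full && pg_rman backup --backup-mode=incremental`.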

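There are two ways to delete a backup
1. Delete the backup for the relevant point in time directly from the fullback folder

2. Use pg_rman delete -f "timestamp". When an incremental backup is deleted this way, the full backup is automatically deleted as well.

Important: stop the database before restoring.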

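In-place (overwrite) restore
pg_rman restore -B /postgresql-backup/backups/ --recovery-target-time "2020-04-16 13:18:32" --hard-copy
  -- If --recovery-target-time is omitted, recovery runs to the latest point in time.
  -- Without --hard-copy, the archived WAL files in the archive directory are symbolic links to the copies in the backup directory; with it, the archived WAL files are physically copied from the backup directory into the archive directory.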
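
The stop-then-restore sequence described above can be wrapped as follows. This is a sketch, not a definitive procedure: `restore_in_place` and the `PG_CTL`/`PG_RMAN` overrides are hypothetical names, and `-m fast` is an assumed choice of shutdown mode.

```shell
#!/bin/sh
PG_CTL="${PG_CTL:-pg_ctl}"
PG_RMAN="${PG_RMAN:-pg_rman}"

# restore_in_place BACKUP_DIR DATA_DIR [TARGET_TIME]
restore_in_place() {
    backup_dir="$1"; data_dir="$2"; target_time="${3:-}"
    if [ -z "$backup_dir" ] || [ -z "$data_dir" ]; then
        echo "usage: restore_in_place BACKUP_DIR DATA_DIR [TARGET_TIME]" >&2
        return 2
    fi
    # The server must be down before restoring over its data directory.
    "$PG_CTL" stop -m fast -D "$data_dir" \
        || echo "pg_ctl stop failed (the server may already be down)" >&2
    if [ -n "$target_time" ]; then
        "$PG_RMAN" restore -B "$backup_dir" -D "$data_dir" \
            --recovery-target-time "$target_time" --hard-copy
    else
        # No target time: recover to the latest available point.
        "$PG_RMAN" restore -B "$backup_dir" -D "$data_dir" --hard-copy
    fi
}
```

For example: `restore_in_place /postgresql-backup/backups "$PGDATA" "2020-04-16 13:18:32"`.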

1. Create a new data directory and set its permissions

2. Update the postgres user's environment variables

3. Restore
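
Steps 1 to 3 above restore into a fresh data directory rather than over the old one. A minimal sketch of the preparation follows; `prepare_new_pgdata` and the example path are hypothetical. PostgreSQL refuses to start on a data directory that is accessible to other users, hence the `chmod 700`.

```shell
#!/bin/sh
# prepare_new_pgdata DIR
# Creates DIR with owner-only permissions and points PGDATA at it for the
# current shell. Run it as the postgres user so the ownership is right.
prepare_new_pgdata() {
    new_dir="$1"
    if [ -z "$new_dir" ]; then
        echo "usage: prepare_new_pgdata DIR" >&2
        return 2
    fi
    mkdir -p "$new_dir" || return 1
    chmod 700 "$new_dir" || return 1
    PGDATA="$new_dir"
    export PGDATA
}

# Typical use (hypothetical path), followed by the restore itself:
#   prepare_new_pgdata /data/pgdata_new && pg_rman restore -B "$BACKUP_PATH"
```

To make the change permanent, the same `export PGDATA=...` line would also go into the postgres user's shell profile.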

pg_rman backup and recovery test

I. Environment

1. OS

CentOS Linux release 7.2.1511 (Core) X64

2.PostgreSQL

PostgreSQL 9.6.1

3.pg_rman

pg_rman-1.3.3-pg96.tar.gz v1.3.3

Note: download the source package that matches your PostgreSQL version.

https://github.com/ossc-db/pg_rman/releases/download/v1.3.3/pg_rman-1.3.3-pg96.tar.gz

pg_rman-1.3.3.tar.gz (this source package failed to compile)

System packages

zlib-devel


II. pg_rman installation

1. Install pg_rman

Log in as root

export PATH=/opt/pgsql/9.6.1/bin:$PATH

export LD_LIBRARY_PATH=/opt/pgsql/9.6.1/lib

export MANPATH=/opt/pgsql/9.6.1/share/man:$MANPATH


# tar zxvf pg_rman-9_6_STABLE.tar.gz

# cd pg_rman-9_6_STABLE/

# make 

......

......

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 backup.o catalog.o data.o delete.o dir.o init.o parray.o pg_rman.o restore.o show.o util.o validate.o xlog.o pgsql_src/pg_ctl.o pgut/pgut.o pgut/pgut-port.o -L/opt/pgsql/9.6.1/lib -lpgcommon -lpgport -L/opt/pgsql/9.6.1/lib -lpq -L/opt/pgsql/9.6.1/lib -Wl,--as-needed -Wl,-rpath,'/opt/pgsql/9.6.1/lib',--enable-new-dtags  -lpgcommon -lpgport -lz -lreadline -lrt -lcrypt -ldl -lm -o pg_rman

# make install

/usr/bin/mkdir -p '/opt/pgsql/9.6.1/bin'

/usr/bin/install -c  pg_rman '/opt/pgsql/9.6.1/bin'

2. Verify the installation

su - postgres

$ pg_rman --version

pg_rman 1.3.3


3. Configure the database parameters

wal_level = replica

archive_mode = on

archive_command = 'test ! -f /pg_arclog/%f && cp %p /pg_arclog/%f'
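
The `archive_command` above refuses to overwrite a segment that is already in the archive: in it, `%p` is replaced by the path of the WAL file and `%f` by its file name. The sketch below reproduces that behaviour in a temporary directory so it can be tried without a running server. `archive_one` and the directory layout are hypothetical.

```shell
#!/bin/sh
# archive_one WAL_PATH ARCHIVE_DIR
# Mirrors: test ! -f /pg_arclog/%f && cp %p /pg_arclog/%f
archive_one() {
    p="$1"                 # plays the role of %p
    f=$(basename "$p")     # plays the role of %f
    test ! -f "$2/$f" && cp "$p" "$2/$f"
}

demo=$(mktemp -d)
mkdir "$demo/wal" "$demo/arclog"
echo "segment" > "$demo/wal/000000010000000000000001"

archive_one "$demo/wal/000000010000000000000001" "$demo/arclog" \
    && echo "first attempt: archived"
archive_one "$demo/wal/000000010000000000000001" "$demo/arclog" \
    || echo "second attempt: refused, file already archived"
```

The non-zero exit status on the second attempt is what tells PostgreSQL that archiving did not succeed, so an existing archive file is never silently replaced.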

--- root user

mkdir /backup_pg_rman /pg_arclog 

chown -R postgres:postgres /backup_pg_rman

chown -R postgres:postgres /pg_arclog

--- postgres user

$ pg_rman init -B $backup_dir 


III. Backup and recovery test

1. Back up the data (full<0> + incremental<1>)

# full

export PGDATA=/pgdata96

export BACKUP_PATH=/backup_pg_rman


$ echo $PGDATA

/pgdata96

$ echo $BACKUP_PATH

/backup_pg_rman

--- init backup dir: pg_rman init -B $backup_dir -D $PGDATA (specify both by hand when the environment variables are not set; do not end the paths with a trailing '/')

$ pg_rman init

INFO: ARCLOG_PATH is set to '/pg_arclog'

INFO: SRVLOG_PATH is set to '/pgdata96/pg_log'

$  


$ cat $BACKUP_PATH/pg_rman.ini

ARCLOG_PATH='/pg_arclog'

SRVLOG_PATH='/pgdata96/pg_log'


--- full backup

$ pg_rman backup --backup-mode=full --with-serverlog --progress

INFO: copying database files

Processed 1172 of 1172 files, skipped 0

INFO: copying archived WAL files

Processed 3 of 3 files, skipped 0

INFO: copying server log files

Processed 4 of 4 files, skipped 0

INFO: backup complete

INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.


--- validate backup, status: done

$ pg_rman validate

INFO: validate: "2017-03-06 16:43:39" backup, archive log files and server log files by CRC

INFO: backup "2017-03-06 16:43:39" is valid


--- show backup, status: ok 

$ pg_rman show

==========================================================

 StartTime           Mode  Duration    Size   TLI  Status 

==========================================================

2017-03-06 16:43:39  FULL        0m    58MB     1  OK

--- incremental

$ pg_rman backup --backup-mode=incremental --with-serverlog --progress

INFO: copying database files

Processed 1172 of 1172 files, skipped 1115

INFO: copying archived WAL files

Processed 48 of 48 files, skipped 3

INFO: copying server log files

Processed 4 of 4 files, skipped 3

INFO: backup complete

INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.

--- validate backup

$ pg_rman validate

INFO: validate: "2017-03-06 17:04:45" backup, archive log files and server log files by CRC

INFO: backup "2017-03-06 17:04:45" is valid

--- show, status: ok

$ pg_rman show detail

============================================================================================================

 StartTime           Mode  Duration    Data  ArcLog  SrvLog   Total  Compressed  CurTLI  ParentTLI  Status  

============================================================================================================

2017-03-06 17:04:45  INCR        0m   401MB   738MB    27kB  1136MB       false       1          0  OK

2017-03-06 16:43:39  FULL        0m    30MB    33MB   206kB    58MB       false       1          0  OK


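2. Simulate disaster recovery


1). Delete all files under the PGDATA directory

Stop the database, then delete the files. Note that -m immediate aborts the server without a shutdown checkpoint; -m fast is the cleaner way to stop.

$ pg_ctl stop -m immediate -D /pgdata96/

$ cd /pgdata96

$ rm -rf ./*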


2). Restore the backup

--- postgres user

$ export PGDATA=/pgdata96

$ export BACKUP_PATH=/backup_pg_rman

$ pg_rman restore

WARNING: pg_controldata file "/pgdata96/global/pg_control" does not exist

WARNING: pg_controldata file "/pgdata96/global/pg_control" does not exist

INFO: the recovery target timeline ID is not given

INFO: use timeline ID of latest full backup as recovery target: 1

INFO: calculating timeline branches to be used to recovery target point

INFO: searching latest full backup which can be used as restore start point

INFO: found the full backup can be used as base in recovery: "2017-03-06 16:43:39"

INFO: copying online WAL files and server log files

INFO: clearing restore destination

INFO: validate: "2017-03-06 16:43:39" backup, archive log files and server log files by SIZE

INFO: backup "2017-03-06 16:43:39" is valid

INFO: restoring database files from the full mode backup "2017-03-06 16:43:39"

INFO: searching incremental backup to be restored

INFO: validate: "2017-03-06 17:04:45" backup, archive log files and server log files by SIZE

INFO: backup "2017-03-06 17:04:45" is valid

INFO: restoring database files from the incremental mode backup "2017-03-06 17:04:45"

INFO: searching backup which contained archived WAL files to be restored

INFO: backup "2017-03-06 17:04:45" is valid

INFO: restoring WAL files from backup "2017-03-06 17:04:45"

INFO: restoring online WAL files and server log files

INFO: generating recovery.conf

INFO: restore complete

HINT: Recovery will start automatically when the PostgreSQL server is started.


3). Start the database and verify the data

# /etc/init.d/postgresql start

Starting PostgreSQL: ok

Switch to the postgres user, then verify the data.



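Recovery after an abnormal shutdown

Description: if the database did not complete a checkpoint successfully, data may be lost during recovery. The troubleshooting steps follow.

Symptom: the database fails to start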

$ more postgresql-Mon.log 

2017-03-06 17:20:47 CST [3240]: [1-1] user=,db= LOG:  database system was interrupted; last known up at 2017-03-06 17:04:51 CST

2017-03-06 17:20:47 CST [3240]: [2-1] user=,db= LOG:  starting archive recovery

2017-03-06 17:20:47 CST [3240]: [3-1] user=,db= LOG:  invalid primary checkpoint record

2017-03-06 17:20:47 CST [3240]: [4-1] user=,db= LOG:  invalid secondary checkpoint record

2017-03-06 17:20:47 CST [3240]: [5-1] user=,db= PANIC:  could not locate a valid checkpoint record

2017-03-06 17:20:47 CST [3238]: [3-1] user=,db= LOG:  startup process (PID 3240) was terminated by signal 6: Aborted

2017-03-06 17:20:47 CST [3238]: [4-1] user=,db= LOG:  aborting startup due to startup process failure

2017-03-06 17:20:47 CST [3238]: [5-1] user=,db= LOG:  database system is shut down

2017-03-06 17:21:23 CST [3269]: [1-1] user=,db= LOG:  database system was interrupted; last known up at 2017-03-06 17:04:51 CST

2017-03-06 17:21:23 CST [3269]: [2-1] user=,db= LOG:  starting archive recovery

2017-03-06 17:21:23 CST [3269]: [3-1] user=,db= LOG:  invalid primary checkpoint record

2017-03-06 17:21:23 CST [3269]: [4-1] user=,db= LOG:  invalid secondary checkpoint record

2017-03-06 17:21:23 CST [3269]: [5-1] user=,db= PANIC:  could not locate a valid checkpoint record

2017-03-06 17:21:23 CST [3267]: [3-1] user=,db= LOG:  startup process (PID 3269) was terminated by signal 6: Aborted

2017-03-06 17:21:23 CST [3267]: [4-1] user=,db= LOG:  aborting startup due to startup process failure

2017-03-06 17:21:23 CST [3267]: [5-1] user=,db= LOG:  database system is shut down
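
The signature of this failure in the log is the PANIC line. A small check like the following could be used to detect it after a failed start; it is only a sketch, and `has_checkpoint_panic` is a hypothetical helper name.

```shell
#!/bin/sh
# has_checkpoint_panic LOGFILE
# Succeeds if the server log contains the "could not locate a valid
# checkpoint record" PANIC shown in the log excerpt above.
has_checkpoint_panic() {
    grep -q 'PANIC:.*could not locate a valid checkpoint record' "$1"
}

# Typical use, with the log file name from this article:
#   has_checkpoint_panic postgresql-Mon.log && echo "checkpoint record missing"
```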


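Resolution steps:

Reset the transaction log. This is a last resort.

Only the data as of the backup is retained.

$ pg_resetxlog -f /pgdata96

Transaction log reset

Then start the database and spot-check the data.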
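
One portability note on the reset step: `pg_resetxlog` was renamed `pg_resetwal` in PostgreSQL 10, so the command above only exists under that name on 9.6 and earlier. The sketch below picks whichever tool is on `PATH`; `pick_reset_tool` is a hypothetical helper, and copying the data directory aside first is an added precaution, not something the original procedure does.

```shell
#!/bin/sh
# pick_reset_tool: print the name of the WAL-reset utility found on PATH,
# preferring the PostgreSQL 10+ name.
pick_reset_tool() {
    for tool in pg_resetwal pg_resetxlog; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "neither pg_resetwal nor pg_resetxlog found on PATH" >&2
    return 1
}

# Typical use, with the server stopped:
#   cp -a "$PGDATA" "$PGDATA.before_reset"      # keep a copy first
#   tool=$(pick_reset_tool) && "$tool" -f "$PGDATA"
```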


This article comes from the "yiyi" blog; please keep this source attribution: http://heyiyi.blog.51cto.com/205455/1903709
