KingbaseES V8R3 Backup and Recovery Case: Restoring a sys_rman Physical Backup on a Different Host

Posted 天涯客1224

KingbaseES, backup and recovery

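Case description:
After a physical backup has been taken with sys_rman in the production environment, a test environment needs to be built on a different host. This case walks through the full procedure for restoring that physical backup on another machine.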

Applicable version:
KingbaseES V8R3

Node information:

[kingbase@node102 bin]$ cat /etc/hosts
......
192.168.1.101   node101    # production node
192.168.1.102   node102    # test node

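I. Run a sys_rman physical backup on the production database

1. Relevant configuration parameters in the production environment

# Archiving is enabled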
test=# show archive_mode ;
 archive_mode
--------------
 on
(1 row)

# Archive file storage path
test=# show archive_dest ;
       archive_dest
--------------------------
 /data/kingbase/arch/c290
(1 row)

# Archive command
test=# show archive_command ;
                              archive_command
----------------------------------------------------------------------------
 test ! -f /data/kingbase/arch/c290/%f && cp %p /data/kingbase/arch/c290/%f
(1 row)

# WAL level
test=# show wal_level ;
 wal_level
-----------
 replica
(1 row)
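
The archive_command shown above refuses to overwrite a segment that already exists in the archive directory. The following is a minimal shell sketch of that behaviour, not the server's own code: temporary directories stand in for the real paths, and `archive_segment` is a hypothetical helper whose first argument plays the role of %p and whose second plays the role of %f.

```shell
#!/bin/sh
# Sketch only: mimics `test ! -f DEST/%f && cp %p DEST/%f`.
# archive_segment is a hypothetical helper, not a KingbaseES command.
archive_segment() {
    src_path=$1      # stands in for %p (path of the WAL segment)
    file_name=$2     # stands in for %f (file name only)
    dest_dir=$3      # archive directory
    test ! -f "$dest_dir/$file_name" && cp "$src_path" "$dest_dir/$file_name"
}

work=$(mktemp -d)
mkdir "$work/xlog" "$work/arch"
seg=000000010000000000000007
echo "segment-data" > "$work/xlog/$seg"

# First attempt: the archive directory is empty, so the copy happens.
if archive_segment "$work/xlog/$seg" "$seg" "$work/arch"; then
    first=archived
else
    first=refused
fi

# Second attempt: the file already exists, so the command returns non-zero
# and the archived copy is left untouched.
if archive_segment "$work/xlog/$seg" "$seg" "$work/arch"; then
    second=archived
else
    second=refused
fi

echo "first=$first second=$second"
```

A non-zero exit status is how the command reports that the segment was not archived, so the existing archived file is never silently replaced.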

2. Run the sys_rman physical backup

1) Initialize the backup

[kingbase@node101 ~]$ mkdir -p /data/kingbase/bk/c290
[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ init

2) Take a full database backup

[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ -b full backup
INFO: validate: RTATKP backup and archive log files by CRC

[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ validate
INFO: validate: RTATKP backup and archive log files by CRC
INFO: backup validation completed successfully

3) Take an incremental backup

# Generate some transactions
prod=# create table t2 as select * from t1;
SELECT 10000

prod=# select count(*) from t2;
 count
-------
 10000
(1 row)

# Switch the WAL file and force a checkpoint (this shortens recovery time during restore).
prod=# select sys_switch_xlog();
 sys_switch_xlog
-----------------
 0/70000A0
(1 row)

prod=# checkpoint;
CHECKPOINT

# Run the incremental (page) backup
[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ -b page backup
INFO: validate: RTATU1 backup and archive log files by CRC

4) View the backup information

[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ show
==========================================================================================================
ID       Recovery time        Mode          Current/Parent TLI  Time            Data  start_lsn  stop_lsn Status
==========================================================================================================
RTATU1   2023-04-18 14:54:03  PAGE           1 / 0              2s           628kB  0/9000028  0/A000078  OK
RTATKP   2023-04-18 14:48:27  FULL           1 / 0              2s            80MB  0/3000028  0/3000130  OK
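
Before copying a catalog elsewhere, it can help to confirm that every backup in the listing is OK and to note which full backup the incremental depends on. A minimal sketch, assuming the listing keeps the column layout shown above and is ordered newest first; the two rows are pasted from the example rather than read from a live `show` command.

```shell
#!/bin/sh
# Sketch only: the listing is pasted from the example above; in practice it
# would come from the `sys_rman ... show` command.
listing='RTATU1   2023-04-18 14:54:03  PAGE           1 / 0              2s           628kB  0/9000028  0/A000078  OK
RTATKP   2023-04-18 14:48:27  FULL           1 / 0              2s            80MB  0/3000028  0/3000130  OK'

# Count backups whose last column (Status) is not OK.
not_ok=$(printf '%s\n' "$listing" | awk '$NF != "OK" { n++ } END { print n + 0 }')

# Field 4 is the Mode column; the first FULL row is taken as the latest
# full backup because the listing is assumed to be newest first.
latest_full=$(printf '%s\n' "$listing" | awk '$4 == "FULL" { print $1; exit }')

echo "backups not OK: $not_ok"
echo "latest full backup: $latest_full"
```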

5) View the backup files

[kingbase@node101 c290]$ ls -lh
total 4.0K
drwx------ 4 kingbase kingbase 32 Apr 18 14:54 backups
-rw-r--r-- 1 kingbase kingbase 41 Apr 18 14:47 sys_rman.conf
lrwxrwxrwx 1 kingbase kingbase 25 Apr 18 15:40 wal -> /data/kingbase/arch/c290/

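II. Restore on a different host with sys_rman

Tips:
Restoring from a physical backup generally takes two steps:

restore: copy the backed-up data files back into the data directory.
recovery: after the instance starts, replay the xlog from the most recent checkpoint until the database reaches a consistent state, then open it.

1. Prepare the database environment

  1) Install the same database version on the test host as on the production host.
  2) Create the same backup storage path and xlog archive path.
  3) Use the same archive and WAL settings as the production database.

2. Copy the production backup to the test host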
[kingbase@node101 c290]$ scp -r * node102:/data/kingbase/bk/c290/
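
Note that `scp -r` follows symbolic links, so the `wal` link in the catalog arrives on the test host as a real directory holding the archived segments. After a copy like this, a checksum comparison is a cheap way to confirm that nothing was truncated in transit. The sketch below is an illustration under stated assumptions, not part of sys_rman: temporary directories stand in for the catalogs on node101 and node102, `cp` stands in for `scp`, and `sha256sum` is assumed to be the GNU coreutils version.

```shell
#!/bin/sh
# Sketch only: temporary directories stand in for the backup catalogs on
# node101 (source) and node102 (copy); cp stands in for scp.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/backups/RTATKP"
echo "page data" > "$src/backups/RTATKP/file1"   # placeholder content
echo "placeholder" > "$src/sys_rman.conf"        # placeholder content

# 1. On the source host: record a checksum for every file in the catalog.
manifest=$(mktemp)
(cd "$src" && find . -type f -exec sha256sum {} + | sort -k 2) > "$manifest"

# 2. Copy the catalog to the target.
cp -r "$src"/. "$dst"/

# 3. On the target host: verify every file against the manifest.
if (cd "$dst" && sha256sum -c --quiet "$manifest"); then
    copy_status=verified
else
    copy_status=mismatch
fi
echo "copy $copy_status"
```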

3. Run the sys_rman restore

1) Restore the backup into the data directory

# Back up the test database's data directory
[kingbase@node102 c290]$ cd /opt/Kingbase/ES/C290/
[kingbase@node102 C290]$ mv data data.bk

# Create the data directory and set its permissions
[kingbase@node102 bin]$ mkdir -p /opt/Kingbase/ES/C290/data
[kingbase@node102 bin]$ chmod 700 /opt/Kingbase/ES/C290/data
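
The restore target here is a fresh, empty directory with mode 700. A minimal sketch of a pre-restore check follows; it assumes, as in PostgreSQL-derived servers, that the data directory should be accessible to its owner only. `check_data_dir` is a hypothetical helper, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch only: check_data_dir is a hypothetical helper, not part of KingbaseES.
check_data_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "missing"; return 1; }
    # ls -A prints nothing for an empty directory.
    [ -z "$(ls -A "$dir")" ] || { echo "not empty"; return 1; }
    mode=$(stat -c '%a' "$dir")   # GNU stat; prints the octal mode, e.g. 700
    [ "$mode" = "700" ] || { echo "mode $mode"; return 1; }
    echo "ok"
}

# Demo on a temporary directory instead of /opt/Kingbase/ES/C290/data.
demo=$(mktemp -d)/data
mkdir -p "$demo"
chmod 700 "$demo"
result=$(check_data_dir "$demo")
echo "$result"
```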

# View the backup information on the test host
[kingbase@node102 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ show
==========================================================================================================
ID       Recovery time        Mode          Current/Parent TLI  Time            Data  start_lsn  stop_lsn Status
==========================================================================================================
RTATU1   2023-04-18 14:54:03  PAGE           1 / 0              2s           628kB  0/9000028  0/A000078  OK
RTATKP   2023-04-18 14:48:27  FULL           1 / 0              2s            80MB  0/3000028  0/3000130  OK

# Run sys_rman restore
[kingbase@node102 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ restore
INFO: validate: RTATKP backup and archive log files by SIZE
INFO: validate: RTATU1 backup and archive log files by SIZE
INFO: restore complete. Recovery starts automatically when the Kingbase server is started.

2) Start the test instance to perform recovery

# Start the database instance
[kingbase@node102 bin]$ ./sys_ctl start -D ../../data
server starting
.......

# Check the sys_log log
[kingbase@node102 sys_log]$ tail -1000 kingbase-2023-04-18_154713.log
2023-04-18 15:47:13 CST LOG:  database system was interrupted; last known up at 2023-04-18 14:54:01 CST
2023-04-18 15:47:13 CST LOG:  creating missing WAL directory "sys_xlog/archive_status"
2023-04-18 15:47:13 CST LOG:  starting archive recovery
2023-04-18 15:47:13 CST LOG:  restored log file "000000010000000000000009" from archive
2023-04-18 15:47:13 CST LOG:  redo starts at 0/9000028
2023-04-18 15:47:13 CST LOG:  redo wal segment count 1
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000A" from archive
2023-04-18 15:47:13 CST LOG:  consistent recovery state reached at 0/A000078
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000B" from archive
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000C" from archive
cp: cannot stat ‘/data/kingbase/bk/c290//wal/00000001000000000000000D’: No such file or directory
2023-04-18 15:47:13 CST LOG:  complete: 1/1
2023-04-18 15:47:13 CST LOG:  redo done at 0/C0000D0
2023-04-18 15:47:13 CST LOG:  last completed transaction was at log time 2023-04-18 14:54:03.704661+08
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000C" from archive
cp: cannot stat ‘/data/kingbase/bk/c290//wal/00000002.history’: No such file or directory
2023-04-18 15:47:13 CST LOG:  selected new timeline ID: 2
2023-04-18 15:47:13 CST LOG:  archive recovery complete
cp: cannot stat ‘/data/kingbase/bk/c290//wal/00000001.history’: No such file or directory
2023-04-18 15:47:13 CST LOG:  MultiXact member wraparound protections are now enabled
2023-04-18 15:47:13 CST LOG:  autovacuum launcher started
2023-04-18 15:47:13 CST LOG:  database system is ready to accept connections
2023-04-18 15:47:13 CST LOG:  starting syslogical supervisor
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database TEST
2023-04-18 15:47:13 CST LOG:  manager worker [11755] at slot 0 generation 1 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database TEMPLATE1
2023-04-18 15:47:13 CST LOG:  manager worker [11757] at slot 0 generation 2 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database TEMPLATE2
2023-04-18 15:47:13 CST LOG:  manager worker [11758] at slot 0 generation 3 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database SAMPLES
2023-04-18 15:47:13 CST LOG:  manager worker [11759] at slot 0 generation 4 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database SECURITY
2023-04-18 15:47:13 CST LOG:  manager worker [11760] at slot 0 generation 5 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database prod
2023-04-18 15:47:13 CST LOG:  manager worker [11761] at slot 0 generation 6 detaching cleanly
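
In a log like this, a handful of messages tell you whether recovery finished. The `cp: cannot stat` lines are what one would expect when the server probes for a segment or timeline history file that was never archived; they mark the end of the available WAL rather than a failed recovery (an interpretation based on how PostgreSQL-style archive recovery behaves, not something the log itself states). A minimal sketch that checks for the milestones, using a few lines from the log above as embedded sample input:

```shell
#!/bin/sh
# Sketch only: a few lines from the log above are embedded as sample input
# (the quotes around the cp path are dropped for simplicity).
log='2023-04-18 15:47:13 CST LOG:  starting archive recovery
2023-04-18 15:47:13 CST LOG:  redo starts at 0/9000028
2023-04-18 15:47:13 CST LOG:  consistent recovery state reached at 0/A000078
cp: cannot stat /data/kingbase/bk/c290//wal/00000001000000000000000D: No such file or directory
2023-04-18 15:47:13 CST LOG:  redo done at 0/C0000D0
2023-04-18 15:47:13 CST LOG:  selected new timeline ID: 2
2023-04-18 15:47:13 CST LOG:  archive recovery complete
2023-04-18 15:47:13 CST LOG:  database system is ready to accept connections'

missing=0
for milestone in \
    "starting archive recovery" \
    "consistent recovery state reached" \
    "archive recovery complete" \
    "database system is ready to accept connections"
do
    if printf '%s\n' "$log" | grep -qF "$milestone"; then
        echo "found:   $milestone"
    else
        echo "missing: $milestone"
        missing=$((missing + 1))
    fi
done

# Pull out the LSN at which replay stopped.
redo_done=$(printf '%s\n' "$log" | sed -n 's|.*redo done at \([0-9A-F/]*\).*|\1|p')
echo "redo done at: $redo_done"
```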

III. Connect to the test database

[kingbase@node102 bin]$ ./ksql -U system -W 123456 test
ksql (V008R003C002B0290)
Type "help" for help.

test=# \l
                               List of databases
   Name    | Owner  | Encoding |   Collate   |    Ctype    | Access privileges
-----------+--------+----------+-------------+-------------+--------------------
 prod      | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
 SAMPLES   | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
 SECURITY  | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
 TEMPLATE0 | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/SYSTEM         +
           |        |          |             |             | SYSTEM=CTcb/SYSTEM
 TEMPLATE1 | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/SYSTEM         +
           |        |          |             |             | SYSTEM=CTcb/SYSTEM
 TEMPLATE2 | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =Tc/SYSTEM        +
           |        |          |             |             | SYSTEM=CTcb/SYSTEM
 TEST      | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
(7 rows)

test=# \c prod
You are now connected to database "prod" as user "system".
prod=# \d
                    List of relations
 Schema |             Name              | Type  | Owner
--------+-------------------------------+-------+--------
 PUBLIC | pathman_cache_stats           | view  | SYSTEM
 PUBLIC | pathman_concurrent_part_tasks | view  | SYSTEM
 PUBLIC | pathman_config                | table | SYSTEM
 PUBLIC | pathman_config_params         | table | SYSTEM
 PUBLIC | pathman_partition_list        | view  | SYSTEM
 PUBLIC | t1                            | table | SYSTEM
 PUBLIC | t2                            | table | SYSTEM
(7 rows)

prod=# select count(*) from t1;
 count
-------
 10000
(1 row)

prod=# select count(*) from t2;
 count
-------
 10000
(1 row)

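As shown above, the test database has been restored to the most recent backup point.

IV. Summary
sys_rman physical backups can be restored on a different host, and the procedure is fairly simple. Alternatively, the production backup directory can be exported over NFS and mounted in the test environment, which avoids copying the backup from the production host to the test host.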

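KingbaseES R3 Cluster Case: Running a sys_backup.sh Physical Backup on the Standby

Case description:
Later builds of KingbaseES R3 support running sys_rman physical backups through sys_backup.sh, which in practice calls the sys_rman_v6 tool. In this case the cluster is backed up from the standby: the repo directory is on the standby and the backup runs in cluster mode.

Database version: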

TEST=# select version();
                                                         VERSION                                                         
-------------------------------------------------------------------------------------------------------------------------
 Kingbase V008R003C002B0270 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

Cluster architecture:

# Node information
[kingbase@node1 ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.7.248   node1      # standby
192.168.7.249   node2
192.168.7.243   node3      # primary

[kingbase@node3 bin]$ ./ksql -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0270)
Type "help" for help.

TEST=# show pool_nodes;
 node_id |   hostname    | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.7.243 | 54321 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.7.248 | 54321 | up     | 0.500000  | standby | 0          | false             | 0
(2 rows)

TEST=# select * from sys_stat_replication;
 PID  | USESYSID | USENAME | APPLICATION_NAME |  CLIENT_ADDR  | CLIENT_HOSTNAME | CLIENT_PORT |         BACKEND_START         | BACKEND_XMIN |   STATE   | SENT_LOCATION | WRITE_LOCATION | FLUSH_LOCATION | REPLAY_LOCATION | SYNC_PRIORITY | SYNC_STATE 
------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+---
 4222 |       10 | SYSTEM  | node248          | 192.168.7.248 |                 |       30369 | 2021-03-01 12:16:20.511199+08 |              | streaming | 0/9000178     | 0/9000178      | 0/9000178      | 0/9000178       |             2 | sync
(1 row)

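I. Check the archive configuration on the cluster hosts

=== The output below shows that in newer builds of KingbaseES R3, the standby also produces archive logs when its WAL switches. ===

1. Archive configuration on the primary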

[kingbase@node3 bin]$ cat ../data/kingbase.conf |grep archive_command
archive_command='/home/kingbase/cluster/kha/db/bin/sys_rman_v6 --config /home/kingbase/kbbr3_repo/sys_rman_v6.conf --stanza=kingbase archive-push %p'
                                # ! waring: if set archive_dest,  ignore archive_command.

2. Archive configuration on the standby

[kingbase@node1 bin]$ cat ../data/kingbase.conf|grep -i archive_command
archive_command='/home/kingbase/cluster/kha/db/bin/sys_rman_v6 --config /home/kingbase/kbbr3_repo/sys_rman_v6.conf --stanza=kingbase archive-push %p'
                                # ! waring: if set archive_dest,  ignore archive_command.

II. Configure sys_backup.conf on the standby

[kingbase@node1 bin]$ cat sys_backup.conf |grep -v ^#|grep -v ^$
_target_db_
_one_db_ip="192.168.7.243" 
_repo_ip="192.168.7.248"
_stanza_name="kingbase"  
_os_user_name="kingbase" 
_repo_path="/home/kingbase/kbbr3_repo"
_repo_retention_full_count=5 
_crond_full_days=7  
_crond_diff_days=0 
_crond_incr_days=1
_crond_full_hour=2 
_crond_diff_hour=3 
_crond_incr_hour=4 
_os_ip_cmd="/sbin/ip"
_os_rm_cmd="/bin/rm"
_os_sed_cmd="/bin/sed"
_os_grep_cmd="/bin/grep"
_single_data_dir="/home/kingbase/ES/V8_single/data"
_single_bin_dir="/home/kingbase/ES/V8_single/Server/bin"
_single_db_user="system"
_single_db_port="54321"
_kb_pass="S0lOR0JBU0VBRE1JTg=="

III. Initialize the backup

[kingbase@node1 bin]$ ./sys_backup.sh init
# generate local sys_rman_v6.conf...DONE
# update all node: sys_rman_v6.conf and archive_command with sys_rman_v6.archive-push...
# update all node: sys_rman_v6.conf and archive_command with sys_rman_v6.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
# create stanza and check...DONE
# initial first full backup...(maybe several minutes)
# initial first full backup...DONE
# Initial sys_rman_v6 OK.
'sys_backup.sh start' should be executed when need back-rest feature.

IV. View the repo directory and backup information

[kingbase@node1 ~]$ cd kbbr3_repo/
[kingbase@node1 kbbr3_repo]$ ls -lh
total 4.0K
drwxr-x--- 3 kingbase kingbase  21 Mar  1 12:26 archive
drwxr-x--- 3 kingbase kingbase  21 Mar  1 12:26 backup
-rw-rw-r-- 1 kingbase kingbase 589 Mar  1 12:26 sys_rman_v6.conf
[kingbase@node1 kbbr3_repo]$ ls -lh archive/
total 0
drwxr-x--- 3 kingbase kingbase 61 Mar  1 12:26 kingbase
[kingbase@node1 kbbr3_repo]$ ls -lh backup/
total 0
drwxr-x--- 4 kingbase kingbase 104 Mar  1 12:29 kingbase
[kingbase@node1 kbbr3_repo]$ ls -lh backup/kingbase/
total 8.0K
drwxr-x--- 3 kingbase kingbase   69 Mar  1 12:29 20210301-122305F
drwxr-x--- 3 kingbase kingbase   17 Mar  1 12:29 backup.history
-rw-r----- 1 kingbase kingbase 1.1K Mar  1 12:29 backup.info
-rw-r----- 1 kingbase kingbase 1.1K Mar  1 12:29 backup.info.copy
lrwxrwxrwx 1 kingbase kingbase   16 Mar  1 12:29 latest -> 20210301-122305F
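
The backup directory name appears to encode when the backup started and its type. The sketch below assumes labels follow a YYYYMMDD-HHMMSS pattern with a trailing `F` for a full backup, which is only inferred from the single label in the listing above; `describe_label` is a hypothetical helper, and any label that does not match that pattern yields no output.

```shell
#!/bin/sh
# Sketch only: assumes labels look like YYYYMMDD-HHMMSS followed by F,
# as the single label in the listing above suggests.
# describe_label is a hypothetical helper, not part of sys_rman_v6.
describe_label() {
    printf '%s\n' "$1" | sed -n \
        's|^\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)-\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)F$|\1-\2-\3 \4:\5:\6 full|p'
}

described=$(describe_label 20210301-122305F)
echo "$described"
```

For the label above this prints `2021-03-01 12:23:05 full`, which fits the directory timestamps in the listing (the files were last written at 12:29 on Mar 1).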

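V. Test archiving on a log switch

=== The test below shows that in newer builds of KingbaseES R3, when the primary switches its log the WAL is archived; the standby's WAL switches at the same time and is archived as well. ===

1. Switch the log on the primary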

TEST=#  select sys_switch_xlog();
 SYS_SWITCH_XLOG 
-----------------
 0/B000238
(1 row)

2. Log information on the standby before and after the switch

1) Before the switch

2) After the switch

3) The standby's log is archived

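VI. Summary
In newer builds of KingbaseES R3, sys_rman physical backups can be run through the sys_backup.sh script, which actually calls the sys_rman_v6 tool; the backup mechanism is presumably the same as that of sys_rman in KingbaseES R6.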
