Getting Started with TiDB: Physical Backup and Restore
Posted by 顶尖高手养成计划
Introduction
Snapshot backup is an implementation of full cluster backup. Based on TiDB's multi-version concurrency control (MVCC), it backs up all data contained in the specified snapshot to the target storage. The size of the backed-up data is roughly the size of the cluster's (compressed) single-replica data. After the backup completes, you can restore the snapshot backup on an empty cluster, or on a cluster with no conflicting data (no schema or table of the same name), bringing the cluster back to the data state at the time of the snapshot; the restore then creates replicas according to the target cluster's replica settings.
TiDB Backup and Restore Overview | PingCAP Docs
Full backup operations
Prerequisites
First, make sure a cluster is up and running:
[root@master ~]# tiup cluster display tidb-deploy
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster display tidb-deploy
Cluster type: tidb
Cluster name: tidb-deploy
Cluster version: v6.5.1
Deploy user: root
SSH type: builtin
Dashboard URL: http://192.168.66.20:2379/dashboard
Grafana URL: http://192.168.66.20:3000
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
-- ---- ---- ----- ------- ------ -------- ----------
192.168.66.20:9093 alertmanager 192.168.66.20 9093/9094 linux/x86_64 Up /tidb-data/alertmanager-9093 /tidb-deploy/alertmanager-9093
192.168.66.20:3000 grafana 192.168.66.20 3000 linux/x86_64 Up - /tidb-deploy/grafana-3000
192.168.66.10:2379 pd 192.168.66.10 2379/2380 linux/x86_64 Up /tidb-data/pd-2379 /tidb-deploy/pd-2379
192.168.66.20:2379 pd 192.168.66.20 2379/2380 linux/x86_64 Up|UI /tidb-data/pd-2379 /tidb-deploy/pd-2379
192.168.66.21:2379 pd 192.168.66.21 2379/2380 linux/x86_64 Up|L /tidb-data/pd-2379 /tidb-deploy/pd-2379
192.168.66.20:9090 prometheus 192.168.66.20 9090/12020 linux/x86_64 Up /tidb-data/prometheus-9090 /tidb-deploy/prometheus-9090
192.168.66.10:4000 tidb 192.168.66.10 4000/10080 linux/x86_64 Up - /tidb-deploy/tidb-4000
192.168.66.10:9000 tiflash 192.168.66.10 9000/8123/3930/20170/20292/8234 linux/x86_64 Up /tidb-data/tiflash-9000 /tidb-deploy/tiflash-9000
192.168.66.10:20160 tikv 192.168.66.10 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
192.168.66.20:20160 tikv 192.168.66.20 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
192.168.66.21:20160 tikv 192.168.66.21 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
Prepare some data
# Create some data first
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
create table temptable(name varchar(10));
insert into temptable values('1');
insert into temptable values('1');
insert into temptable values('1');
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
+------+
3 rows in set (0.00 sec)
Run a full backup with br
1. Install the backup tool and run a full backup.
# Install the br backup tool
tiup install br
# --backupts '2022-09-08 13:30:00' specifies the backup snapshot time; if omitted, the snapshot time is the time the command is executed
tiup br backup full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp" \
--ratelimit 128 \
--log-file backupfull.log
Output:
[root@master ~]# tiup br backup full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp" \
> --ratelimit 128 \
> --log-file backupfull.log
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br backup full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp --ratelimit 128 --log-file backupfull.log
Detail BR log in backupfull.log
Full Backup <-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 09:32:19.052 +08:00] [INFO] [collector.go:73] ["Full Backup success summary"] [total-ranges=17] [ranges-succeed=17] [ranges-failed=0] [backup-checksum=14.147284ms] [backup-fast-checksum=3.783683ms] [backup-total-ranges=80] [backup-total-regions=80] [total-take=4.551226548s] [BackupTS=440393304777424899] [total-kv=1227] [total-kv-size=294.5kB] [average-speed=64.71kB/s] [backup-data-size(after-compressed)=78.38kB] [Size=78375]
2. Check the snapshot time of the backup.
# Check the backup timestamp of the backup files
tiup br validate decode --field="end-version" \
--storage "local:///tmp/datatemp" | tail -n1
The output is as follows; this TSO corresponds to the physical time 2022-09-08 13:30:00 +0800 CST:
435844546560000000
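The printed value is a TiDB TSO; its high bits (everything above the low 18 bits) encode a Unix timestamp in milliseconds. A small sketch of the conversion (the helper name is ours, not part of BR):

```python
from datetime import datetime, timedelta, timezone

def tso_to_datetime(tso: int, tz_hours: int = 8) -> datetime:
    """Convert a TiDB TSO to wall-clock time: physical ms = tso >> 18."""
    physical_ms = tso >> 18
    return datetime.fromtimestamp(physical_ms / 1000, timezone(timedelta(hours=tz_hours)))

# The TSO above decodes to the stated physical time.
print(tso_to_datetime(435844546560000000))  # 2022-09-08 13:30:00+08:00
# The BackupTS logged by the earlier backup run decodes to 2023-03-28.
print(tso_to_datetime(440393304777424899).date())
```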
3. Make sure every machine has the complete set of backup files.
# On 192.168.66.10: copy its backup data to 192.168.66.20
scp -r /tmp/datatemp root@192.168.66.20:/tmp
# On 192.168.66.21: copy its backup data to 192.168.66.20
scp -r /tmp/datatemp root@192.168.66.20:/tmp
# On 192.168.66.20: copy the merged backup data to 192.168.66.10 and 192.168.66.21
scp -r /tmp/datatemp root@192.168.66.10:/tmp
scp -r /tmp/datatemp root@192.168.66.21:/tmp
The backup directory on each node now looks like this:
[root@node2 datatemp]# ll
total 212
drwxr-xr-x 2 root root 4096 Mar 28 09:32 1
drwxr-xr-x 2 root root 210 Mar 28 09:44 4
drwxr-xr-x 2 root root 4096 Mar 28 09:44 5
-rw-r--r-- 1 root root 78 Mar 28 09:44 backup.lock
-rw-r--r-- 1 root root 196315 Mar 28 09:44 backupmeta
-rw-r--r-- 1 root root 7898 Mar 28 09:44 checkpoint.meta
drwxrwxrwx 2 root root 29 Mar 28 09:44 checkpoints
4. Delete the existing data, then run the restore.
# Connect to the database and drop the data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;
mysql> drop database test;
Query OK, 0 rows affected (0.39 sec)
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| INFORMATION_SCHEMA |
| METRICS_SCHEMA |
| PERFORMANCE_SCHEMA |
| mysql |
+--------------------+
4 rows in set (0.00 sec)
exit;
Run the restore and check the data
# Run the restore
tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"
# Output:
[root@master datatemp]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T09.50.34+0800
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 09:50:37.983 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=336.844µs] [restore-ranges=3] [total-take=3.072937062s] [restore-data-size(after-compressed)=5.104kB] [Size=5104] [BackupTS=440393304777424899] [RestoreTS=440393592991907878] [total-kv=5] [total-kv-size=347B] [average-speed=112.9B/s]
# Check the data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
+------+
3 rows in set (0.02 sec)
5. If data is written after a full backup and a full restore is then executed, what happens: is the cluster restored to the snapshot data, or are the two data sets merged?
Prepare the data:
# Drop the table created earlier
drop table temptable;
create table temptable1(name varchar(10));
insert into temptable1 values('1');
create table temptable(name varchar(10));
insert into temptable values('2');
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| temptable |
| temptable1 |
+----------------+
2 rows in set (0.00 sec)
mysql> select * from temptable;
+------+
| name |
+------+
| 2 |
+------+
1 row in set (0.00 sec)
Now restore the snapshot backup taken earlier.
tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"
The restore fails:
[root@master datatemp]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T09.57.03+0800
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 09:57:05.519 +08:00] [INFO] [collector.go:73] ["Full Restore failed summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=241.554µs] [restore-ranges=3]
Error: failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch
Checking the database shows that only part of the data was restored:
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 2 |
| 1 |
| 1 |
+------+
3 rows in set (0.01 sec)
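The failure above is the checksum check: BR compares the restored tables against checksums recorded at backup time, and the pre-existing row '2' makes them disagree. A toy illustration of the idea (not BR's actual algorithm, which checksums TiKV key ranges rather than row lists):

```python
import hashlib

def table_checksum(rows):
    """Order-insensitive checksum over a table's rows (illustrative only)."""
    total = 0
    for row in rows:
        total += int.from_bytes(hashlib.sha256(repr(row).encode()).digest()[:8], "big")
    return total % (1 << 64)

backed_up = ["1", "1", "1"]       # rows captured by the snapshot backup
after_restore = ["2", "1", "1"]   # row '2' already in the table survived the restore

# The checksums disagree, which is why BR reports ErrRestoreChecksumMismatch.
print(table_checksum(backed_up) == table_checksum(after_restore))  # False
```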
After dropping the test database and restoring again, the restore succeeds:
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;
tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"
[root@master datatemp]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T10.01.10+0800
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 10:01:12.129 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=255.166µs] [restore-ranges=3] [total-take=2.085316769s] [restore-data-size(after-compressed)=5.104kB] [Size=5104] [BackupTS=440393304777424899] [RestoreTS=440393759492407297] [total-kv=5] [total-kv-size=347B] [average-speed=166.4B/s]
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
+------+
3 rows in set (0.01 sec)
6. Destroy the cluster, then restore the data.
# Destroy the cluster
tiup cluster destroy tidb-deploy
# Redeploy
tiup cluster deploy tidb-deploy v6.5.1 ./topology.yaml --user root -p
# Check the cluster status and start it
tiup cluster display tidb-deploy
tiup cluster start tidb-deploy
# Restore the data
tiup br restore full --pd "192.168.66.10:2379" \
--storage "local:///tmp/datatemp"
# Check the data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
+------+
3 rows in set (0.00 sec)
Conclusion:
Before restoring a snapshot backup, stop writes to the cluster, and run the snapshot restore only after all conflicting data has been removed.
Restoring a single database
1. Drop the database.
drop database test;
exit;
2. Restore the specified database.
tiup br restore db --pd "192.168.66.10:2379" \
--db "test" \
--storage "local:///tmp/datatemp"
3. Verify.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| temptable |
+----------------+
1 row in set (0.01 sec)
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
+------+
3 rows in set (0.01 sec)
Restoring a single table
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop table temptable;
show tables;
mysql> drop table temptable;
Query OK, 0 rows affected (0.32 sec)
mysql> show tables;
Empty set (0.00 sec)
exit;
tiup br restore table --pd "192.168.66.10:2379" \
--db "test" \
--table "temptable" \
--storage "local:///tmp/datatemp"
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql>
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
+------+
3 rows in set (0.00 sec)
Restoring tables under the mysql schema
Since BR v5.1.0, snapshot backups include the system table data under the mysql schema, but this data is not restored by default. Since BR v6.2.0, if --with-sys-table is set, the restore also restores some of the system table data.
TiDB Snapshot Backup and Restore Guide | PingCAP Docs
Incremental (log) backup
Check the data at the start of this experiment:
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
Database changed
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
+------+
3 rows in set (0.00 sec)
Start incremental log backup
1. Start the task.
# Start the log backup task
tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-backup"
Output:
[root@master ~]# tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
> --storage="local:///tmp/datatemplog-backup"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log start --task-name=pitr --pd=192.168.66.10:2379 --storage=local:///tmp/datatemplog-backup
Detail BR log in /tmp/br.log.2023-03-28T10.37.15+0800
[2023/03/28 10:37:15.931 +08:00] [INFO] [collector.go:73] ["log start"] [streamTaskInfo="taskName=pitr,startTs=440394327217668097,endTS=999999999999999999,tableFilter=*.*"] [pausing=false] [rangeCount=2]
[2023/03/28 10:37:18.046 +08:00] [INFO] [collector.go:73] ["log start success summary"] [total-ranges=0] [ranges-succeed=0] [ranges-failed=0] [backup-checksum=2.666236ms] [total-take=2.35706338s]
2. Check the task.
tiup br log status --task-name=pitr --pd="192.168.66.10:2379"
Output:
[root@master ~]# tiup br log status --task-name=pitr --pd="192.168.66.10:2379"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log status --task-name=pitr --pd=192.168.66.10:2379
Detail BR log in /tmp/br.log.2023-03-28T10.38.23+0800
● Total 1 Tasks.
> #1 <
name: pitr
status: ● NORMAL
start: 2023-03-28 10:37:15.834 +0800
end: 2090-11-18 22:07:45.624 +0800
storage: local:///tmp/datatemplog-backup
speed(est.): 0.00 ops/s
checkpoint[global]: 2023-03-28 10:37:15.834 +0800; gap=1m8s
3. Insert some new data.
# Insert some new data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
insert into temptable values('2');
select * from temptable;
mysql> insert into temptable values('2');
Query OK, 1 row affected (0.01 sec)
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
| 2 |
+------+
4 rows in set (0.00 sec)
Point-in-time recovery (PITR)
TiDB Log Backup and PITR Guide | PingCAP Docs
1. Destroy the cluster to simulate a failure.
# Destroy the cluster
tiup cluster destroy tidb-deploy
# Redeploy
tiup cluster deploy tidb-deploy v6.5.1 ./topology.yaml --user root -p
# Check the cluster status and start it
tiup cluster display tidb-deploy
tiup cluster start tidb-deploy
2. Check the data.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
mysql> use test;
Database changed
mysql> select * from temptable;
ERROR 1146 (42S02): Table 'test.temptable' doesn't exist
3. Prepare to restore the data.
# This step is unnecessary with shared/distributed backup storage
cd /tmp/datatemplog-backup
# Copy the incremental data to the other machines
scp -r /tmp/datatemplog-backup root@192.168.66.20:/tmp
scp -r /tmp/datatemplog-backup root@192.168.66.21:/tmp
4. Run the restore.
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-backup" \
--full-backup-storage="local:///tmp/datatemp"
Output:
[root@master ~]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/datatemplog-backup" \
> --full-backup-storage="local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/datatemplog-backup --full-backup-storage=local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T10.57.44+0800
Error: the full snapshot(from cluster ID:7215396801192470494) and log(from cluster ID:7215412685566300640) come from different cluster.: [BR:Common:ErrInvalidArgument]invalid argument
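Each backup records the ID of the cluster that produced it, and BR refuses to combine a snapshot and a log backup from different clusters. A minimal sketch of that precondition check (our own code, not BR's source), using the IDs from the error above:

```python
def check_same_cluster(snapshot_cluster_id: int, log_cluster_id: int) -> None:
    """PITR precondition: snapshot and log backup must come from the same cluster."""
    if snapshot_cluster_id != log_cluster_id:
        raise ValueError(
            f"the full snapshot(from cluster ID:{snapshot_cluster_id}) and "
            f"log(from cluster ID:{log_cluster_id}) come from different cluster."
        )

# The snapshot was taken before the cluster was destroyed and redeployed,
# so its cluster ID no longer matches the one that produced the log backup.
try:
    check_same_cluster(7215396801192470494, 7215412685566300640)
except ValueError as e:
    print("rejected:", e)
```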
Conclusion:
After destroying the cluster and creating a new one, this kind of PITR restore fails, because the full snapshot and the log backup come from different clusters.
A plain snapshot restore, however, succeeds:
[root@master ~]# tiup br restore full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore full --pd 192.168.66.10:2379 --storage local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T11.00.13+0800
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 11:00:17.001 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=199.397µs] [restore-ranges=3] [total-take=3.107023133s] [RestoreTS=440394688506888228] [total-kv=5] [total-kv-size=347B] [average-speed=111.7B/s] [restore-data-size(after-compressed)=5.104kB] [Size=5104] [BackupTS=440393304777424899]
(Key point) If the full backup is from a previous cluster (restore fails)
# Start an incremental log backup
tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-nihao"
tiup br log status --task-name=pitr --pd="192.168.66.10:2379"
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log status --task-name=pitr --pd=192.168.66.10:2379
Detail BR log in /tmp/br.log.2023-03-28T11.05.00+0800
● Total 1 Tasks.
> #1 <
name: pitr
status: ● NORMAL
start: 2023-03-28 11:04:51.492 +0800
end: 2090-11-18 22:07:45.624 +0800
storage: local:///tmp/datatemplog-nihao
speed(est.): 0.00 ops/s
checkpoint[global]: 2023-03-28 11:04:51.492 +0800; gap=10s
# Restore data to a specified point in time
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-backup" \
--full-backup-storage="local:///tmp/datatemp"
--restored-ts '2022-05-15 18:00:00+0800'
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('2');
insert into temptable values('3');
insert into temptable values('4');
select * from temptable;
mysql> insert into temptable values('4');
Query OK, 1 row affected (0.00 sec)
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 1 |
| 1 |
| 2 |
| 3 |
| 4 |
+------+
6 rows in set (0.01 sec)
use test;
delete from temptable;
select * from temptable;
# Make sure every machine has the complete data
scp -r /tmp/datatemplog-nihao root@192.168.66.20:/tmp
scp -r /tmp/datatemplog-nihao root@192.168.66.21:/tmp
# Run the restore
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-nihao" \
--full-backup-storage="local:///tmp/datatemp"
--restored-ts '2023-03-28 11:11:07+0800'
[root@master ~]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/datatemplog-nihao" \
> --full-backup-storage="local:///tmp/datatemp"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/datatemplog-nihao --full-backup-storage=local:///tmp/datatemp
Detail BR log in /tmp/br.log.2023-03-28T11.16.52+0800
Error: failed to check task exits: log backup task is running: pitr, please stop the task before restore, and after PITR operation finished, create log-backup task again and create a full backup on this cluster
# Check the status
tiup br log status --task-name=pitr --pd="192.168.66.10:2379"
# Stop the task
tiup br log stop --task-name=pitr
# Then run the restore again (note: the continuation backslashes are missing below, so only --storage actually takes effect)
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/datatemplog-nihao"
--full-backup-storage="local:///tmp/datatemp"
--restored-ts '2023-03-28 11:11:07+0800'
> --storage="local:///tmp/datatemplog-nihao"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/datatemplog-nihao
Detail BR log in /tmp/br.log.2023-03-28T11.19.56+0800
Error: restore log from 0(1970-01-01 08:00:00 +0800 CST) to 440394761238478850(2023-03-28 11:04:51.492 +0800 CST), but the current existed log from 440394761238478850(2023-03-28 11:04:51.492 +0800 CST) to 440394761238478850(2023-03-28 11:04:51.492 +0800 CST): [BR:Common:ErrInvalidArgument]invalid argument
Conclusion:
If the full backup data comes from the previous cluster while the incremental data was produced later, the PITR restore fails.
Data experiment
1. Check the cluster health.
# Check the cluster; the output below shows it is healthy
[root@master ~]# tiup cluster display tidb-deploy
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster display tidb-deploy
Cluster type: tidb
Cluster name: tidb-deploy
Cluster version: v6.5.1
Deploy user: root
SSH type: builtin
Dashboard URL: http://192.168.66.20:2379/dashboard
Grafana URL: http://192.168.66.20:3000
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
-- ---- ---- ----- ------- ------ -------- ----------
192.168.66.20:9093 alertmanager 192.168.66.20 9093/9094 linux/x86_64 Up /tidb-data/alertmanager-9093 /tidb-deploy/alertmanager-9093
192.168.66.20:3000 grafana 192.168.66.20 3000 linux/x86_64 Up - /tidb-deploy/grafana-3000
192.168.66.10:2379 pd 192.168.66.10 2379/2380 linux/x86_64 Up|L /tidb-data/pd-2379 /tidb-deploy/pd-2379
192.168.66.20:2379 pd 192.168.66.20 2379/2380 linux/x86_64 Up|UI /tidb-data/pd-2379 /tidb-deploy/pd-2379
192.168.66.21:2379 pd 192.168.66.21 2379/2380 linux/x86_64 Up /tidb-data/pd-2379 /tidb-deploy/pd-2379
192.168.66.20:9090 prometheus 192.168.66.20 9090/12020 linux/x86_64 Up /tidb-data/prometheus-9090 /tidb-deploy/prometheus-9090
192.168.66.10:4000 tidb 192.168.66.10 4000/10080 linux/x86_64 Up - /tidb-deploy/tidb-4000
192.168.66.10:9000 tiflash 192.168.66.10 9000/8123/3930/20170/20292/8234 linux/x86_64 Up /tidb-data/tiflash-9000 /tidb-deploy/tiflash-9000
192.168.66.10:20160 tikv 192.168.66.10 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
192.168.66.20:20160 tikv 192.168.66.20 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
192.168.66.21:20160 tikv 192.168.66.21 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
Total nodes: 11
2. Prepare the data.
# Insert the prepared data
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('2');
insert into temptable values('3');
insert into temptable values('4');
select * from temptable;
mysql> select * from temptable;
+------+
| name |
+------+
| 2 |
| 3 |
| 4 |
+------+
3 rows in set (0.01 sec)
3. Run a full backup.
mkdir /tmp/full_data
chmod 777 /tmp/full_data
tiup br backup full --pd "192.168.66.10:2379" \
--storage "local:///tmp/full_data" \
--ratelimit 128 \
--log-file backupfull.log
# Output:
[root@master ~]# tiup br backup full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/full_data" \
> --ratelimit 128 \
> --log-file backupfull.log
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br backup full --pd 192.168.66.10:2379 --storage local:///tmp/full_data --ratelimit 128 --log-file backupfull.log
Detail BR log in backupfull.log
Full Backup <-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 12:26:18.986 +08:00] [INFO] [collector.go:73] ["Full Backup success summary"] [total-ranges=19] [ranges-succeed=19] [ranges-failed=0] [backup-checksum=13.20605ms] [backup-fast-checksum=4.260555ms] [backup-total-regions=80] [backup-total-ranges=80] [total-take=4.306455012s] [BackupTS=440396021403222019] [total-kv=1266] [total-kv-size=359.4kB] [average-speed=83.45kB/s] [backup-data-size(after-compressed)=86.52kB] [Size=86517]
4. Start the incremental log backup.
tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
--storage="local:///tmp/full_data-upload"
# Output:
[root@master ~]# tiup br log start --task-name=pitr --pd="192.168.66.10:2379" \
> --storage="local:///tmp/full_data-upload"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log start --task-name=pitr --pd=192.168.66.10:2379 --storage=local:///tmp/full_data-upload
Detail BR log in /tmp/br.log.2023-03-28T12.27.46+0800
[2023/03/28 12:27:46.499 +08:00] [INFO] [collector.go:73] ["log start"] [streamTaskInfo="taskName=pitr,startTs=440396065393344515,endTS=999999999999999999,tableFilter=*.*"] [pausing=false] [rangeCount=2]
[2023/03/28 12:27:48.681 +08:00] [INFO] [collector.go:73] ["log start success summary"] [total-ranges=0] [ranges-succeed=0] [ranges-failed=0] [backup-checksum=1.447661ms] [total-take=2.360156816s]
5. Insert some new data.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('100');
insert into temptable values('300');
insert into temptable values('400');
select * from temptable;
# Output:
mysql> select * from temptable;
+------+
| name |
+------+
| 2 |
| 3 |
| 4 |
| 100 |
| 300 |
| 400 |
+------+
6 rows in set (0.00 sec)
6. Distribute the backup data. (The steps below give every machine the complete incremental and full data; without a distributed file system, each machine holds only part of the data.)
# Stop the incremental log backup first
tiup br log stop --task-name=pitr
# On 192.168.66.10: copy the backup data to 192.168.66.20
scp -r /tmp/full_data root@192.168.66.20:/tmp
scp -r /tmp/full_data-upload root@192.168.66.20:/tmp
# On 192.168.66.21: copy the backup data to 192.168.66.20
scp -r /tmp/full_data root@192.168.66.20:/tmp
scp -r /tmp/full_data-upload root@192.168.66.20:/tmp
# On 192.168.66.20: copy the merged data to the other nodes
scp -r /tmp/full_data root@192.168.66.10:/tmp
scp -r /tmp/full_data root@192.168.66.21:/tmp
scp -r /tmp/full_data-upload root@192.168.66.10:/tmp
scp -r /tmp/full_data-upload root@192.168.66.21:/tmp
7. Delete the current data in the database.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| INFORMATION_SCHEMA |
| METRICS_SCHEMA |
| PERFORMANCE_SCHEMA |
| mysql |
+--------------------+
4 rows in set (0.00 sec)
8. Restore full plus incremental.
# --restored-ts '2023-03-28 12:35:00+0800' would specify a restore point in time
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-upload" \
--full-backup-storage="local:///tmp/full_data"
# The output below shows that the log backup must be started first and the full backup taken afterwards; otherwise there is a gap in the log
[root@master full_data]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/full_data-upload" \
> --full-backup-storage="local:///tmp/full_data"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/full_data-upload --full-backup-storage=local:///tmp/full_data
Detail BR log in /tmp/br.log.2023-03-28T12.43.47+0800
Error: it has gap between full backup ts:440396021403222019(2023-03-28 12:24:58.639 +0800 CST) and log backup ts:440396065393344515(2023-03-28 12:27:46.448 +0800 CST). : [BR:Common:ErrInvalidArgument]invalid argument
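The precondition BR is enforcing: the log backup must already be running when the full backup is taken, so that the log covers every change from the full backup's BackupTS onward. A sketch of the check with the TSOs from the error above (illustrative, not BR's source):

```python
def covers_full_backup(full_backup_ts: int, log_start_ts: int) -> bool:
    """No gap iff the log backup started at or before the full backup's snapshot."""
    return log_start_ts <= full_backup_ts

# Here the full backup (12:24:58) predates the log task (12:27:46): gap, restore rejected.
print(covers_full_backup(440396021403222019, 440396065393344515))  # False

# Steps 10-12 below fix the ordering: log task at 12:49, full backup at 12:58.
print(covers_full_backup(440396550632374275, 440396399742550018))  # True
```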
9. Check the current data.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
mysql> use test;
ERROR 1049 (42000): Unknown database 'test'
10. Start the incremental log backup first.
tiup br log start --task-name=pitrone --pd="192.168.66.10:2379" \
--storage="local:///tmp/full_data-uploadd"
# Check the status
tiup br log status --task-name=pitrone
[root@master full_data]# tiup br log status --task-name=pitrone
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log status --task-name=pitrone
Detail BR log in /tmp/br.log.2023-03-28T12.49.34+0800
● Total 1 Tasks.
> #1 <
name: pitrone
status: ● NORMAL
start: 2023-03-28 12:49:01.889 +0800
end: 2090-11-18 22:07:45.624 +0800
storage: local:///tmp/full_data-uploadd
speed(est.): 0.00 ops/s
checkpoint[global]: 2023-03-28 12:49:01.889 +0800; gap=34s
11. Insert some new data.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
create database test;
use test;
create table temptable(name varchar(20));
select * from temptable;
insert into temptable values('1');
insert into temptable values('3');
insert into temptable values('4');
select * from temptable;
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 3 |
| 4 |
+------+
3 rows in set (0.00 sec)
12. Run a full backup.
mkdir /tmp/full_datad
chmod 777 /tmp/full_datad
tiup br backup full --pd "192.168.66.10:2379" \
--storage "local:///tmp/full_datad" \
--ratelimit 128 \
--log-file backupfull.log
# Output:
[root@master full_data-upload]# tiup br backup full --pd "192.168.66.10:2379" \
> --storage "local:///tmp/full_datad" \
> --ratelimit 128 \
> --log-file backupfull.log
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br backup full --pd 192.168.66.10:2379 --storage local:///tmp/full_datad --ratelimit 128 --log-file backupfull.log
Detail BR log in backupfull.log
Full Backup <-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 12:58:40.782 +08:00] [INFO] [collector.go:73] ["Full Backup success summary"] [total-ranges=21] [ranges-succeed=21] [ranges-failed=0] [backup-checksum=25.683732ms] [backup-fast-checksum=5.276318ms] [backup-total-ranges=80] [backup-total-regions=80] [total-take=4.335096191s] [total-kv-size=362.7kB] [average-speed=83.66kB/s] [backup-data-size(after-compressed)=90.19kB] [Size=90188] [BackupTS=440396550632374275] [total-kv=1271]
13. Insert some data after the full backup completes.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
use test;
select * from temptable;
insert into temptable values('100');
insert into temptable values('300');
insert into temptable values('400');
select * from temptable;
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 3 |
| 4 |
| 100 |
| 300 |
| 400 |
+------+
6 rows in set (0.01 sec)
The time now, after inserting this data, is 13:00.
14. At 13:02, delete the data.
mysql --comments --host 127.0.0.1 --port 4000 -u root -p
drop database test;
show databases;
15. Stop the incremental log backup.
tiup br log stop --task-name=pitrone
# Output:
[root@master full_data-upload]# tiup br log stop --task-name=pitrone
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br log stop --task-name=pitrone
Detail BR log in /tmp/br.log.2023-03-28T13.03.21+0800
[2023/03/28 13:03:21.765 +08:00] [INFO] [collector.go:73] ["log stop"] [streamTaskInfo="taskName=pitrone,startTs=440396399742550018,endTS=999999999999999999,tableFilter=*.*"]
[2023/03/28 13:03:24.016 +08:00] [INFO] [collector.go:73] ["log stop success summary"] [total-ranges=0] [ranges-succeed=0] [ranges-failed=0] [total-take=2.345402948s]
16. Sync the data so every node has the complete incremental and full data.
# Run on 192.168.66.10
scp -r /tmp/full_datad root@192.168.66.20:/tmp
scp -r /tmp/full_data-uploadd root@192.168.66.20:/tmp
# Run on 192.168.66.21
scp -r /tmp/full_datad root@192.168.66.20:/tmp
scp -r /tmp/full_data-uploadd root@192.168.66.20:/tmp
# Run on 192.168.66.20
scp -r /tmp/full_datad root@192.168.66.10:/tmp
scp -r /tmp/full_datad root@192.168.66.21:/tmp
scp -r /tmp/full_data-uploadd root@192.168.66.10:/tmp
scp -r /tmp/full_data-uploadd root@192.168.66.21:/tmp
17. Restore to a point in time. The delete above was executed after 13:01:00, so restore the data as of 13:01:00.
# --restored-ts '2023-03-28 13:01:00+0800' specifies the restore point in time
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-uploadd" \
--full-backup-storage="local:///tmp/full_datad" \
--restored-ts '2023-03-28 13:01:00+0800'
# The output below shows the full backup and the incremental log backup partially overlap
[root@master full_data]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/full_data-uploadd" \
> --full-backup-storage="local:///tmp/full_datad" \
> --restored-ts '2023-03-28 13:01:00+0800'
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/full_data-uploadd --full-backup-storage=local:///tmp/full_datad --restored-ts 2023-03-28 13:01:00+0800
Detail BR log in /tmp/br.log.2023-03-28T13.10.15+0800
Error: restore log from 440396550632374275(2023-03-28 12:58:37.488 +0800 CST) to 440396587991040000(2023-03-28 13:01:00 +0800 CST), but the current existed log from 440396399742550018(2023-03-28 12:49:01.889 +0800 CST) to 440396399742550018(2023-03-28 12:49:01.889 +0800 CST): [BR:Common:ErrInvalidArgument]invalid argument
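Step 18 resolves this by truncating the log data older than the full backup's end-version with `br log truncate --until=...`. Conceptually, truncation keeps only the log range at or after the cutoff TSO; a rough sketch under that assumption (our own model, not BR's storage format):

```python
def truncate_log(segments, until_ts):
    """Drop log segments that end before until_ts; keep the rest (illustrative)."""
    return [(start, end) for (start, end) in segments if end >= until_ts]

full_backup_ts = 440396550632374275  # end-version of the full backup (12:58:37)
segments = [
    (440396399742550018, 440396550632374274),  # log written before the full backup
    (440396550632374275, 440396587991040000),  # log covering the restore window
]
# After truncation only the post-snapshot log remains, so the range check can pass.
print(truncate_log(segments, full_backup_ts))
```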
18.清除全量快照之前的日志增量数据
FULL_BACKUP_TS=`tiup br validate decode --field="end-version" --storage "local:///tmp/full_datad"| tail -n1`
tiup br log truncate --until=$FULL_BACKUP_TS --storage="local:///tmp/full_data-uploadd"
# Run the restore
# Restoring with --restored-ts '2023-03-28 13:01:00+0800' still failed, so run without it
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-uploadd" \
--full-backup-storage="local:///tmp/full_datad"
[root@master full_data-upload]# tiup br restore point --pd "192.168.66.10:2379" \
> --storage="local:///tmp/full_data-uploadd" \
> --full-backup-storage="local:///tmp/full_datad"
tiup is checking updates for component br ...
Starting component `br`: /root/.tiup/components/br/v6.6.0/br restore point --pd 192.168.66.10:2379 --storage=local:///tmp/full_data-uploadd --full-backup-storage=local:///tmp/full_datad
Detail BR log in /tmp/br.log.2023-03-28T13.32.18+0800
Full Restore <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/03/28 13:32:22.795 +08:00] [INFO] [collector.go:73] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=12] [ranges-failed=0] [split-region=223.347µs] [restore-ranges=3] [total-take=4.185697983s] [restore-data-size(after-compressed)=5.109kB] [Size=5109] [BackupTS=440396550632374275] [RestoreTS=440397080766185509] [total-kv=5] [total-kv-size=347B] [average-speed=82.9B/s]
Restore Meta Files <.........................................................................................................................................................................................> ?%
Restore KV Files <...........................................................................................................................................................................................> ?%
[2023/03/28 13:32:23.881 +08:00] [INFO] [collector.go:73] ["restore log success summary"] [total-take=1.0855745s] [restore-from=440396550632374275] [restore-to=440396550632374275] [restore-from="2023-03-28 12:58:37.488 +0800"] [restore-to="2023-03-28 12:58:37.488 +0800"] [total-kv-count=0] [total-size=0B] [average-speed=0B/s]
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| temptable |
+----------------+
1 row in set (0.00 sec)
mysql> select * from temptable;
+------+
| name |
+------+
| 1 |
| 3 |
| 4 |
+------+
3 rows in set (0.00 sec)
# Restoring to a specified point in time still fails
tiup br restore point --pd "192.168.66.10:2379" \
--storage="local:///tmp/full_data-uploadd" \
--full-backup-storage="local:///tmp/full_datad" \
--restored-ts '2023-03-28 13:00:00+0800'
Detail BR log in /tmp/br.log.2023-03-28T13.35.20+0800
Error: restore log from 440396550632374275(2023-03-28 12:58:37.488 +0800 CST) to 440396572262400000(2023-03-28 13:00:00 +0800 CST), but the current existed log from 440396550632374275(2023-03-28 12:58:37.488 +0800 CST) to 440396550632374275(2023-03-28 12:58:37.488 +0800 CST): [BR:Common:ErrInvalidArgument]invalid argument
Conclusion:
Point-in-time recovery is problematic in this setup: restoring to a specified timestamp keeps hitting range conflicts between the full backup and the incremental log backup. To use this feature when combining full and log backups, start the incremental log backup first and only then take the full snapshot; otherwise restoring to a specified point in time fails.
Using the TiDB Binlog Tool Drainer
1. Introduction to TiDB Binlog
TiDB Binlog is a commercial tool that collects the binlog of TiDB and provides near-real-time backup and replication.
TiDB Binlog supports the following scenarios:
• Data replication: replicate data from a TiDB cluster to other databases
• Real-time backup and restore: back up TiDB cluster data, and restore it when the cluster fails
Note
TiDB Binlog is incompatible with some features introduced since TiDB v5.0 and cannot be used together with them; see the notes below for details. TiCDC is recommended as a replacement for TiDB Binlog.
2. TiDB Binlog Architecture
2.1 Component functions
A TiDB Binlog cluster consists of two main components, Pump and Drainer, plus the binlogctl tool:
Pump
Pump records the binlog produced by TiDB in real time, sorts it by transaction commit time, and serves it to Drainer for consumption.
Drainer
Drainer collects and merges the binlog from all Pumps, converts it into SQL or data in a specified format, and replicates it to the downstream.
binlogctl tool
binlogctl is the companion operations tool for TiDB Binlog, with the following capabilities:
• Get the current TSO of the TiDB cluster
• Check Pump/Drainer status
• Modify Pump/Drainer status
• Pause or take Pump/Drainer offline
2.2 Key features
• Multiple Pumps form a cluster and can scale out horizontally.
• TiDB uses the built-in Pump Client to distribute binlog across the Pumps.
• Pump stores the binlog and serves it to Drainer in order.
• Drainer reads the binlog from all Pumps, merge-sorts it, and sends it downstream.
• Drainer supports relay logs, which it uses to guarantee a consistent state in the downstream cluster.
2.3 Notes
• TiDB Binlog and TiDB v5.1 resolved the incompatibility between TiDB Binlog and the clustered index feature introduced in v5.0. After upgrading both TiDB Binlog and TiDB Server to v5.1, TiDB can create clustered index tables with TiDB Binlog enabled, and inserts, deletes, and updates on clustered index tables are replicated downstream via TiDB Binlog. When replicating clustered index tables, note:
○ If you upgrade manually from v5.0 and control the component order yourself, upgrade TiDB Binlog to v5.1 before upgrading TiDB Server to v5.1.
○ Set the TiDB system variable tidb_enable_clustered_index to the same value upstream and downstream so that clustered index table schemas stay consistent.
• TiDB Binlog is incompatible with, and cannot be used together with, the following features introduced in TiDB v5.0:
○ Clustered indexes: with TiDB Binlog enabled, TiDB does not allow creating clustered indexes whose primary key is not a single integer column; inserts, deletes, and updates on existing clustered index tables are not replicated downstream. To replicate clustered index tables, upgrade to v5.1 or use TiCDC.
○ The TiDB system variable tidb_enable_async_commit: with TiDB Binlog enabled, turning this on brings no performance gain. For better performance, use TiCDC instead of TiDB Binlog.
○ The TiDB system variable tidb_enable_1pc: with TiDB Binlog enabled, turning this on brings no performance gain. For better performance, use TiCDC instead of TiDB Binlog.
• TiDB Binlog is incompatible with, and cannot be used together with, the following feature introduced in TiDB v4.0.7:
○ The TiDB system variable tidb_enable_amend_pessimistic_txn: the two features conflict, and using them together causes data-correctness issues in TiDB Binlog replication.
• Drainer can replicate binlog to MySQL, TiDB, Kafka, or local files. To replicate binlog to a system Drainer does not support, configure Drainer to write to Kafka and process the data yourself according to the binlog consumer protocol; see the Binlog Consumer Client user documentation.
• To use TiDB Binlog for incremental restore, set db-type="file": Drainer converts the binlog into protocol buffer-formatted data and writes it to local files, which Reparo can then use to restore incremental data.
Regarding the value of db-type, note:
○ If the TiDB version is < 2.1.9, use db-type="pb".
○ If the TiDB version is >= 2.1.9, use db-type="file" or db-type="pb".
• If the downstream is MySQL/TiDB, you can verify the replicated data with sync-diff-inspector.
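A Drainer configuration for the db-type="file" incremental-backup setup described above might look like the fragment below. This is a sketch: the output directory is an assumed path, not taken from the source.

```toml
[syncer]
# write binlog as proto-buffer files instead of replicating to a database
db-type = "file"

[syncer.to]
# local directory that Reparo later reads for incremental restore (assumed path)
dir = "/data/drainer-binlog"
```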
3. Replicating TiDB Binlog data to MySQL
3.1 Environment
No. IP Role Notes
1 172.16.5.122:2379 PD (leader)
2 172.16.5.122:4000 tidb
3 172.16.5.146:20160 tikv
4 172.16.6.151:20160 tikv
5 172.16.6.236:20160 tikv
6 172.16.5.146:8249 drainer
7 172.16.5.122:8250 pump
8 172.16.6.130:3306 MySQL server
3.2 Configure the downstream MySQL
The downstream MySQL instance is at 172.16.6.130:3306
mysql> create user 'root'@'%' identified by 'mysql';
Query OK, 0 rows affected (0.01 sec)
mysql> grant all privileges on *.* to 'root'@'%';
# Note: the related drainer configuration file
[root@k8s-master ~]# cat drainer_scale-out.toml
drainer_servers:
- host: 172.16.5.146
commit_ts: -1
config:
syncer.db-type: "mysql"
syncer.to.host: "172.16.6.130"
syncer.to.user: "root"
syncer.to.password: "mysql"
syncer.to.port: 3306
syncer.replicate-do-db: ["testlqb"]
3.3 Configure the upstream TiDB database
1. Log in to the TiDB database and create table T in the testlqb database, to prepare for verifying replication
MySQL [(none)]> create database testlqb;
Query OK, 0 rows affected (0.12 sec)
MySQL [(none)]>
MySQL [(none)]> use testlqb;
Database changed
MySQL [testlqb]> create table T(id int primary key, name varchar(30))
-> ;
Query OK, 0 rows affected (0.13 sec)
MySQL [testlqb]> insert into T values(1,'tom');
Query OK, 1 row affected (0.01 sec)
MySQL [testlqb]> insert into T values(2,'Jack');
Query OK, 1 row affected (0.01 sec)
MySQL [testlqb]> insert into T values(3,'Frank');
Query OK, 1 row affected (0.01 sec)
MySQL [testlqb]> select * from T;
+----+-------+
| id | name |
+----+-------+
| 1 | tom |
| 2 | Jack |
| 3 | Frank |
+----+-------+
3 rows in set (0.01 sec)
2. Log in to the downstream MySQL and create table T in the lqb database
mysql> create database lqb;
Query OK, 1 row affected (0.00 sec)
mysql> create table T(id int primary key, name varchar(30))
-> ;
Query OK, 0 rows affected (0.01 sec)
mysql> insert into T values(1,'tom');
Query OK, 1 row affected (0.00 sec)
mysql> insert into T values(3,'Frank');
Query OK, 1 row affected (0.00 sec)
mysql> insert into T values(2,'Jack');
Query OK, 1 row affected (0.00 sec)
mysql> select * from T;
+----+-------+
| id | name |
+----+-------+
| 1 | tom |
| 2 | Jack |
| 3 | Frank |
+----+-------+
3 rows in set (0.00 sec)
3.4 Install Drainer and Pump
Drainer can be deployed in two ways: with tiup, or as a binary deployment of TiDB Binlog. tiup is recommended. Scale out a pump node and a drainer node to prepare for the replication below.
1. Check the tiup cluster status (TiDB installation omitted)
[root@k8s-master ~]# tiup cluster display tidb01
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.9.5/tiup-cluster display tidb01
Cluster type: tidb
Cluster name: tidb01
Cluster version: v4.0.12
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://172.16.5.146:2379/dashboard
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
172.16.5.122:9093 alertmanager 172.16.5.122 9093/9094 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/alertmanager-9093 /home/tidb/tidb/v4.0.12/tidb-deploy/alertmanager-9093
172.16.5.122:3000 grafana 172.16.5.122 3000 linux/x86_64 Up - /home/tidb/tidb/v4.0.12/tidb-deploy/grafana-3000
172.16.5.122:2379 pd 172.16.5.122 2379/2380 linux/x86_64 Up|L /home/tidb/tidb/v4.0.12/tidb-data/pd-2379 /home/tidb/tidb/v4.0.12/tidb-deploy/pd-2379
172.16.5.146:2379 pd 172.16.5.146 2379/2380 linux/x86_64 Up|UI /home/tidb/tidb/v4.0.12/tidb-data/pd-2379 /home/tidb/tidb/v4.0.12/tidb-deploy/pd-2379
172.16.6.236:2379 pd 172.16.6.236 2379/2380 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/pd-2379 /home/tidb/tidb/v4.0.12/tidb-deploy/pd-2379
172.16.5.122:9090 prometheus 172.16.5.122 9090 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/prometheus-9090 /home/tidb/tidb/v4.0.12/tidb-deploy/prometheus-9090
172.16.5.122:4000 tidb 172.16.5.122 4000/10080 linux/x86_64 Up - /home/tidb/tidb/v4.0.12/tidb-deploy/tidb-4000
172.16.5.146:20160 tikv 172.16.5.146 20160/20180 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/tikv-20160 /home/tidb/tidb/v4.0.12/tidb-deploy/tikv-20160
172.16.6.151:20160 tikv 172.16.6.151 20160/20189 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/tikv-20160 /home/tidb/tidb/v4.0.12/tidb-deploy/tikv-20160
172.16.6.236:20160 tikv 172.16.6.236 20160/20180 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/tikv-20160 /home/tidb/tidb/v4.0.12/tidb-deploy/tikv-20160
2. Create the drainer scale-out file
[root@k8s-master ~]# cat drainer_scale-out.toml
drainer_servers:
- host: 172.16.5.146
commit_ts: -1
config:
syncer.db-type: "mysql"
syncer.to.host: "172.16.6.130"
syncer.to.user: "root"
syncer.to.password: "mysql"
syncer.to.port: 3306
syncer.replicate-do-db: ["testlqb"]
3. Scale out drainer
1. Check the configuration file
tiup cluster check tidb01 drainer_scale-out.toml --cluster --apply --user root -i /root/.ssh/id_rsa
2. Scale out drainer
tiup cluster scale-out tidb01 drainer_scale-out.toml -i /root/.ssh/id_rsa
3. Verify the result
[root@k8s-master ~]# tiup cluster display tidb01
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.9.5/tiup-cluster display tidb01
Cluster type: tidb
Cluster name: tidb01
Cluster version: v4.0.12
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://172.16.5.146:2379/dashboard
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
172.16.5.122:9093 alertmanager 172.16.5.122 9093/9094 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/alertmanager-9093 /home/tidb/tidb/v4.0.12/tidb-deploy/alertmanager-9093
172.16.5.146:8249 drainer 172.16.5.146 8249 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/drainer-8249 /home/tidb/tidb/v4.0.12/tidb-deploy/drainer-8249
4. Manage drainer through tiup
1. Stop drainer
[root@k8s-master ~]# tiup cluster stop tidb01 -N 172.16.5.146:8249
2. Start drainer
[root@k8s-master ~]# tiup cluster start tidb01 -N 172.16.5.146:8249
3. Query the status from the database
MySQL [testlqb]> show drainer status;
+-------------------+-------------------+--------+--------------------+---------------------+
| NodeID | Address | State | Max_Commit_Ts | Update_Time |
+-------------------+-------------------+--------+--------------------+---------------------+
| 172.16.5.146:8249 | 172.16.5.146:8249 | paused | 433333952724074497 | 2022-05-20 17:10:47 |
+-------------------+-------------------+--------+--------------------+---------------------+
1 row in set (0.02 sec)
MySQL [testlqb]> show drainer status;
+-------------------+-------------------+--------+--------------------+---------------------+
| NodeID | Address | State | Max_Commit_Ts | Update_Time |
+-------------------+-------------------+--------+--------------------+---------------------+
| 172.16.5.146:8249 | 172.16.5.146:8249 | online | 433333952724074497 | 2022-05-20 17:11:18 |
+-------------------+-------------------+--------+--------------------+---------------------+
Notes
- After drainer restarts, it resumes replication from the last checkpoint; the replication TSO is stored in the tidb_binlog_checkpoint table in the target database.
- For the first 5 minutes after drainer starts, safe-mode is enabled automatically: insert statements are rewritten as replace, and update statements as delete + replace.
5. Install pump; the steps are the same as for drainer
1. Create the pump scale-out file
[root@k8s-master ~]# cat pump_scale-out.toml
pump_servers:
- host: 172.16.5.122
ssh_port: 22
port: 8250
deploy_dir: /home/tidb/tidb/v4.0.12/tidb-deploy/pump-8250
data_dir: /home/tidb/tidb/v4.0.12/tidb-data/pump-8250
log_dir: /home/tidb/tidb/v4.0.12/tidb-deploy/pump-8250/log
arch: amd64
os: linux
2. Check the configuration
[root@k8s-master ~]# tiup cluster check tidb01 pump_scale-out.toml --cluster --apply --user root -i /root/.ssh/id_rsa
3. Scale out
[root@k8s-master ~]# tiup cluster scale-out tidb01 pump_scale-out.toml -i /root/.ssh/id_rsa
4. Check the cluster
[root@k8s-master ~]# tiup cluster display tidb01
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
172.16.5.122:8250 pump 172.16.5.122 8250 linux/x86_64 Up /home/tidb/tidb/v4.0.12/tidb-data/pump-8250 /home/tidb/tidb/v4.0.12/tidb-deploy/pump-8250
3.5 Configure and verify
1. After the pump and drainer nodes are started, enable binlog in the TiDB database.
Enable binlog with the tiup cluster edit-config command by setting the configuration items binlog.enable: true and binlog.ignore-error: true.
[root@k8s-master ~]# tiup cluster edit-config tidb01
server_configs:
tidb:
binlog.enable: true
binlog.ignore-error: true
2. Reload the cluster to apply the changed configuration
tiup cluster reload tidb01
3. Log in to TiDB to verify, and check the pump and drainer status
MySQL [mysql]> show variables like "log_bin";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin | 1 |
+---------------+-------+
1 row in set (0.01 sec)
MySQL [mysql]> show pump status;
+-------------------+-------------------+--------+--------------------+---------------------+
| NodeID | Address | State | Max_Commit_Ts | Update_Time |
+-------------------+-------------------+--------+--------------------+---------------------+
| 172.16.5.122:8250 | 172.16.5.122:8250 | online | 433329819276804097 | 2022-05-20 12:47:57 |
+-------------------+-------------------+--------+--------------------+---------------------+
1 row in set (0.01 sec)
MySQL [mysql]> show drainer status;
+-------------------+-------------------+--------+--------------------+---------------------+
| NodeID | Address | State | Max_Commit_Ts | Update_Time |
+-------------------+-------------------+--------+--------------------+---------------------+
| 172.16.5.146:8249 | 172.16.5.146:8249 | online | 433329821636100097 | 2022-05-20 12:48:06 |
+-------------------+-------------------+--------+--------------------+---------------------+
1 row in set (0.00 sec)
The queries above show that both pump and drainer are in the online state
3.6 Replicate the TiDB database to MySQL
1. Log in to TiDB and query table T in the testlqb database
MySQL [(none)]> use testlqb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MySQL [testlqb]> select * from T;
+----+-------+
| id | name |
+----+-------+
| 1 | tom |
| 2 | Jack |
| 3 | Frank |
+----+-------+
2. Log in to the MySQL database and query the table
MySQL [testlqb]> select * from T;
+----+-------+
| id | name |
+----+-------+
| 1 | tom |
| 2 | Jack |
| 3 | Frank |
+----+-------+
3 rows in set (0.00 sec)
3. Log in to the TiDB database and insert data
MySQL [testlqb]> insert into T select 4,'tony';
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
MySQL [testlqb]> select * from T;
+----+-------+
| id | name |
+----+-------+
| 1 | tom |
| 2 | Jack |
| 3 | Frank |
| 4 | tony |
+----+-------+
4 rows in set (0.00 sec)
MySQL [testlqb]> insert into T select 5,'lqb';
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
4. Log in to MySQL and verify that the data has replicated
MySQL [testlqb]> select * from T;
+----+-------+
| id | name |
+----+-------+
| 1 | tom |
| 2 | Jack |
| 3 | Frank |
| 4 | tony |
| 5 | lqb |
+----+-------+
5 rows in set (0.00 sec)
Note: after copying a table t1 from another database on the source side, it can also be queried normally on the downstream MySQL; its data replicated as well
MySQL [testlqb]> show tables;
+-------------------+
| Tables_in_testlqb |
+-------------------+
| T |
| t1 |
+-------------------+
2 rows in set (0.00 sec)
Note: stopping and starting drainer through tiup
tiup cluster stop tidb01 -N 172.16.5.146:8249
tiup cluster start tidb01 -N 172.16.5.146:8249
4. Managing with the binlogctl tool
1. Introduction
Binlog Control (binlogctl for short) is the command-line tool for TiDB Binlog, used to manage a TiDB Binlog cluster.
binlogctl supports:
• Checking Pump/Drainer status
• Pausing or taking Pump/Drainer offline
• Handling abnormal Pump/Drainer states
When to use binlogctl:
• Replication fails, or you need to check the running state of Pump/Drainer
• You need to pause or take Pump/Drainer offline for cluster maintenance
• Pump/Drainer exits abnormally and its state is not updated, or its state is unexpected and affects the workload
2. Download
wget https://download.pingcap.org/tidb-version-linux-amd64.tar.gz &&
wget https://download.pingcap.org/tidb-version-linux-amd64.sha256
3. Usage
1. Check pump and drainer status
[root@k8s-master bin]# ./binlogctl -pd-urls=http://172.16.5.122:2379 -cmd drainers
[2022/05/20 17:18:31.286 +08:00] [INFO] [nodes.go:53] ["query node"] [type=drainer] [node="NodeID: 172.16.5.146:8249, Addr: 172.16.5.146:8249, State: online, MaxCommitTS: 433334074686570497, UpdateTime: 2022-05-20 17:18:29 +0800 CST"]
[root@k8s-master bin]# ./binlogctl -pd-urls=http://172.16.5.122:2379 -cmd pumps
[2022/05/20 17:18:37.281 +08:00] [INFO] [nodes.go:53] ["query node"] [type=pump] [node="NodeID: 172.16.5.122:8250, Addr: 172.16.5.122:8250, State: online, MaxCommitTS: 433334076259434497, UpdateTime: 2022-05-20 17:18:36 +0800 CST"]
2. Pause pump and drainer
[root@k8s-master bin]# ./binlogctl -pd-urls=http://172.16.5.122:2379 -cmd pause-drainer -node-id 172.16.5.146:8249
[2022/05/20 17:21:06.069 +08:00] [INFO] [nodes.go:123] ["Apply action on node success"] [action=pause] [NodeID=172.16.5.146:8249]
[root@k8s-master bin]# ./binlogctl -pd-urls=http://172.16.5.122:2379 -cmd drainers
[2022/05/20 17:21:21.295 +08:00] [INFO] [nodes.go:53] ["query node"] [type=drainer] [node="NodeID: 172.16.5.146:8249, Addr: 172.16.5.146:8249, State: paused, MaxCommitTS: 433334115607248897, UpdateTime: 2022-05-20 17:21:05 +0800 CST"]
While drainer is paused, the downstream MySQL stops receiving data.
After restarting drainer with tiup cluster start tidb01 -N 172.16.5.146:8249, the downstream database synchronized again.
cmd Description Example
pause-pump Pause Pump bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pause-pump -node-id ip-127-0-0-1:8250
pause-drainer Pause Drainer bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pause-drainer -node-id ip-127-0-0-1:8249
offline-pump Take Pump offline bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-pump -node-id ip-127-0-0-1:8250
offline-drainer Take Drainer offline bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-drainer -node-id ip-127-0-0-1:8249