Common orchestrator commands

Posted 2022-05-06 _雪辉_
orchestrator-client is the client command (in /usr/local/orchestrator/resources/bin).
orchestrator is the server command (in /usr/local/orchestrator).
orchestrator
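1.1. discover
Discovers an instance together with its master and replicas, and writes what it finds to database_instance and related tables in the backend database.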
/usr/local/orchestrator/orchestrator --config=/usr/local/orchestrator/orchestrator.conf.json -c discover -i ip:port --ignore-raft-setup

1.2. forget
Removes an instance, i.e. deletes its records from the database_instance table.
/usr/local/orchestrator/orchestrator --config=/usr/local/orchestrator/orchestrator.conf.json -c forget -i ip:port --ignore-raft-setup

1.3. begin-maintenance
Puts an instance into maintenance mode by inserting a record into the database_instance_maintenance table.
orchestrator -c begin-maintenance -i instance.to.lock.com --duration=3h --reason="load testing; do not disturb"

1.4. end-maintenance
Takes an instance out of maintenance mode by updating its record in the database_instance_maintenance table.
orchestrator -c end-maintenance -i locked.instance.com

1.5. in-maintenance
Checks whether an instance is in maintenance mode by querying the database_instance_maintenance table.
orchestrator -c in-maintenance -i locked.instance.com

1.6. begin-downtime
Marks an instance as downtimed by inserting a record into the database_instance_downtime table.
orchestrator -c begin-downtime -i instance.to.downtime.com --duration=3h --reason="dba handling; do not do recovery"

1.7. end-downtime
Ends the downtime of an instance by deleting its record from the database_instance_downtime table.
orchestrator -c end-downtime -i downtimed.instance.com
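
As a minimal sketch of how begin-downtime and end-downtime can bracket a maintenance task, the wrapper below downtimes an instance, runs a command, and ends the downtime again whether or not the command succeeded. The function name with_downtime, the ORCHESTRATOR variable and the task script in the usage line are made up for this example; only the two orchestrator commands and their flags come from the text above.

```shell
# Sketch only: with_downtime and ORCHESTRATOR are hypothetical names.
ORCHESTRATOR="${ORCHESTRATOR:-orchestrator}"

# with_downtime INSTANCE DURATION REASON COMMAND [ARGS...]
with_downtime() {
    instance="$1"; duration="$2"; reason="$3"
    shift 3
    # Do not run the task at all if the downtime could not be registered.
    "$ORCHESTRATOR" -c begin-downtime -i "$instance" \
        --duration="$duration" --reason="$reason" || return 1
    status=0
    "$@" || status=$?          # the maintenance task itself
    # Always end the downtime, even when the task failed.
    "$ORCHESTRATOR" -c end-downtime -i "$instance"
    return "$status"
}

# Usage (hypothetical task script):
# with_downtime instance.to.downtime.com 30m "schema change" ./run_schema_change.sh
```

The wrapper returns the exit status of the task, so a caller can still tell whether the maintenance itself succeeded.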

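2. MySQL instance query commands
2.1. find
Searches instance names by regular expression.
orchestrator -c find -pattern "backup.*us-east"

2.2. search
Searches instance names by keyword match.
orchestrator -c search -pattern "search string"

2.3. clusters
Lists the names of all MySQL clusters, obtained by SQL queries against database_instance and related tables.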
orchestrator -c clusters

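2.4. clusters-alias
Lists all MySQL cluster names together with their aliases.
orchestrator -c clusters-alias

2.5. all-clusters-masters
Lists the writable master of every MySQL cluster.
orchestrator -c all-clusters-masters

2.6. topology
Prints the topology of the cluster that the instance belongs to.
orchestrator -c topology -i instance.belonging.to.a.topology.com

2.7. topology-tabulated
Prints the topology of the cluster that the instance belongs to; like topology, but in a slightly different output format.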
orchestrator -c topology-tabulated -i instance.belonging.to.a.topology.com

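2.8. all-instances
Lists all known instances.
orchestrator -c all-instances

2.9. which-instance
Prints the full information of an instance.
orchestrator -c which-instance -i instance.to.check.com

2.10. which-cluster
Prints the name of the cluster that a MySQL instance belongs to.
orchestrator -c which-cluster -i instance.to.check.com

2.11. which-cluster-domain
Prints the domain name of the cluster that a MySQL instance belongs to.
orchestrator -c which-cluster-domain -i instance.to.check.com

2.12. which-heuristic-domain-instance
Given a cluster domain name, prints the writable instance associated with it.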
orchestrator -c which-heuristic-domain-instance -alias some_alias

2.13. which-cluster-master
Prints the master of the cluster that the instance belongs to.
orchestrator -c which-cluster-master -i instance.to.check.com

2.14. which-cluster-instances
Lists all instances of the cluster that the instance belongs to.
orchestrator -c which-cluster-instances -i instance.to.check.com

2.15. which-master
Prints the master of the given replica; similar to which-cluster-master.
orchestrator -c which-master -i a.known.replica.com

2.16. which-downtimed-instances
Lists the instances that are currently downtimed.
orchestrator -c which-downtimed-instances

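2.17. which-replicas
Lists the replicas of an instance.
orchestrator -c which-replicas -i a.known.instance.com

2.18. which-lost-in-recovery
Lists downtimed instances that were lost during a failure recovery.
orchestrator -c which-lost-in-recovery

2.19. instance-status
Prints the status of an instance.
orchestrator -c instance-status -i instance.to.investigate.com

2.20. get-cluster-heuristic-lag
Prints the maximum replication lag of the cluster that the instance belongs to.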
orchestrator -c get-cluster-heuristic-lag -i instance.that.is.part.of.cluster.com
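
A query command like get-cluster-heuristic-lag is easy to use as a gate in a script. The sketch below compares the reported lag with a threshold. It assumes that the command prints a single non-negative integer (seconds); that output format, the function name cluster_lag_ok and the ORCHESTRATOR variable are assumptions of this example, so check the real output before relying on it.

```shell
# Sketch only: cluster_lag_ok and ORCHESTRATOR are hypothetical names.
ORCHESTRATOR="${ORCHESTRATOR:-orchestrator}"

# cluster_lag_ok INSTANCE MAX_LAG
# Returns 0 if the lag is at or below MAX_LAG, 1 if it is above,
# and 2 if the command failed or printed something that is not a number.
cluster_lag_ok() {
    instance="$1"; max_lag="$2"
    lag="$("$ORCHESTRATOR" -c get-cluster-heuristic-lag -i "$instance")" || return 2
    case "$lag" in
        ''|*[!0-9]*)
            echo "unexpected output: $lag" >&2
            return 2
            ;;
    esac
    [ "$lag" -le "$max_lag" ]
}

# Usage:
# cluster_lag_ok instance.that.is.part.of.cluster.com 10 || echo "cluster is lagging"
```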

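3. Failure recovery commands
3.1. recover
Fails over a master. It only takes effect when the master is actually down: -i must be the dead master, and the new master is not specified because orchestrator picks it.
orchestrator -c recover -i dead.instance.com --debug

3.2. recover-lite
Fails over a master like recover, but skips some of the steps and is therefore more lightweight.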
orchestrator -c recover-lite -i dead.instance.com --debug

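3.3. force-master-failover
Forces a failover whether or not the master is healthy. The old master is not shut down afterwards, and the new master is not specified because orchestrator picks it. This is a dangerous operation; use it with care.
orchestrator -c force-master-failover

3.4. force-master-takeover
Forces a master switch whether or not the master is healthy. -i names any instance in the cluster and -d names the new master. Note that the old master is not repointed at the new master afterwards; that has to be done manually.
orchestrator -c force-master-takeover -i instance.in.relevant.cluster.com -d immediate.child.of.master.com

3.5. graceful-master-takeover
Switches the master gracefully. The old master is repointed at the new master, but its replication threads are left stopped, so start slave has to be run manually to resume replication.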
orchestrator -c graceful-master-takeover -i instance.in.relevant.cluster.com -d immediate.child.of.master.com

3.6. replication-analysis
Analyzes the known topologies for potential failure events. The output format is not stable and may change in the future, so it is best not to depend on it.
orchestrator -c replication-analysis

3.7. ack-all-recoveries/ack-cluster-recoveries/ack-instance-recoveries
Acknowledges existing recoveries so that they do not block a failover when another failure occurs later.
orchestrator -c ack-all-recoveries --reason="dba has taken necessary steps"
orchestrator -c ack-cluster-recoveries -i instance.in.a.cluster.com --reason="reason message"
orchestrator -c ack-instance-recoveries -i instance.that.failed.com --reason="reason message"

3.8. relocate
Rearranges the topology: the instance given by -i becomes a replica of the instance given by -d.
orchestrator -c relocate -i replica.to.relocate.com -d instance.that.becomes.its.master

orchestrator-client
1. List all clusters: clusters-alias
/usr/local/orchestrator/resources/bin/orchestrator-client -c clusters-alias

# orchestrator-client -c clusters-alias
test2:3307,test
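
The clusters-alias output above has one cluster per line in the form name,alias, so it can be parsed with plain shell. The following sketch looks up the cluster name for an alias. It assumes exactly that comma-separated format; the function name cluster_name_for_alias and the ORC_CLIENT variable are made up for this example.

```shell
# Sketch only: cluster_name_for_alias and ORC_CLIENT are hypothetical names.
ORC_CLIENT="${ORC_CLIENT:-orchestrator-client}"

# cluster_name_for_alias ALIAS
# Prints the cluster name (for example test2:3307) whose alias equals ALIAS.
cluster_name_for_alias() {
    wanted="$1"
    "$ORC_CLIENT" -c clusters-alias | while IFS=, read -r name alias; do
        if [ "$alias" = "$wanted" ]; then
            echo "$name"
        fi
    done
}

# Usage:
# cluster_name_for_alias test
```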
2. Discover a given instance: discover/async-discover

Synchronous discovery:

# orchestrator-client -c discover -i test1:3307
test1:3307
Asynchronous discovery (suited to batches):

# orchestrator-client -c async-discover -i test1:3307
:null
3. Forget a given object: forget/forget-cluster

Forget a given instance:

# orchestrator-client -c forget -i test1:3307
Forget a given cluster:

# orchestrator-client -c forget-cluster -i test
4. Print the topology of a given cluster: topology/topology-tabulated

Plain output:

# orchestrator-client -c topology -i test1:3307

Tabulated output:

# orchestrator-client -c topology-tabulated -i test1:3307

5. Show which API endpoint is in use (the client finds the leader by itself): which-api

# orchestrator-client -c which-api
test3:3000/api
The leader can also be checked at http://192.168.163.133:3000/api/leader-check.
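
The leader-check endpoint can also be probed from a script. The sketch below walks a list of orchestrator nodes and prints the first one that answers /api/leader-check with HTTP 200. It assumes that only the leader returns 200 and that the nodes are reachable over plain HTTP; both are assumptions of this example, as are the function name find_leader and the host names in the usage line.

```shell
# Sketch only: find_leader is a hypothetical helper.
# find_leader HOST:PORT [HOST:PORT...]
# Prints the first node whose leader-check returns HTTP 200; returns 1 if none does.
find_leader() {
    for node in "$@"; do
        # -w '%{http_code}' prints only the status code; the body is discarded.
        code="$(curl -s -o /dev/null -w '%{http_code}' "http://$node/api/leader-check")" || continue
        if [ "$code" = "200" ]; then
            echo "$node"
            return 0
        fi
    done
    return 1
}

# Usage:
# find_leader test1:3000 test2:3000 test3:3000
```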

6. Call the API directly; must be combined with the -path argument: api ... -path

# orchestrator-client -c api -path clusters
# orchestrator-client -c api -path leader-check

# orchestrator-client -c api -path status
7. Search for instances: search

# orchestrator-client -c search -i test

8. Print the master of a given instance: which-master

# orchestrator-client -c which-master -i test1:3307
# orchestrator-client -c which-master -i test3:3307
# orchestrator-client -c which-master -i test2:3307 # this instance is itself the master

9. Print the replicas of a given instance: which-replicas

# orchestrator-client -c which-replicas -i test2:3307

10. Print the instance name of a given instance: which-instance

# orchestrator-client -c instance -i test1:3307
11. Print the replicas of a given master whose replication is broken: which-broken-replicas (replication on test3 was broken on purpose for this test):

# orchestrator-client -c which-broken-replicas -i test2:3307
12. Given an instance or a cluster alias, print the instances in that cluster: which-cluster-instances

# orchestrator-client -c which-cluster-instances -i test

# orchestrator-client -c which-cluster-instances -i test1:3307

13. Given an instance, print the name of its cluster (hostname:port by default): which-cluster

# orchestrator-client -c which-cluster -i test1:3307

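14. Print the writable instance (master) of the cluster for a given instance or cluster alias, or of every cluster: which-cluster-master/all-clusters-masters

For a given instance or alias: which-cluster-master

# orchestrator-client -c which-cluster-master -i test2:3307
# orchestrator-client -c which-cluster-master -i test
For all clusters: all-clusters-masters, which returns one master per cluster

# orchestrator-client -c all-clusters-masters
15. Print all instances: all-instances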

# orchestrator-client -c all-instances

16. Print the replicas in a cluster that can be used for a pt-online-schema-change operation: which-cluster-osc-replicas

# orchestrator-client -c which-cluster-osc-replicas -i test

# orchestrator-client -c which-cluster-osc-replicas -i test2:3307

17. Print the healthy (running) replicas in a cluster that can be used for a pt-online-schema-change operation: which-cluster-osc-running-replicas

# orchestrator-client -c which-cluster-osc-running-replicas -i test

# orchestrator-client -c which-cluster-osc-running-replicas -i test1:3307

18. Print all downtimed instances: downtimed

# orchestrator-client -c downtimed

19. Print the data center where the cluster masters are located: dominant-dc

# orchestrator-client -c dominant-dc

20. Submit the cluster masters to the KV stores: submit-masters-to-kv-stores

# orchestrator-client -c submit-masters-to-kv-stores 

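21. Relocate a replica under another instance: relocate


# orchestrator-client -c relocate -i test3:3307 -d test1:3307   # relocates test3:3307

Check the result:
# orchestrator-client -c topology -i test2:3307

22. Relocate all replicas of one instance under another instance: relocate-replicas

# orchestrator-client -c relocate-replicas -i test1:3307 -d test2:3307   # relocates all replicas of test1:3307 under test2:3307 and lists the instance names of the relocated replicas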
test3:3307
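23. Move a replica one level up the topology; in the web UI this corresponds to dragging under Classic Model: move-up
# orchestrator-client -c move-up -i test3:3307 -d test2:3307

24. Move a replica one level down the topology (below one of its siblings); in the web UI this corresponds to dragging under Classic Model: move-below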

# orchestrator-client -c move-below -i test3:3307 -d test1:3307


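25. Move all replicas of a given instance one level up the topology, based on Classic Model: move-up-replicas

# orchestrator-client -c move-up-replicas -i test1:3307  

26. Set up master-master replication by making the given instance a co-master of the current master: make-co-master

# orchestrator-client -c make-co-master -i test1:3307
27. Turn an instance into the master of its own master, swapping the two: take-master

# orchestrator-client -c take-master -i test3:3307
28. Move a replica using GTID: move-gtid

Running it through orchestrator-client fails: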

# orchestrator-client -c move-gtid -i test3:3307 -d test1:3307
parse error: Invalid numeric literal at line 1, column 9
parse error: Invalid numeric literal at line 1, column 9
parse error: Invalid numeric literal at line 1, column 9
Running it through orchestrator works, provided the --ignore-raft-setup flag is added:

# orchestrator -c move-gtid -i test3:3307 -d test2:3307 --ignore-raft-setup
test3:3307<test2:3307
29. Move all replicas of a given instance under another instance using GTID: move-replicas-gtid

Running it through orchestrator-client fails:

# orchestrator-client -c move-replicas-gtid -i test3:3307 -d test1:3307
jq: error (at <stdin>:1): Cannot index string with string "Key"
Running it through orchestrator works, provided the --ignore-raft-setup flag is added:

# ./orchestrator -c move-replicas-gtid -i test2:3307 -d test1:3307 --ignore-raft-setup
30. Turn the siblings of a given instance into its replicas: take-siblings

# orchestrator-client -c take-siblings -i test3:3307

31. Tag a given instance: tag

# orchestrator-client -c tag -i test1:3307 --tag 'name=AAA'
32. List the tags of a given instance: tags

# orchestrator-client -c tags -i test1:3307
name=AAA 
33. Print the value of a tag on a given instance: tag-value

# orchestrator-client -c tag-value -i test1:3307 --tag "name"
AAA
34. Remove a tag from a given instance: untag

# orchestrator-client -c untag -i test1:3307 --tag "name=AAA"
35. List the instances carrying a given tag: tagged

# orchestrator-client -c tagged -t name
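
Tagging is usually applied to several instances at once. As a small sketch, the loop below applies one tag to a list of instances using the tag command shown in item 31. The function name tag_instances, the ORC_CLIENT variable and the role=backup tag in the usage line are made up for this example.

```shell
# Sketch only: tag_instances and ORC_CLIENT are hypothetical names.
ORC_CLIENT="${ORC_CLIENT:-orchestrator-client}"

# tag_instances TAG INSTANCE [INSTANCE...]
# Applies TAG (name or name=value) to every listed instance and
# reports, without stopping, any instance that could not be tagged.
tag_instances() {
    tag="$1"
    shift
    for instance in "$@"; do
        "$ORC_CLIENT" -c tag -i "$instance" --tag "$tag" \
            || echo "failed to tag $instance" >&2
    done
}

# Usage:
# tag_instances 'role=backup' test1:3307 test3:3307
# orchestrator-client -c tagged -t role
```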

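36. Downtime a given instance, recording a duration, an owner and a reason: begin-downtime

# orchestrator-client -c begin-downtime -i test1:3307 -duration=10m -owner=zjy -reason 'test'
37. End the downtime of a given instance: end-downtime

# orchestrator-client -c end-downtime -i test1:3307
38. Request a maintenance lock on a given instance: begin-maintenance. Topology changes place the lock on the minimal set of affected instances, so that two uncoordinated operations cannot run on the same instance.

# orchestrator-client -c begin-maintenance -i test1:3307 --reason "XXX"
The lock expires after 10 minutes by default; this is governed by the MaintenanceExpireMinutes parameter.

39. Remove the maintenance lock from a given instance: end-maintenance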

# orchestrator-client -c end-maintenance -i test1:3307
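40. Set a promotion rule so that a particular instance can be favored for promotion during recovery: register-candidate, used together with promotion-rule

# orchestrator-client -c register-candidate -i test3:3307 --promotion-rule prefer 
This raises the promotion priority of test3:3307, so that it is preferred as the new master if a failover takes place.

41. Stop replication on a given instance: stop-replica
# orchestrator-client -c stop-replica -i test2:3307
Stop replication only after the relay log has been fully applied: stop-replica-nice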

# orchestrator-client -c stop-replica-nice -i test2:3307
42. Start replication on a given instance: start-replica

# orchestrator-client -c start-replica -i test2:3307
43. Restart replication on a given instance: restart-replica

# orchestrator-client -c restart-replica -i test2:3307
44. Reset replication on a given instance: reset-replica

# orchestrator-client -c reset-replica -i test2:3307
45. Detach a replica by modifying its binlog position (non-GTID): detach-replica

# orchestrator-client -c detach-replica -i test2:3307
46. Reattach the replica: reattach-replica

# orchestrator-client -c reattach-replica  -i test2:3307 
47. Detach a replica by commenting out its master_host, e.g. Master_Host: //test1: detach-replica-master-host

# orchestrator-client -c detach-replica-master-host -i test2:3307
48. Reattach the replica: reattach-replica-master-host

# orchestrator-client -c reattach-replica-master-host -i test2:3307
49. Skip the current query of the SQL thread, e.g. on a primary key conflict; works with and without GTID: skip-query

# orchestrator-client -c skip-query -i test2:3307
50. Inject the errant GTID transactions as empty transactions on the replica's master (the "fix" action in the web UI): gtid-errant-inject-empty

# orchestrator-client -c gtid-errant-inject-empty  -i test2:3307
test2:3307 
51. Remove the errant GTID transactions by means of RESET MASTER: gtid-errant-reset-master

# orchestrator-client -c gtid-errant-reset-master  -i test2:3307
test2:3307
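52. Semi-sync settings:

orchestrator-client -c $variable -i test1:3307
    enable-semi-sync-master      enable semi-sync on the master
    disable-semi-sync-master      disable semi-sync on the master
    enable-semi-sync-replica       enable semi-sync on the replica
    disable-semi-sync-replica      disable semi-sync on the replica
53. Generate the statements for SQL that has to be wrapped in stop/start slave: restart-replica-statements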


# orchestrator-client -c restart-replica-statements -i test3:3307 -query "change master to master_auto_position=1" | jq .[] -r 
stop slave io_thread;
stop slave sql_thread;
change master to master_auto_position=1;
start slave sql_thread;
start slave io_thread;

# orchestrator-client -c restart-replica-statements -i test3:3307 -query "change master to master_auto_position=1" | jq .[] -r  |  mysql -urep -p -htest3 -P3307

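54. Check, according to the replication rules, whether an instance can replicate from another instance (with and without GTID):

Without GTID, can-replicate-from:

# orchestrator-client -c can-replicate-from -i test3:3307 -d test1:3307

With GTID, can-replicate-from-gtid:

# orchestrator-client -c can-replicate-from-gtid -i test3:3307 -d test1:3307
55. Check whether a given instance is replicating: is-replicating

# output is returned: the instance is replicating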
# orchestrator-client -c is-replicating -i test2:3307

# no output: the instance is not replicating
# orchestrator-client -c is-replicating -i test1:3307
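
Combining which-replicas with is-replicating gives a quick health check for one master. The sketch below lists the replicas of a master for which is-replicating prints nothing, following the convention described above (output means replicating, no output means not replicating). The function name non_replicating_replicas and the ORC_CLIENT variable are made up for this example.

```shell
# Sketch only: non_replicating_replicas and ORC_CLIENT are hypothetical names.
ORC_CLIENT="${ORC_CLIENT:-orchestrator-client}"

# non_replicating_replicas MASTER
# Prints every replica of MASTER for which is-replicating returns no output.
non_replicating_replicas() {
    master="$1"
    "$ORC_CLIENT" -c which-replicas -i "$master" | while read -r replica; do
        [ -n "$replica" ] || continue
        # </dev/null keeps the inner call from consuming the replica list on stdin.
        if [ -z "$("$ORC_CLIENT" -c is-replicating -i "$replica" </dev/null)" ]; then
            echo "$replica"
        fi
    done
}

# Usage:
# non_replicating_replicas test2:3307
```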
56. Check whether both the IO and SQL threads of a given instance are stopped: is-replication-stopped

# orchestrator-client -c is-replication-stopped -i test2:3307
57. Set a given instance read-only via SET GLOBAL read_only=1: set-read-only

# orchestrator-client -c set-read-only -i test2:3307
58. Set a given instance writable via SET GLOBAL read_only=0: set-writeable

# orchestrator-client -c set-writeable -i test2:3307
59. Flush (rotate) the binary logs of a given instance: flush-binary-logs

# orchestrator-client -c flush-binary-logs -i test1:3307

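60. Run a recovery manually, given a dead instance: recover

# orchestrator-client -c recover -i test2:3307


61. Gracefully switch the master with a designated replica: graceful-master-takeover

# orchestrator-client -c graceful-master-takeover -a test1:3307 -d test2:3307

62. Force a recovery manually, even if orchestrator sees no problem: force-master-failover. After the switch the old master is left standalone and has to be added back to the cluster manually.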

# orchestrator-client -c force-master-failover -i test1:3307
test3:3307
63. Forcibly discard the master and promote a designated instance: force-master-takeover. The old master (test1) is left standalone and the designated replica (test2) is promoted to master.

# orchestrator-client -c force-master-takeover -i test1:3307 -d test2:3307
64. Acknowledge recoveries with a reason; this matches the Acknowledged button under Audit->Recovery in the web UI: ack-cluster-recoveries/ack-all-recoveries
Acknowledge a given cluster: ack-cluster-recoveries
# orchestrator-client -c ack-cluster-recoveries  -i test2:3307 -reason=''
Acknowledge all clusters: ack-all-recoveries 
# orchestrator-client -c ack-all-recoveries  -reason='OOOPPP'

65. Check, disable and enable global recoveries by orchestrator:
Check: check-global-recoveries

# orchestrator-client -c check-global-recoveries
enabled
Disable: disable-global-recoveries
# orchestrator-client -c disable-global-recoveries
Enable: enable-global-recoveries
# orchestrator-client -c enable-global-recoveries
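
The three global-recovery commands can be combined into a guard around risky manual work. The sketch below records the current state, disables global recoveries, runs a command, and re-enables recoveries only if they were enabled beforehand. It relies on check-global-recoveries printing enabled, as in the output above; the function name without_global_recoveries, the ORC_CLIENT variable and the script in the usage line are made up for this example.

```shell
# Sketch only: without_global_recoveries and ORC_CLIENT are hypothetical names.
ORC_CLIENT="${ORC_CLIENT:-orchestrator-client}"

# without_global_recoveries COMMAND [ARGS...]
without_global_recoveries() {
    # Remember the state so a deliberately disabled setup stays disabled.
    previous="$("$ORC_CLIENT" -c check-global-recoveries)" || return 1
    "$ORC_CLIENT" -c disable-global-recoveries >/dev/null || return 1
    status=0
    "$@" || status=$?
    if [ "$previous" = "enabled" ]; then
        "$ORC_CLIENT" -c enable-global-recoveries >/dev/null
    fi
    return "$status"
}

# Usage (hypothetical script):
# without_global_recoveries ./rebuild_replica.sh test3:3307
```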

66. Analyze the replication topologies for existing problems: replication-analysis
# orchestrator-client -c replication-analysis

67. Raft: show the leader, check health, move the leader:
Show the leader node
# orchestrator-client -c raft-leader

Health check
# orchestrator-client -c raft-health

Leader hostname
# orchestrator-client -c  raft-leader-hostname 

Elect a given host as leader
# orchestrator-client -c raft-elect-leader -hostname test3

68. Pseudo-GTID commands:
match      # using Pseudo-GTID, match a given replica below another given (destination) instance