拓扑介绍:
192.168.43.141 Master
192.168.43.142 Slave01
192.168.43.143 Slave02
192.168.43.144 Mha-manager
vip (192.168.43.145) vip
Master(主) --> Slave01(从、备主)
| <- Mha-manager(monitor-MHA)
|- - -> Slave02(从)
OS : CentOS release 6.10 (Final) Mini
Perl : v5.10.1 x86_64-linux-thread-multi(系统版本自带)
Gateway : 192.168.43.1
MySQL : mysql-5.7.31-1.el6.x86_64.rpm-bundle.tar
MHA : mha4mysql-manager-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56-0.el6.noarch.rpm
MHA官方github : https://github.com/yoshinorim/mha4mysql-manager/wiki
MySQL主从复制模式: 我们选用半同步机制
一、系统调整(所有节点)
1.修改/etc/hosts
[root@mha-manager ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.43.141 Master
192.168.43.142 Slave01
192.168.43.143 Slave02
192.168.43.144 Mha-manager
2.关闭防火墙及SELINUX
[root@mha-manager ~]# /etc/init.d/iptables stop
[root@mha-manager ~]# getenforce
Disabled
3.安装基本工具
[root@mha-manager ~]# yum install ntp vim wget net-tools mlocate openssh-* cc gcc make -y
二、安装MHA所需依赖(所有节点)
1.配置SSH免密登录
[root@ ~]# ssh-keygen -t rsa
[root@ ~]# for i in Master Slave01 Slave02 Mha-manager;do ssh-copy-id $i;done
2.测试免密登录
[root@ ~]# for i in Master Slave01 Slave02 Mha-manager;do ssh $i hostname;done
master
slave01
slave02
mha-manager
3.配置NTP服务器
[root@slave01 opt]# yum install ntp -y
[root@slave01 opt]# cat /etc/ntp.conf | grep 'server'
server ntp.aliyun.com
server ntp2.aliyun.com
[root@slave01 opt]# ntpdate ntp.aliyun.com
20 Aug 13:34:06 ntpdate[31767]: step time server 203.107.6.88 offset -1920.602438 sec
[root@slave01 opt]# date
Thu Aug 20 13:34:09 CST 2020
[root@slave01 opt]# /etc/init.d/ntpd start
Starting ntpd: [ OK ]
[root@slave01 opt]# chkconfig ntpd on
[root@slave01 opt]#
三、基于半同步的Mysql主从复制的安装
(一)半同步插件的安装与开启
0.安装mysql(所有数据库节点)
[root@slave01 opt]# tar -xf mysql-5.7.31-1.el6.x86_64.rpm-bundle.tar
[root@slave01 opt]# yum install -y mysql-community-*
1.启动MySQL,改修数据库root密码(所有数据库节点)
[root@master ~]# /etc/init.d/mysqld start
Initializing MySQL database: [ OK ]
Starting mysqld: [ OK ]
mysql> alter user 'root'@'localhost' identified by 'Test.123';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
2.安装半同步插件(所有数据库节点)
(1)查找半同步所需插件
mysql> show variables like '%plugin_dir%'; #查找插件所在目录(每台DB服务器可能不一样)
+---------------+--------------------------+
| Variable_name | Value |
+---------------+--------------------------+
| plugin_dir | /usr/lib64/mysql/plugin/ |
+---------------+--------------------------+
1 row in set (0.00 sec)
mysql> show variables like '%have_dynamic%'; #检查是否支持动态检测
+----------------------+-------+
| Variable_name | Value |
+----------------------+-------+
| have_dynamic_loading | YES |
+----------------------+-------+
1 row in set (0.01 sec)
mysql> system ls /usr/lib64/mysql/plugin/ | egrep 'master|slave'
semisync_master.so
semisync_slave.so
(2)安装插件
mysql> install plugin rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.00 sec)
mysql> install plugin rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)
(3)检查Plugin是否已正确安装
mysql> show plugins;
+----------------------------+----------+--------------------+----------------------+---------+
| Name | Status | Type | Library | License |
+----------------------------+----------+--------------------+----------------------+---------+
| FEDERATED | DISABLED | STORAGE ENGINE | NULL | GPL |
| ngram | ACTIVE | FTPARSER | NULL | GPL |
| validate_password | ACTIVE | VALIDATE PASSWORD | validate_password.so | GPL |
| rpl_semi_sync_master | ACTIVE | REPLICATION | semisync_master.so | GPL |
| rpl_semi_sync_slave | ACTIVE | REPLICATION | semisync_slave.so | GPL |
+----------------------------+----------+--------------------+----------------------+---------+
47 rows in set (0.00 sec)
mysql> show variables like '%rpl_semi_sync%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | OFF |
| rpl_semi_sync_master_timeout | 10000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
| rpl_semi_sync_slave_enabled | OFF |
| rpl_semi_sync_slave_trace_level | 32 |
+-------------------------------------------+------------+
8 rows in set (0.01 sec)
mysql>
3.MySQL配置文件的修改
Master配置:
[root@master ~]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
server_id=1
log-bin=mysql-bin
binlog_format=mixed
rpl_semi_sync_master_enabled=1
rpl_semi_sync_master_timeout=1000
rpl_semi_sync_slave_enabled=1
relay_log_purge=0
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
default-storage-engine = INNODB
character-set-server = utf8
collation-server = utf8_general_ci
relay_log_purge = 0
[client]
default-character-set = utf8
Slave01配置:
[root@slave01 ~]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
server_id=2
log-bin=mysql-bin
binlog_format=mixed
rpl_semi_sync_master_enabled=1
rpl_semi_sync_master_timeout=1000
rpl_semi_sync_slave_enabled=1
relay_log_purge=0
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
default-storage-engine = INNODB
character-set-server = utf8
collation-server = utf8_general_ci
relay_log_purge = 0
Slave02配置:
#由于slave02只是用来做一个slave主机,所以无需开启master的半同步
[root@slave02 ~]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
server_id=3
log-bin = mysql-bin
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
read_only = 1
rpl_semi_sync_slave_enabled=1
default-storage-engine = INNODB
character-set-server = utf8
collation-server = utf8_general_ci
relay_log_purge = 0
4.查看半同步状态是否开启
[root@ ~]# /etc/init.d/mysqld restart
Initializing MySQL database: [ OK ]
Starting mysqld: [ OK ]
mysql> show variables like '%rpl_semi_sync%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | 1000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
| rpl_semi_sync_slave_enabled | ON |
| rpl_semi_sync_slave_trace_level | 32 |
+-------------------------------------------+------------+
8 rows in set (0.01 sec)
(二)配置主从复制
1.创建用户(Master上配置)
(1)插件主从复制同步用户
mysql> grant replication slave on *.* to mharep@'192.168.43.%' identified by 'Test.123';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> show warnings\G;
*************************** 1. row ***************************
Level: Warning
Code: 1287
Message: Using GRANT statement to modify existing user's properties other than privileges is deprecated and will be removed in future release. Use ALTER USER statement for this operation.
1 row in set (0.00 sec)
(2)创建用户mha的manager监控的用户
mysql> grant all on *.* to manager@'192.168.43.%' identified by 'Test.123';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> show warnings\G;
*************************** 1. row ***************************
Level: Warning
Code: 1287
Message: Using GRANT for creating new user is deprecated and will be removed in future release. Create new user with CREATE USER statement.
1 row in set (0.00 sec)
(3)查看master二进制相关的信息
mysql> show master status\G;
*************************** 1. row ***************************
File: mysql-bin.000001
Position: 1040
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0.00 sec)
2.创建用户(Slave01上配置)
(1)创建用于同步的用户
mysql> grant replication slave on *.* to mharep@'192.168.43.%' identified by 'Test.123';
Query OK, 0 rows affected, 1 warning (1.00 sec)
(2)创建用户mha的manager监控的用户
mysql> grant all on *.* to manager@'192.168.43.%' identified by 'Test.123';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql>
3.创建用户(Slave02上配置)
由于slave02无需做备主,所以不用创建用于同步数据的账户
#创建manager监控账号
mysql> grant all on *.* to manager@'192.168.43.%' identified by 'Test.123';
Query OK, 0 rows affected, 1 warning (0.01 sec)
4.分别在Slave01与Slave02上执行:
Master上查日志文件和位置:
mysql> show master status\G;
*************************** 1. row ***************************
File: mysql-bin.000001
Position: 1040
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0.00 sec)
Slave01上执行:
mysql> change master to master_host='192.168.43.141', master_port=3306, master_user='mharep',master_password='Test.123', master_log_file = 'mysql-bin.000001', master_log_pos=1040;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.43.141
Master_User: mharep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 1040
Relay_Log_File: relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1040
Relay_Log_Space: 521
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 6882c9b1-e23a-11ea-b7a9-000c29b26880
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
Slave02上执行:
mysql> change master to master_host='192.168.43.141', master_port=3306, master_user='mharep',master_password='Test.123', master_log_file = 'mysql-bin.000001', master_log_pos=1040;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.06 sec)
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.43.141
Master_User: mharep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 1040
Relay_Log_File: relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1040
Relay_Log_Space: 521
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 6882c9b1-e23a-11ea-b7a9-000c29b26880
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
MASTER上查看:
mysql> show status like '%rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 2 |
| Rpl_semi_sync_master_net_avg_wait_time | 0 |
| Rpl_semi_sync_master_net_wait_time | 0 |
| Rpl_semi_sync_master_net_waits | 0 |
| Rpl_semi_sync_master_no_times | 1 |
| Rpl_semi_sync_master_no_tx | 3 |
| Rpl_semi_sync_master_status | ON |
| Rpl_semi_sync_master_timefunc_failures | 0 |
| Rpl_semi_sync_master_tx_avg_wait_time | 0 |
| Rpl_semi_sync_master_tx_wait_time | 0 |
| Rpl_semi_sync_master_tx_waits | 0 |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0 |
| Rpl_semi_sync_master_wait_sessions | 0 |
| Rpl_semi_sync_master_yes_tx | 0 |
| Rpl_semi_sync_slave_status | OFF |
+--------------------------------------------+-------+
15 rows in set (0.02 sec)
四、安装MHA
1.安装mha4mysql-node(所有节点)
以下以slave01为示例:
[root@slave01 mha4mysql]# yum install -y perl-DBD-MySQL
[root@slave01 mha4mysql]# pwd
/opt/mha4mysql
[root@slave01 mha4mysql]# ls
mha4mysql-manager-0.56-0.el6.noarch.rpm mha4mysql-node-0.56-0.el6.noarch.rpm
[root@slave01 mha4mysql]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
[root@slave01 mha4mysql]# rpm -qa | grep mha
mha4mysql-node-0.56-0.el6.noarch
2.安装mha4mysql-manager(manager节点)
[root@mha-manager opt]# yum install -y perl-DBD-MySQL
[root@mha-manager opt]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
Preparing... ########################################### [100%]
1:mha4mysql-node ########################################### [100%]
[root@mha-manager opt]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
error: Failed dependencies:
perl(Log::Dispatch) is needed by mha4mysql-manager-0.56-0.el6.noarch
perl(Log::Dispatch::File) is needed by mha4mysql-manager-0.56-0.el6.noarch
perl(Log::Dispatch::Screen) is needed by mha4mysql-manager-0.56-0.el6.noarch
perl(Parallel::ForkManager) is needed by mha4mysql-manager-0.56-0.el6.noarch
[root@mha-manager opt]# yum install epel-release.noarch -y
[root@mha-manager opt]# yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager install perl-Time-HiRes
[root@mha-manager opt]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
Preparing... ########################################### [100%]
1:mha4mysql-manager ########################################### [100%]
[root@mha-manager opt]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
[root@mha-manager mha4mysql]# rpm -qa | grep mha
mha4mysql-node-0.56-0.el6.noarch
mha4mysql-manager-0.56-0.el6.noarch
3.处理mha4mysql-manager配置文件(manager节点)
[root@mha-manager opt]# mkdir /etc/masterha
[root@mha-manager opt]# mkdir -p /masterha/app1
[root@mha-manager opt]# mkdir /scripts
[root@mha-manager opt]# ls
mha4mysql-manager-0.56-0.el6.noarch.rpm mha4mysql-manager-0.56.tar.gz mha4mysql-node-0.56-0.el6.noarch.rpm
[root@mha-manager opt]# tar -xf mha4mysql-manager-0.56.tar.gz
[root@mha-manager opt]# ls
mha4mysql-manager-0.56 mha4mysql-manager-0.56.tar.gz
mha4mysql-manager-0.56-0.el6.noarch.rpm mha4mysql-node-0.56-0.el6.noarch.rpm
[root@mha-manager opt]# cd mha4mysql-manager-0.56
[root@mha-manager mha4mysql-manager-0.56]# pwd
/opt/mha4mysql-manager-0.56
[root@mha-manager mha4mysql-manager-0.56]# ls
AUTHORS bin COPYING debian inc lib Makefile.PL MANIFEST META.yml README rpm samples t tests
[root@mha-manager mha4mysql-manager-0.56]# cp samples/conf/* /etc/masterha/
[root@mha-manager mha4mysql-manager-0.56]# cp samples/scripts/* /scripts/
[root@mha-manager mha4mysql-manager-0.56]# > /etc/masterha/masterha_default.cnf
修改mha-manager配置文件:
[root@mha-manager mha4mysql-manager-0.56]# cat /etc/masterha/app1.cnf
[server default]
manager_workdir=/masterha/app1 #指定工作目录
manager_log=/masterha/app1/manager.log #指定日志文件
user=manager #指定manager管理数据库节点所使用的用户名
password=Test.123 #对应的是上面用户的密码
ssh_user=root #指定配置了ssh免密登录的系统用户
repl_user=mharep #指定用于同步数据的用户名
repl_password=Test.123 #对应的是上面同步用户的 密码
ping_interval=1 #设置监控主库,发送ping包的时间间隔,默认是3秒,尝试三次没有回应时自动进行切换
master_ip_failover_script=/scripts/master_ip_failover # 该脚本文件请见附录(文末)
secondary_check_script= masterha_secondary_check -s 192.168.43.141 -s 192.168.43.142 -s 192.168.43.143
[server1]
hostname=192.168.43.141
port=3306
master_binlog_dir=/var/lib/mysql #指定master保存二进制日志的路径,以便MHA可以找到master的日志
candidate_master=1 #设置为候选master,设置该参数后,发生主从切换以后将会将此库提升为主库
[server2]
hostname=192.168.43.142
port=3306
master_binlog_dir=/var/lib/mysql
candidate_master=1 #设置为候选master
[server3]
hostname=192.168.43.143
port=3306
master_binlog_dir=/var/lib/mysql
no_master=1 #设置的不为备选主库
4.启动前测试
(1)验证SSH有效性
[root@mha-manager mha4mysql-manager-0.56]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Can't locate MHA/SSHCheck.pm in @INC (you may need to install the MHA::SSHCheck module) (@INC contains: /usr/local/perl/lib/site_perl/5.20.3/x86_64-linux /usr/local/perl/lib/site_perl/5.20.3 /usr/local/perl/lib/5.20.3/x86_64-linux /usr/local/perl/lib/5.20.3 .) at /usr/bin/masterha_check_ssh line 25.
BEGIN failed--compilation aborted at /usr/bin/masterha_check_ssh line 25.
[root@mha-manager mha4mysql-manager-0.56]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Aug 20 05:18:17 2020 - [info] Reading default configuration from /etc/masterha/masterha_default.cnf..
Thu Aug 20 05:18:17 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Aug 20 05:18:17 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Aug 20 05:18:17 2020 - [info] Starting SSH connection tests..
Thu Aug 20 05:18:18 2020 - [debug]
Thu Aug 20 05:18:17 2020 - [debug] Connecting via SSH from root@192.168.43.141(192.168.43.141:22) to root@192.168.43.142(192.168.43.142:22)..
Thu Aug 20 05:18:17 2020 - [debug] ok.
Thu Aug 20 05:18:17 2020 - [debug] Connecting via SSH from root@192.168.43.141(192.168.43.141:22) to root@192.168.43.143(192.168.43.143:22)..
Thu Aug 20 05:18:18 2020 - [debug] ok.
Thu Aug 20 05:18:18 2020 - [debug]
Thu Aug 20 05:18:18 2020 - [debug] Connecting via SSH from root@192.168.43.142(192.168.43.142:22) to root@192.168.43.141(192.168.43.141:22)..
Thu Aug 20 05:18:18 2020 - [debug] ok.
Thu Aug 20 05:18:18 2020 - [debug] Connecting via SSH from root@192.168.43.142(192.168.43.142:22) to root@192.168.43.143(192.168.43.143:22)..
Thu Aug 20 05:18:18 2020 - [debug] ok.
Thu Aug 20 05:18:19 2020 - [debug]
Thu Aug 20 05:18:18 2020 - [debug] Connecting via SSH from root@192.168.43.143(192.168.43.143:22) to root@192.168.43.141(192.168.43.141:22)..
Thu Aug 20 05:18:18 2020 - [debug] ok.
Thu Aug 20 05:18:18 2020 - [debug] Connecting via SSH from root@192.168.43.143(192.168.43.143:22) to root@192.168.43.142(192.168.43.142:22)..
Thu Aug 20 05:18:19 2020 - [debug] ok.
Thu Aug 20 05:18:19 2020 - [info] All SSH connection tests passed successfully.
[root@mha-manager mha4mysql-manager-0.56]#
(2)验证集群复制的有效性(MySQL必须都启动)
[root@mha-manager mha4mysql-manager-0.56]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Aug 20 15:50:32 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug 20 15:50:32 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Aug 20 15:50:32 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Aug 20 15:50:32 2020 - [info] MHA::MasterMonitor version 0.56.
Thu Aug 20 15:50:33 2020 - [info] GTID failover mode = 0
Thu Aug 20 15:50:33 2020 - [info] Dead Servers:
Thu Aug 20 15:50:33 2020 - [info] Alive Servers:
Thu Aug 20 15:50:33 2020 - [info] 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 15:50:33 2020 - [info] 192.168.43.142(192.168.43.142:3306)
Thu Aug 20 15:50:33 2020 - [info] 192.168.43.143(192.168.43.143:3306)
Thu Aug 20 15:50:33 2020 - [info] Alive Slaves:
Thu Aug 20 15:50:33 2020 - [info] 192.168.43.142(192.168.43.142:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 15:50:33 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 15:50:33 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Aug 20 15:50:33 2020 - [info] 192.168.43.143(192.168.43.143:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 15:50:33 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 15:50:33 2020 - [info] Not candidate for the new Master (no_master is set)
Thu Aug 20 15:50:33 2020 - [info] Current Alive Master: 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 15:50:33 2020 - [info] Checking slave configurations..
Thu Aug 20 15:50:33 2020 - [info] read_only=1 is not set on slave 192.168.43.142(192.168.43.142:3306).
Thu Aug 20 15:50:33 2020 - [warning] relay_log_purge=0 is not set on slave 192.168.43.143(192.168.43.143:3306).
Thu Aug 20 15:50:33 2020 - [info] Checking replication filtering settings..
Thu Aug 20 15:50:33 2020 - [info] binlog_do_db= , binlog_ignore_db=
Thu Aug 20 15:50:33 2020 - [info] Replication filtering check ok.
Thu Aug 20 15:50:33 2020 - [info] GTID (with auto-pos) is not supported
Thu Aug 20 15:50:33 2020 - [info] Starting SSH connection tests..
Thu Aug 20 15:50:34 2020 - [info] All SSH connection tests passed successfully.
Thu Aug 20 15:50:34 2020 - [info] Checking MHA Node version..
Thu Aug 20 15:50:35 2020 - [info] Version check ok.
Thu Aug 20 15:50:35 2020 - [info] Checking SSH publickey authentication settings on the current master..
Thu Aug 20 15:50:35 2020 - [info] HealthCheck: SSH to 192.168.43.141 is reachable.
Thu Aug 20 15:50:35 2020 - [info] Master MHA Node version is 0.56.
Thu Aug 20 15:50:35 2020 - [info] Checking recovery script configurations on 192.168.43.141(192.168.43.141:3306)..
Thu Aug 20 15:50:35 2020 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000001
Thu Aug 20 15:50:35 2020 - [info] Connecting to root@192.168.43.141(192.168.43.141:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to mysql-bin.000001
Thu Aug 20 15:50:35 2020 - [info] Binlog setting check done.
Thu Aug 20 15:50:35 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Aug 20 15:50:35 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='manager' --slave_host=192.168.43.142 --slave_ip=192.168.43.142 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.31-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Thu Aug 20 15:50:35 2020 - [info] Connecting to root@192.168.43.142(192.168.43.142:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to relay-bin.000004
Temporary relay log file is /var/lib/mysql/relay-bin.000004
Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Aug 20 15:50:35 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='manager' --slave_host=192.168.43.143 --slave_ip=192.168.43.143 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.31-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Thu Aug 20 15:50:35 2020 - [info] Connecting to root@192.168.43.143(192.168.43.143:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to relay-bin.000004
Temporary relay log file is /var/lib/mysql/relay-bin.000004
Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Aug 20 15:50:36 2020 - [info] Slaves settings check done.
Thu Aug 20 15:50:36 2020 - [info]
192.168.43.141(192.168.43.141:3306) (current master)
+--192.168.43.142(192.168.43.142:3306)
+--192.168.43.143(192.168.43.143:3306)
Thu Aug 20 15:50:36 2020 - [info] Checking replication health on 192.168.43.142..
Thu Aug 20 15:50:36 2020 - [info] ok.
Thu Aug 20 15:50:36 2020 - [info] Checking replication health on 192.168.43.143..
Thu Aug 20 15:50:36 2020 - [info] ok.
Thu Aug 20 15:50:36 2020 - [warning] master_ip_failover_script is not defined.
Thu Aug 20 15:50:36 2020 - [warning] shutdown_script is not defined.
Thu Aug 20 15:50:36 2020 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
5.启动manager
[root@mha-manager masterha]# nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /var/log/mha_manager.log
五、管理MHA
1.FailOver的配置
MHA管理VIP有两种方案,一种是使用Keepalived,另一种是自己写命令实现增删VIP,由于Keepalived容易受到网络波动造成VIP切换,而且无法在多实例机器上使用,所以建议写脚本管理VIP。
使用脚本管理 VIP 不会自动设置 VIP,所以先手动在 Master 设置 VIP
这里我们先在Master上设置我们的vip:192.168.43.145
ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255
[root@master opt]# ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255
[root@master opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0
inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1
inet6 fe80::20c:29ff:feb2:6880/64 scope link
valid_lft forever preferred_lft forever
[root@master opt]#
2.编写master_ip_failover(见附录)
3.分析现在的拓扑结构以便后续测试的理解:
[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G' | grep Master_Host
mysql: [Warning] Using a password on the command line interface can be insecure.
Master_Host: 192.168.43.141
[root@slave02 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G' | grep Master_Host
mysql: [Warning] Using a password on the command line interface can be insecure.
Master_Host: 192.168.43.141
[root@slave02 opt]#
可以看出当前的拓扑结构:
Master -> (192.168.43.141)
Slave -> (192.168.43.142,192.168.43.143)
vip -> 192.168.43.145(现在master上,failover后应该漂移到slave01上)
Mha-manager -> 192.168.43.144
4.开始模拟Master切换过程
(1)确保Mha-manager上的监控进程正常运行
[root@mha-manager masterha]# ps -ef | grep perl
root 2514 1604 1 18:32 pts/0 00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha/app1.cnf
root 2573 1604 0 18:33 pts/0 00:00:00 grep perl
(2)查看Master上的IP
[root@master opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0
inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1
inet6 fe80::20c:29ff:feb2:6880/64 scope link
valid_lft forever preferred_lft forever
(3)关闭Master上的mysql服务
[root@master opt]# /etc/init.d/mysqld stop
Stopping mysqld: [ OK ]
[root@master opt]#
(4)观察IP漂移情况
[root@master opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0
inet6 fe80::20c:29ff:feb2:6880/64 scope link
valid_lft forever preferred_lft forever
[root@slave01 opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0
inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1
inet6 fe80::20c:29ff:fefb:7918/64 scope link
valid_lft forever preferred_lft forever
可以观察:vip已经漂移到slave01上
(5)观察Master角色的转移
[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G' | grep Master_Host
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@slave01 opt]#
[root@slave02 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G' | grep Master_Host
mysql: [Warning] Using a password on the command line interface can be insecure.
Master_Host: 192.168.43.142
可以观察:Master角色迁移到slave01上
(6)分析日志输出(下面为切换的Mha-manager上的整个日志输出)
Thu Aug 20 18:37:14 2020 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Thu Aug 20 18:37:14 2020 - [info] Executing secondary network check script: masterha_secondary_check -s 192.168.43.141 -s 192.168.43.142 -s 192.168.43.143 --user=root --master_host=192.168.43.141 --master_ip=192.168.43.141 --master_port=3306 --master_user=manager --master_password=Test.123 --ping_type=SELECT
Thu Aug 20 18:37:14 2020 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --binlog_prefix=mysql-bin
Thu Aug 20 18:37:14 2020 - [info] HealthCheck: SSH to 192.168.43.141 is reachable.
Monitoring server 192.168.43.141 is reachable, Master is not reachable from 192.168.43.141. OK.
Monitoring server 192.168.43.142 is reachable, Master is not reachable from 192.168.43.142. OK.
Monitoring server 192.168.43.143 is reachable, Master is not reachable from 192.168.43.143. OK.
Thu Aug 20 18:37:14 2020 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Thu Aug 20 18:37:15 2020 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Aug 20 18:37:15 2020 - [warning] Connection failed 2 time(s)..
Thu Aug 20 18:37:16 2020 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Aug 20 18:37:16 2020 - [warning] Connection failed 3 time(s)..
Thu Aug 20 18:37:17 2020 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Aug 20 18:37:17 2020 - [warning] Connection failed 4 time(s)..
Thu Aug 20 18:37:17 2020 - [warning] Master is not reachable from health checker!
Thu Aug 20 18:37:17 2020 - [warning] Master 192.168.43.141(192.168.43.141:3306) is not reachable!
Thu Aug 20 18:37:17 2020 - [warning] SSH is reachable.
Thu Aug 20 18:37:17 2020 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Thu Aug 20 18:37:17 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug 20 18:37:17 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Aug 20 18:37:17 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Aug 20 18:37:18 2020 - [info] GTID failover mode = 0
Thu Aug 20 18:37:18 2020 - [info] Dead Servers:
Thu Aug 20 18:37:18 2020 - [info] 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:18 2020 - [info] Alive Servers:
Thu Aug 20 18:37:18 2020 - [info] 192.168.43.142(192.168.43.142:3306)
Thu Aug 20 18:37:18 2020 - [info] 192.168.43.143(192.168.43.143:3306)
Thu Aug 20 18:37:18 2020 - [info] Alive Slaves:
Thu Aug 20 18:37:18 2020 - [info] 192.168.43.142(192.168.43.142:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:18 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:18 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Aug 20 18:37:18 2020 - [info] 192.168.43.143(192.168.43.143:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:18 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:18 2020 - [info] Not candidate for the new Master (no_master is set)
Thu Aug 20 18:37:18 2020 - [info] Checking slave configurations..
Thu Aug 20 18:37:18 2020 - [info] read_only=1 is not set on slave 192.168.43.142(192.168.43.142:3306).
Thu Aug 20 18:37:18 2020 - [warning] relay_log_purge=0 is not set on slave 192.168.43.143(192.168.43.143:3306).
Thu Aug 20 18:37:18 2020 - [info] Checking replication filtering settings..
Thu Aug 20 18:37:18 2020 - [info] Replication filtering check ok.
Thu Aug 20 18:37:18 2020 - [info] Master is down!
Thu Aug 20 18:37:18 2020 - [info] Terminating monitoring script.
Thu Aug 20 18:37:18 2020 - [info] Got exit code 20 (Master dead).
Thu Aug 20 18:37:18 2020 - [info] MHA::MasterFailover version 0.56.
Thu Aug 20 18:37:18 2020 - [info] Starting master failover.
Thu Aug 20 18:37:18 2020 - [info]
Thu Aug 20 18:37:18 2020 - [info] * Phase 1: Configuration Check Phase..
Thu Aug 20 18:37:18 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] GTID failover mode = 0
Thu Aug 20 18:37:19 2020 - [info] Dead Servers:
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:19 2020 - [info] Checking master reachability via MySQL(double check)...
Thu Aug 20 18:37:19 2020 - [info] ok.
Thu Aug 20 18:37:19 2020 - [info] Alive Servers:
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.142(192.168.43.142:3306)
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.143(192.168.43.143:3306)
Thu Aug 20 18:37:19 2020 - [info] Alive Slaves:
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.142(192.168.43.142:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:19 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:19 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.143(192.168.43.143:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:19 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:19 2020 - [info] Not candidate for the new Master (no_master is set)
Thu Aug 20 18:37:19 2020 - [info] Starting Non-GTID based failover.
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Aug 20 18:37:19 2020 - [info] Executing master IP deactivation script:
Thu Aug 20 18:37:19 2020 - [info] /scripts/master_ip_failover --orig_master_host=192.168.43.141 --orig_master_ip=192.168.43.141 --orig_master_port=3306 --command=stopssh --ssh_user=root
VIP Command: start=sudo /sbin/ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down
Disabling the VIP on old master: 192.168.43.141
Thu Aug 20 18:37:19 2020 - [info] done.
Thu Aug 20 18:37:19 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Aug 20 18:37:19 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] * Phase 3: Master Recovery Phase..
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] The latest binary log file/position on all slaves is mysql-bin.000005:154
Thu Aug 20 18:37:19 2020 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.142(192.168.43.142:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:19 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:19 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.143(192.168.43.143:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:19 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:19 2020 - [info] Not candidate for the new Master (no_master is set)
Thu Aug 20 18:37:19 2020 - [info] The oldest binary log file/position on all slaves is mysql-bin.000005:154
Thu Aug 20 18:37:19 2020 - [info] Oldest slaves:
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.142(192.168.43.142:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:19 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:19 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Aug 20 18:37:19 2020 - [info] 192.168.43.143(192.168.43.143:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:19 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:19 2020 - [info] Not candidate for the new Master (no_master is set)
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:19 2020 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Thu Aug 20 18:37:19 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] Fetching dead master's binary logs..
Thu Aug 20 18:37:20 2020 - [info] Executing command on the dead master 192.168.43.141(192.168.43.141:3306): save_binary_logs --command=save --start_file=mysql-bin.000005 --start_pos=154 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/saved_master_binlog_from_192.168.43.141_3306_20200820183718.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.56
Creating /var/tmp if not exists.. ok.
Concat binary/relay logs from mysql-bin.000005 pos 154 to mysql-bin.000005 EOF into /var/tmp/saved_master_binlog_from_192.168.43.141_3306_20200820183718.binlog ..
Binlog Checksum enabled
Dumping binlog format description event, from position 0 to 154.. ok.
No need to dump effective binlog data from /var/lib/mysql/mysql-bin.000005 (pos starts 154, filesize 154). Skipping.
Binlog Checksum enabled
/var/tmp/saved_master_binlog_from_192.168.43.141_3306_20200820183718.binlog has no effective data events.
Event not exists.
Thu Aug 20 18:37:20 2020 - [info] Additional events were not found from the orig master. No need to save.
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] * Phase 3.3: Determining New Master Phase..
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Thu Aug 20 18:37:20 2020 - [info] All slaves received relay logs to the same position. No need to resync each other.
Thu Aug 20 18:37:20 2020 - [info] Searching new master from slaves..
Thu Aug 20 18:37:20 2020 - [info] Candidate masters from the configuration file:
Thu Aug 20 18:37:20 2020 - [info] 192.168.43.142(192.168.43.142:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:20 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:20 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Aug 20 18:37:20 2020 - [info] Non-candidate masters:
Thu Aug 20 18:37:20 2020 - [info] 192.168.43.143(192.168.43.143:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:37:20 2020 - [info] Replicating from 192.168.43.141(192.168.43.141:3306)
Thu Aug 20 18:37:20 2020 - [info] Not candidate for the new Master (no_master is set)
Thu Aug 20 18:37:20 2020 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Thu Aug 20 18:37:20 2020 - [info] New master is 192.168.43.142(192.168.43.142:3306)
Thu Aug 20 18:37:20 2020 - [info] Starting master failover..
Thu Aug 20 18:37:20 2020 - [info]
From:
192.168.43.141(192.168.43.141:3306) (current master)
+--192.168.43.142(192.168.43.142:3306)
+--192.168.43.143(192.168.43.143:3306)
To:
192.168.43.142(192.168.43.142:3306) (new master)
+--192.168.43.143(192.168.43.143:3306)
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] * Phase 3.4: Master Log Apply Phase..
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Thu Aug 20 18:37:20 2020 - [info] Starting recovery on 192.168.43.142(192.168.43.142:3306)..
Thu Aug 20 18:37:20 2020 - [info] This server has all relay logs. Waiting all logs to be applied..
Thu Aug 20 18:37:20 2020 - [info] done.
Thu Aug 20 18:37:20 2020 - [info] All relay logs were successfully applied.
Thu Aug 20 18:37:20 2020 - [info] Getting new master's binlog name and position..
Thu Aug 20 18:37:20 2020 - [info] mysql-bin.000002:154
Thu Aug 20 18:37:20 2020 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.43.142', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=154, MASTER_USER='mharep', MASTER_PASSWORD='xxx';
Thu Aug 20 18:37:20 2020 - [info] Executing master IP activate script:
Thu Aug 20 18:37:20 2020 - [info] /scripts/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.43.141 --orig_master_ip=192.168.43.141 --orig_master_port=3306 --new_master_host=192.168.43.142 --new_master_ip=192.168.43.142 --new_master_port=3306 --new_master_user='manager' --new_master_password='Test.123'
VIP Command: start=sudo /sbin/ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down
Set read_only=0 on the new master.
Enabling the VIP - 192.168.43.145 on the new master - 192.168.43.142
Thu Aug 20 18:37:20 2020 - [info] OK.
Thu Aug 20 18:37:20 2020 - [info] ** Finished master recovery successfully.
Thu Aug 20 18:37:20 2020 - [info] * Phase 3: Master Recovery Phase completed.
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] * Phase 4: Slaves Recovery Phase..
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Thu Aug 20 18:37:20 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] -- Slave diff file generation on host 192.168.43.143(192.168.43.143:3306) started, pid: 2783. Check tmp log /masterha/app1/192.168.43.143_3306_20200820183718.log if it takes time..
Thu Aug 20 18:37:21 2020 - [info]
Thu Aug 20 18:37:21 2020 - [info] Log messages from 192.168.43.143 ...
Thu Aug 20 18:37:21 2020 - [info]
Thu Aug 20 18:37:20 2020 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Thu Aug 20 18:37:21 2020 - [info] End of log messages from 192.168.43.143.
Thu Aug 20 18:37:21 2020 - [info] -- 192.168.43.143(192.168.43.143:3306) has the latest relay log events.
Thu Aug 20 18:37:21 2020 - [info] Generating relay diff files from the latest slave succeeded.
Thu Aug 20 18:37:21 2020 - [info]
Thu Aug 20 18:37:21 2020 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Thu Aug 20 18:37:21 2020 - [info]
Thu Aug 20 18:37:21 2020 - [info] -- Slave recovery on host 192.168.43.143(192.168.43.143:3306) started, pid: 2785. Check tmp log /masterha/app1/192.168.43.143_3306_20200820183718.log if it takes time..
Thu Aug 20 18:37:22 2020 - [info]
Thu Aug 20 18:37:22 2020 - [info] Log messages from 192.168.43.143 ...
Thu Aug 20 18:37:22 2020 - [info]
Thu Aug 20 18:37:21 2020 - [info] Starting recovery on 192.168.43.143(192.168.43.143:3306)..
Thu Aug 20 18:37:21 2020 - [info] This server has all relay logs. Waiting all logs to be applied..
Thu Aug 20 18:37:21 2020 - [info] done.
Thu Aug 20 18:37:21 2020 - [info] All relay logs were successfully applied.
Thu Aug 20 18:37:21 2020 - [info] Resetting slave 192.168.43.143(192.168.43.143:3306) and starting replication from the new master 192.168.43.142(192.168.43.142:3306)..
Thu Aug 20 18:37:21 2020 - [info] Executed CHANGE MASTER.
Thu Aug 20 18:37:21 2020 - [info] Slave started.
Thu Aug 20 18:37:22 2020 - [info] End of log messages from 192.168.43.143.
Thu Aug 20 18:37:22 2020 - [info] -- Slave recovery on host 192.168.43.143(192.168.43.143:3306) succeeded.
Thu Aug 20 18:37:22 2020 - [info] All new slave servers recovered successfully.
Thu Aug 20 18:37:22 2020 - [info]
Thu Aug 20 18:37:22 2020 - [info] * Phase 5: New master cleanup phase..
Thu Aug 20 18:37:22 2020 - [info]
Thu Aug 20 18:37:22 2020 - [info] Resetting slave info on the new master..
Thu Aug 20 18:37:22 2020 - [info] 192.168.43.142: Resetting slave info succeeded.
Thu Aug 20 18:37:22 2020 - [info] Master failover to 192.168.43.142(192.168.43.142:3306) completed successfully.
Thu Aug 20 18:37:22 2020 - [info]
----- Failover Report -----
app1: MySQL Master failover 192.168.43.141(192.168.43.141:3306) to 192.168.43.142(192.168.43.142:3306) succeeded
Master 192.168.43.141(192.168.43.141:3306) is down!
Check MHA Manager logs at mha-manager:/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.43.141(192.168.43.141:3306)
The latest slave 192.168.43.142(192.168.43.142:3306) has all relay logs for recovery.
Selected 192.168.43.142(192.168.43.142:3306) as a new master.
192.168.43.142(192.168.43.142:3306): OK: Applying all logs succeeded.
192.168.43.142(192.168.43.142:3306): OK: Activated master IP address.
192.168.43.143(192.168.43.143:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.43.143(192.168.43.143:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.43.142(192.168.43.142:3306)
192.168.43.142(192.168.43.142:3306): Resetting slave info succeeded.
Master failover to 192.168.43.142(192.168.43.142:3306) completed successfully.
(7)观察manager上的perl开启的监控进程是否存在?
[root@mha-manager ~]# ps -ef | grep perl
root 2813 2796 0 18:41 pts/0 00:00:00 grep perl
观察可知:MHA在切换完成后会结束 Manager 进程
5.还原操作
(1)删除锁文件
MHA每次故障切换后都会生成一个app1.failover.complete这样的文件,如果不加这个参数,需要删除这个文件才能再次启动
[root@mha-manager masterha]# rm -rf /masterha/app1/app1.failover.complete
或者启动时添加(--ignore_last_failover 忽略上次切换):
nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /var/log/mha_manager.log --ignore_last_failover &
(2)原MASTER上从新指定slave01为新的master角色服务器
[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show master status\G;'
mysql: [Warning] Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
File: mysql-bin.000002
Position: 154
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
[root@slave01 opt]#
[root@master opt]# /etc/init.d/mysqld start
Starting mysqld: [ OK ]
[root@master opt]# mysql -uroot -p'Test.123'
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.31-log MySQL Community Server (GPL)
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
由于演示,这里就不锁库了:
mysql> change master to master_host='192.168.43.142', master_port=3306, master_user='mharep',master_password='Test.123', master_log_file = 'mysql-bin.000002', master_log_pos=154;
Query OK, 0 rows affected, 2 warnings (0.04 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.43.142
Master_User: mharep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 154
Relay_Log_File: relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 154
Relay_Log_Space: 521
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: 0b529b47-e2b6-11ea-9d4e-000c29fb7918
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
(3)切换MASTER角色(manager上操作)这时不要开启manager进程,即不开启:nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /var/log/mha_manager.log &
[root@mha-manager ~]# ps -ef | grep perl
root 2936 2796 0 18:51 pts/0 00:00:00 grep perl
[root@mha-manager ~]#
[root@mha-manager ~]#
[root@mha-manager ~]# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.43.141 --new_master_port=3306 --orig_master_is_new_slave
# 填写: yes
Thu Aug 20 18:51:17 2020 - [info] MHA::MasterRotate version 0.56.
Thu Aug 20 18:51:17 2020 - [info] Starting online master switch..
Thu Aug 20 18:51:17 2020 - [info]
Thu Aug 20 18:51:17 2020 - [info] * Phase 1: Configuration Check Phase..
Thu Aug 20 18:51:17 2020 - [info]
Thu Aug 20 18:51:17 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug 20 18:51:17 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Aug 20 18:51:17 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Aug 20 18:51:18 2020 - [info] GTID failover mode = 0
Thu Aug 20 18:51:18 2020 - [info] Current Alive Master: 192.168.43.142(192.168.43.142:3306)
Thu Aug 20 18:51:18 2020 - [info] Alive Slaves:
Thu Aug 20 18:51:18 2020 - [info] 192.168.43.141(192.168.43.141:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:51:18 2020 - [info] Replicating from 192.168.43.142(192.168.43.142:3306)
Thu Aug 20 18:51:18 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Aug 20 18:51:18 2020 - [info] 192.168.43.143(192.168.43.143:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 20 18:51:18 2020 - [info] Replicating from 192.168.43.142(192.168.43.142:3306)
Thu Aug 20 18:51:18 2020 - [info] Not candidate for the new Master (no_master is set)
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.43.142(192.168.43.142:3306)? (YES/no): yes
Thu Aug 20 18:51:21 2020 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu Aug 20 18:51:21 2020 - [info] ok.
Thu Aug 20 18:51:21 2020 - [info] Checking MHA is not monitoring or doing failover..
Thu Aug 20 18:51:21 2020 - [info] Checking replication health on 192.168.43.141..
Thu Aug 20 18:51:21 2020 - [info] ok.
Thu Aug 20 18:51:21 2020 - [info] Checking replication health on 192.168.43.143..
Thu Aug 20 18:51:21 2020 - [info] ok.
Thu Aug 20 18:51:21 2020 - [info] 192.168.43.141 can be new master.
Thu Aug 20 18:51:21 2020 - [info]
From:
192.168.43.142(192.168.43.142:3306) (current master)
+--192.168.43.141(192.168.43.141:3306)
+--192.168.43.143(192.168.43.143:3306)
To:
192.168.43.141(192.168.43.141:3306) (new master)
+--192.168.43.143(192.168.43.143:3306)
+--192.168.43.142(192.168.43.142:3306)
Starting master switch from 192.168.43.142(192.168.43.142:3306) to 192.168.43.141(192.168.43.141:3306)? (yes/NO): yes
Thu Aug 20 18:51:24 2020 - [info] Checking whether 192.168.43.141(192.168.43.141:3306) is ok for the new master..
Thu Aug 20 18:51:24 2020 - [info] ok.
Thu Aug 20 18:51:24 2020 - [info] 192.168.43.142(192.168.43.142:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Thu Aug 20 18:51:24 2020 - [info] 192.168.43.142(192.168.43.142:3306): Resetting slave pointing to the dummy host.
Thu Aug 20 18:51:24 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Aug 20 18:51:24 2020 - [info]
Thu Aug 20 18:51:24 2020 - [info] * Phase 2: Rejecting updates Phase..
Thu Aug 20 18:51:24 2020 - [info]
master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes
Thu Aug 20 18:51:27 2020 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu Aug 20 18:51:27 2020 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu Aug 20 18:51:27 2020 - [info] ok.
Thu Aug 20 18:51:27 2020 - [info] Orig master binlog:pos is mysql-bin.000002:154.
Thu Aug 20 18:51:27 2020 - [info] Waiting to execute all relay logs on 192.168.43.141(192.168.43.141:3306)..
Thu Aug 20 18:51:27 2020 - [info] master_pos_wait(mysql-bin.000002:154) completed on 192.168.43.141(192.168.43.141:3306). Executed 0 events.
Thu Aug 20 18:51:27 2020 - [info] done.
Thu Aug 20 18:51:27 2020 - [info] Getting new master's binlog name and position..
Thu Aug 20 18:51:27 2020 - [info] mysql-bin.000006:154
Thu Aug 20 18:51:27 2020 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.43.141', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000006', MASTER_LOG_POS=154, MASTER_USER='mharep', MASTER_PASSWORD='xxx';
Thu Aug 20 18:51:27 2020 - [info]
Thu Aug 20 18:51:27 2020 - [info] * Switching slaves in parallel..
Thu Aug 20 18:51:27 2020 - [info]
Thu Aug 20 18:51:27 2020 - [info] -- Slave switch on host 192.168.43.143(192.168.43.143:3306) started, pid: 2941
Thu Aug 20 18:51:27 2020 - [info]
Thu Aug 20 18:51:28 2020 - [info] Log messages from 192.168.43.143 ...
Thu Aug 20 18:51:28 2020 - [info]
Thu Aug 20 18:51:27 2020 - [info] Waiting to execute all relay logs on 192.168.43.143(192.168.43.143:3306)..
Thu Aug 20 18:51:27 2020 - [info] master_pos_wait(mysql-bin.000002:154) completed on 192.168.43.143(192.168.43.143:3306). Executed 0 events.
Thu Aug 20 18:51:27 2020 - [info] done.
Thu Aug 20 18:51:27 2020 - [info] Resetting slave 192.168.43.143(192.168.43.143:3306) and starting replication from the new master 192.168.43.141(192.168.43.141:3306)..
Thu Aug 20 18:51:27 2020 - [info] Executed CHANGE MASTER.
Thu Aug 20 18:51:27 2020 - [info] Slave started.
Thu Aug 20 18:51:28 2020 - [info] End of log messages from 192.168.43.143 ...
Thu Aug 20 18:51:28 2020 - [info]
Thu Aug 20 18:51:28 2020 - [info] -- Slave switch on host 192.168.43.143(192.168.43.143:3306) succeeded.
Thu Aug 20 18:51:28 2020 - [info] Unlocking all tables on the orig master:
Thu Aug 20 18:51:28 2020 - [info] Executing UNLOCK TABLES..
Thu Aug 20 18:51:28 2020 - [info] ok.
Thu Aug 20 18:51:28 2020 - [info] Starting orig master as a new slave..
Thu Aug 20 18:51:28 2020 - [info] Resetting slave 192.168.43.142(192.168.43.142:3306) and starting replication from the new master 192.168.43.141(192.168.43.141:3306)..
Thu Aug 20 18:51:28 2020 - [info] Executed CHANGE MASTER.
Thu Aug 20 18:51:28 2020 - [info] Slave started.
Thu Aug 20 18:51:28 2020 - [info] All new slave servers switched successfully.
Thu Aug 20 18:51:28 2020 - [info]
Thu Aug 20 18:51:28 2020 - [info] * Phase 5: New master cleanup phase..
Thu Aug 20 18:51:28 2020 - [info]
Thu Aug 20 18:51:28 2020 - [info] 192.168.43.141: Resetting slave info succeeded.
Thu Aug 20 18:51:28 2020 - [info] Switching master to 192.168.43.141(192.168.43.141:3306) completed successfully.
[root@mha-manager ~]#
(4)分析转移情况
[root@slave02 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G' | grep Master_Host
mysql: [Warning] Using a password on the command line interface can be insecure.
Master_Host: 192.168.43.141
[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G' | grep Master_Host
mysql: [Warning] Using a password on the command line interface can be insecure.
Master_Host: 192.168.43.141
[root@slave01 opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0
inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1
inet6 fe80::20c:29ff:fefb:7918/64 scope link
valid_lft forever preferred_lft forever
结论是:数据库主从转移成功,但vip没有迁移
(5)手动修复vip问题
[root@slave01 opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0
inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1
inet6 fe80::20c:29ff:fefb:7918/64 scope link
valid_lft forever preferred_lft forever
[root@slave01 opt]# ip addr delete 192.168.43.145/32 dev eth0:1
[root@slave01 opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0
inet6 fe80::20c:29ff:fefb:7918/64 scope link
valid_lft forever preferred_lft forever
[root@master opt]# ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255
[root@master opt]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff
inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0
inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1
inet6 fe80::20c:29ff:feb2:6880/64 scope link
valid_lft forever preferred_lft forever
五、总结
优点:
开源,用Perl编写。
方案成熟,故障切换时,MHA会做日志补齐操作,尽可能减少数据丢失,保证数据一。
部署不需要改变现有架构。
限制:
各个节点要打通SSH信任,有一定的安全隐患。
没有Slave的高可用。
自带的脚本不足,例如虚IP配置需要自己写命令或者依赖其他软件。
需要手动清理中继日志。
总的来说,MHA是一套非常优秀而且使用比较广的高可用程序,它可以自动补齐日志使得一致性有保证,部署的时候不需要改变原有架构就可以使用。
但是使用起来还是有一点复杂的,因为MHA不接管VIP,所以要自己写脚本实现,而且只保证Master高可用,没有Slave高可用,还有就是中继日志要自己设定时任务来清理。
不管怎么说,在没有更好的方案下,MHA还是值得使用的。
参阅:
https://blog.51cto.com/14154700/2472806
https://github.com/yoshinorim/mha4mysql-manager/wiki/Installation#installing-mha-node
https://segmentfault.com/a/1190000017486693
附:master_ip_failover文件内容(vip为192.168.43.145)
[root@mha-manager masterha]# cat /scripts/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
my (
$command, $ssh_user, $orig_master_host,
$orig_master_ip, $orig_master_port, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password
);
my $vip = '192.168.43.145';
my $key = '1';
my $ssh_start_vip = "sudo /sbin/ifconfig eth0:$key $vip netmask 255.255.255.255";
my $ssh_stop_vip = "sudo /sbin/ifconfig eth0:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub main {
print "\n\n VIP Command: start=$ssh_start_vip stop=$ssh_stop_vip\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
# updating global catalog, etc
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error_or_not
$new_master_handler->connect( $new_master_ip, $new_master_port,
$new_master_user, $new_master_password, 1 );
## Set read_only=0 on the new master
$new_master_handler->disable_log_bin_local();
print "Set read_only=0 on the new master.\n";
$new_master_handler->disable_read_only();
$new_master_handler->disconnect();
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
# If you want to continue failover, exit 10.
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Check script.. OK \n";
# do nothing
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
[root@mha-manager masterha]#