Configuration

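keepalived is used here for master/backup failover: one node is configured as master and the other as backup. When the master fails, the backup automatically takes over as master. By default, once the original master recovers it preempts the role again, causing an unnecessary second switchover. To avoid this, configure both keepalived nodes with an initial state of BACKUP, give them different priorities, and set nopreempt on the higher-priority node so that it does not reclaim the master role after recovering.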

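The configuration below is straightforward: the VIP is 192.168.0.18, both machines start in state BACKUP, machineA has priority 15, machineB has priority 13, and /root/1.sh is configured to check whether the service is healthy.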

machineA configuration:

[root@iloqg8n3yb9mje ~]# cat /etc/keepalived/keepalived.conf 
! Configuration File for keepalived
global_defs {
   router_id  iloqg8n3yb9mje
   script_user root
   enable_script_security
}

vrrp_script check_mysql {
    script "/root/1.sh"
    interval 10
    timeout 5
    weight 5
    fall 3
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0
    virtual_router_id 18
    priority 15
    advert_int 1			# VRRP advertisement (heartbeat) interval in seconds, default 1; must be identical on both nodes
    authentication {
        auth_type PASS
        auth_pass 1002
    }
    track_script {
        check_mysql
    }
    virtual_ipaddress {
       192.168.0.18 dev eth0 label eth0:0
    }
    notify_master "/root/2.sh master"
    notify_backup "/root/2.sh backup"
    notify_fault "/root/2.sh fault"
    notify "/root/2.sh notify..."
}

machineB configuration:

global_defs {
   router_id 4n1eq6wnfvdwvj
   script_user root
   enable_script_security
}

vrrp_script check_mysql {
    script "/root/1.sh"
    interval 10
    timeout 5
    weight 5
    fall 3
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0
    virtual_router_id 18
    priority 13
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1002
    }
    track_script {
        check_mysql
    }
    virtual_ipaddress {
       192.168.0.18 dev eth0 label eth0:0
    }
    notify_master "/root/2.sh master"
    notify_backup "/root/2.sh backup"
    notify_fault "/root/2.sh fault"
    notify "/root/3.sh"
}

/root/1.sh is shown below. This script checks whether the service is healthy. For testing purposes it returns 0 when the file /etc/keepalived/down exists and 1 otherwise.

#!/bin/bash
/bin/test -f /etc/keepalived/down && exit 0 || exit 1
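Before wiring a check like this into keepalived, it can help to exercise it by hand against a scratch path instead of /etc/keepalived. The following is a minimal sketch that wraps the same test in a function; the name check_flag and its optional path argument are assumptions made for illustration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 (healthy) when the flag file exists,
# 1 (unhealthy) otherwise. Defaults to the path used in this article.
check_flag() {
    local flag_file="${1:-/etc/keepalived/down}"
    if [ -f "$flag_file" ]; then
        return 0
    fi
    return 1
}

# Example (not executed here):
#   touch /tmp/flag && check_flag /tmp/flag; echo $?
#   rm /tmp/flag    && check_flag /tmp/flag; echo $?
```

Passing the path as an argument keeps the test away from the live configuration directory, so toggling the flag does not trigger a real state change.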

vrrp_script

Configuration

vrrp_script checks whether a service is healthy by running a script. The meaning of its parameters can be looked up with man keepalived.conf.

vrrp_script <SCRIPT_NAME> {
           script <STRING>|<QUOTED-STRING> # path of the script to execute; an exit status of 0 means healthy
           interval <INTEGER>  # seconds between script invocations, default 1 second
           timeout <INTEGER>   # seconds after which script is considered to have failed
           weight <INTEGER:-254..254>  # adjust priority by this weight, default 0
           rise <INTEGER>              # required number of successes for OK transition
           fall <INTEGER>              # required number of failures for KO transition
           user USERNAME [GROUPNAME]   # user/group names to run script under
                                       #   group default to group of user
           init_fail                   # assume script initially is in failed state
        }

Taking the configuration from above:

vrrp_script check_mysql {
    script "/root/1.sh"
    interval 10
    timeout 5
    weight 5
    fall 3
}

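After we delete the file /etc/keepalived/down, machineA logs the first failed check at 17:45:06. Twenty seconds later, at 17:45:26, the script is reported as failed and the priority drops from 20 to 15. With interval 10 and fall 3, that is the third consecutive failure, which shows that the priority only changes once the fall count has been reached. The following is from the messages log: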

Dec 24 17:45:06 iloqg8n3yb9mje Keepalived_vrrp[109141]: Script `check_mysql` now returning 1
Dec 24 17:45:26 iloqg8n3yb9mje Keepalived_vrrp[109141]: VRRP_Script(check_mysql) failed (exited with status 1)
Dec 24 17:45:26 iloqg8n3yb9mje Keepalived_vrrp[109141]: (VI_1) Changing effective priority from 20 to 15
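The 20-second gap in this log follows from interval and fall. As a rough sketch, assuming every run after the first one also fails and each run finishes well within the interval, the delay between the first failing run and the failed transition is (fall - 1) * interval. The function name seconds_until_failed is made up for illustration.

```shell
#!/bin/bash
# Seconds between the first failing run and the moment the script is
# declared failed, assuming all subsequent runs fail as well.
seconds_until_failed() {
    local interval=$1 fall=$2
    echo $(( (fall - 1) * interval ))
}

# interval 10 and fall 3, as in check_mysql: prints 20
seconds_until_failed 10 3
```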

On machineB, the priority is adjusted right after the first successful check at 17:44:45, which indicates that rise defaults to 1.

Dec 24 17:44:15 4n1eq6wnfvdwvj Keepalived_vrrp[51077]: /root/1.sh exited with status 1
Dec 24 17:44:25 4n1eq6wnfvdwvj Keepalived_vrrp[51077]: /root/1.sh exited with status 1
Dec 24 17:44:35 4n1eq6wnfvdwvj Keepalived_vrrp[51077]: /root/1.sh exited with status 1
Dec 24 17:44:45 4n1eq6wnfvdwvj Keepalived_vrrp[51077]: VRRP_Script(check_mysql) succeeded
Dec 24 17:44:46 4n1eq6wnfvdwvj Keepalived_vrrp[51077]: VRRP_Instance(VI_1) Changing effective priority from 13 to 18

The logs show that the priority changed, but nothing else happened: the VIP did not move. Why?

Root cause analysis

According to the article "keepalived之vrrp_script详解" (a detailed explanation of vrrp_script in keepalived):

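In vrrp_script, a script exit status of 0 counts as a successful check; any other value counts as a failure.

When weight is positive, it is added to priority while the check succeeds and is not added when the check fails:
1. Master check fails:
  1. A switchover happens when master priority < backup priority + weight.
2. Master check succeeds:
  1. The master stays master when master priority + weight > backup priority + weight.
When weight is negative, it has no effect on priority while the check succeeds; when the check fails, priority becomes priority - abs(weight):
1. Master check fails:
  1. A switchover happens when master priority - abs(weight) < backup priority.
2. Master check succeeds:
  1. The master stays master when master priority > backup priority.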
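The priority arithmetic in these rules can be sketched as a small shell function. This only models the rules as quoted, not keepalived's actual implementation, and the name effective_priority is an assumption for illustration. Its results agree with the priority changes logged earlier (20 and 15 on machineA, 18 on machineB).

```shell
#!/bin/bash
# Model of the quoted weight rules.
#   $1 = configured priority
#   $2 = weight (positive or negative)
#   $3 = exit status of the check script (0 = success)
effective_priority() {
    local base=$1 weight=$2 status=$3
    if [ "$weight" -gt 0 ] && [ "$status" -eq 0 ]; then
        # positive weight is added only while the check succeeds
        echo $(( base + weight ))
    elif [ "$weight" -lt 0 ] && [ "$status" -ne 0 ]; then
        # negative weight is applied only while the check fails
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

effective_priority 15 5 0   # machineA, check OK: prints 20
effective_priority 15 5 1   # machineA, check failed: prints 15
effective_priority 13 5 0   # machineB, check OK: prints 18
```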

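My tests did not produce the result these rules predict. I suspect a difference in keepalived versions explains the discrepancy. Another plausible factor is that nopreempt is set on both nodes: with it, a backup does not take over from a master that is still sending advertisements, even when the backup's effective priority is higher. Either way, the VIP did not move, which is not what I expected.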

On a whim: what happens if weight is not configured in vrrp_script at all? The following logs are all from machineA:

# When the check_mysql script fails, instance VI_1 enters the FAULT state
Dec 27 10:42:08 iloqg8n3yb9mje Keepalived_vrrp[120464]: Script `check_mysql` now returning 1
Dec 27 10:42:28 iloqg8n3yb9mje Keepalived_vrrp[120464]: VRRP_Script(check_mysql) failed (exited with status 1)
Dec 27 10:42:28 iloqg8n3yb9mje Keepalived_vrrp[120464]: (VI_1) Entering FAULT STATE

# When check_mysql recovers, instance VI_1 enters the BACKUP state because nopreempt is configured (note that machineA has the higher priority)
Dec 27 10:47:28 iloqg8n3yb9mje Keepalived_vrrp[120464]: Script `check_mysql` now returning 0
Dec 27 10:47:28 iloqg8n3yb9mje Keepalived_vrrp[120464]: VRRP_Script(check_mysql) succeeded
Dec 27 10:47:28 iloqg8n3yb9mje Keepalived_vrrp[120464]: (VI_1) Entering BACKUP STATE

# When machineB fails, machineA takes over and enters the MASTER state
Dec 27 10:48:40 iloqg8n3yb9mje Keepalived_vrrp[120464]: (VI_1) Backup received priority 0 advertisement
Dec 27 10:48:41 iloqg8n3yb9mje Keepalived_vrrp[120464]: (VI_1) Receive advertisement timeout
Dec 27 10:48:41 iloqg8n3yb9mje Keepalived_vrrp[120464]: (VI_1) Entering MASTER STATE
Dec 27 10:48:41 iloqg8n3yb9mje Keepalived_vrrp[120464]: (VI_1) setting VIPs.
Dec 27 10:48:41 iloqg8n3yb9mje Keepalived_vrrp[120464]: Sending gratuitous ARP on eth0 for 192.168.0.18
Dec 27 10:48:41 iloqg8n3yb9mje Keepalived_vrrp[120464]: (VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.0.18

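This shows that vrrp_script works without a weight value. With weight left at its default of 0, a failed check puts the instance into the FAULT state instead of adjusting its priority, so leaving weight out is preferable and avoids unexpected behavior.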
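When repeating this kind of test, it is convenient to pull only the state transitions out of the log. Below is a minimal sketch; the function name vrrp_transitions is hypothetical, and the log path in the usage comment assumes the messages log mentioned earlier lives at /var/log/messages, which may differ on other systems.

```shell
#!/bin/bash
# Print only the keepalived lines that record a VRRP state transition.
#   $1 = path to the log file
vrrp_transitions() {
    grep -E 'Keepalived_vrrp.*Entering (MASTER|BACKUP|FAULT) STATE' "$1"
}

# Example (path is an assumption, not executed here):
#   vrrp_transitions /var/log/messages
```

The pattern keys on the "Entering ... STATE" wording visible in the logs above, so it would need adjusting if a keepalived version phrases these messages differently.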

If you also run into warnings like the following:

Dec 24 17:41:50 iloqg8n3yb9mje Keepalived_vrrp[108697]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Dec 24 17:41:50 iloqg8n3yb9mje Keepalived_vrrp[108697]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.

They should not affect operation, but adding the following to the global_defs section makes the messages go away:

script_user root
enable_script_security

Can the check instead be written inline under vrrp_script, as script "test -f /etc/keepalived/down && exit 0 || exit 1"? In my tests this did not work correctly.

notify

Usage of notify:

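notify_master: runs a script when the current node becomes master (typically used to start a service such as nginx or haproxy)
notify_backup: runs a script when the current node becomes backup (typically used to stop a service such as nginx or haproxy)
notify_fault: runs a script when the current node enters the fault state
notify: runs a script on every state change; it is called after the three scripts above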

According to the documentation, notify is automatically passed the following arguments:

$1 = "GROUP"|"INSTANCE"
$2 = name of the group or instance
$3 = target state of transition ("MASTER"|"BACKUP"|"FAULT")
$4 = priority value

So a notify script does not need arguments in the configuration, unlike the other three.

notify_master "/root/2.sh master"
notify_backup "/root/2.sh backup"
notify_fault "/root/2.sh fault"
notify "/root/3.sh"

The scripts are very simple and only write a log line:

[root@4n1eq6wnfvdwvj ~]# cat 2.sh 
#!/bin/bash

echo "`date +"%F %T"` $1" >>/tmp/fdm.txt

[root@4n1eq6wnfvdwvj ~]# cat 3.sh 
#!/bin/bash

TYPE=$1
NAME=$2
STATE=$3
case $STATE in
        "MASTER") echo "`date +"%F %T"` notify $1 $2 MASTER..." >>/tmp/fdm.txt
                  ;;
        "BACKUP") echo "`date +"%F %T"` notify $1 $2 BACKUP..." >>/tmp/fdm.txt
                  ;;
        "FAULT")  echo "`date +"%F %T"` notify $1 $2 FAULT..." >>/tmp/fdm.txt
                  exit 0
                  ;;
        *)        echo "`date +"%F %T"` NO TYPE:$1 $2" >>/tmp/fdm.txt
                  exit 1
                  ;;
esac

The resulting log:

2020-12-27 22:31:12 backup
2020-12-27 22:31:12 notify INSTANCE VI_1 BACKUP...
2020-12-27 22:31:12 fault
2020-12-27 22:31:12 notify INSTANCE VI_1 FAULT...

As shown, the notify call comes after notify_backup.

Split-brain

Everything above covers switchovers caused by a failure of the service itself. What does keepalived do when the two machines cannot reach each other?

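VRRP has only one kind of control message: the VRRP advertisement. The advert_int parameter sets the interval between advertisements; the default is one per second. Advertisements are encapsulated in IP multicast packets sent to group address 224.0.0.18 and are confined to the local network segment, which is why a VRID can be reused on different networks. To reduce bandwidth consumption, only the master router sends advertisements periodically. A backup router starts a new VRRP election when it receives no advertisement for three consecutive advertisement intervals, or when it receives an advertisement with priority 0.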

Under normal conditions, only the master sends VRRP advertisements.

[root@4n1eq6wnfvdwvj ~]# tcpdump -i any -nns0 vrrp 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
23:00:22.574251 IP 192.168.0.15 > 224.0.0.18: VRRPv2, Advertisement, vrid 18, prio 15, authtype simple, intvl 1s, length 20
23:00:23.574399 IP 192.168.0.15 > 224.0.0.18: VRRPv2, Advertisement, vrid 18, prio 15, authtype simple, intvl 1s, length 20
23:00:24.574420 IP 192.168.0.15 > 224.0.0.18: VRRPv2, Advertisement, vrid 18, prio 15, authtype simple, intvl 1s, length 20
23:00:25.574504 IP 192.168.0.15 > 224.0.0.18: VRRPv2, Advertisement, vrid 18, prio 15, authtype simple, intvl 1s, length 20
23:00:26.574580 IP 192.168.0.15 > 224.0.0.18: VRRPv2, Advertisement, vrid 18, prio 15, authtype simple, intvl 1s, length 20

Suppose we add an iptables rule on the master that drops outgoing VRRP packets, using date +"%F %T";iptables -I OUTPUT -p vrrp -j DROP. The command was run at 2020-12-27 22:53:50. Now observe when the backup enters the MASTER state:

Dec 27 22:53:54 iloqg8n3yb9mje Keepalived_vrrp[123054]: (VI_1) Receive advertisement timeout
Dec 27 22:53:54 iloqg8n3yb9mje Keepalived_vrrp[123054]: (VI_1) Entering MASTER STATE
Dec 27 22:53:54 iloqg8n3yb9mje Keepalived_vrrp[123054]: (VI_1) setting VIPs.

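The log shows that the VRRP advertisement timed out and the node entered the MASTER state. In the one-second polling loop below, the VIP shows up about one second after that: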

[root@iloqg8n3yb9mje ~]# for i in `seq 1 100`;do ip -4 -o addr |grep 192.168.0.18 -q && echo "`date +"%F %T"` have 192.168.0.18" || echo `date +"%F %T"` no~~~;sleep 1;done
2020-12-27 22:53:50 no~~~
2020-12-27 22:53:51 no~~~
2020-12-27 22:53:52 no~~~
2020-12-27 22:53:53 no~~~
2020-12-27 22:53:54 no~~~
2020-12-27 22:53:55 have 192.168.0.18
2020-12-27 22:53:56 have 192.168.0.18
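The roughly four seconds between dropping the advertisements (22:53:50) and the takeover (22:53:54) are consistent with the VRRPv2 master-down timer from RFC 3768: Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time, where Skew_Time = (256 - Priority) / 256 seconds. The sketch below computes it in milliseconds with integer arithmetic. It assumes keepalived follows the RFC timer here, and the function name master_down_ms is made up for illustration.

```shell
#!/bin/bash
# VRRPv2 master-down interval in milliseconds (RFC 3768).
#   $1 = advert_int in seconds
#   $2 = priority of the backup node
master_down_ms() {
    local advert_int=$1 priority=$2
    # 3 * advert_int, plus skew time of (256 - priority) / 256 seconds
    echo $(( 3 * advert_int * 1000 + (256 - priority) * 1000 / 256 ))
}

# advert_int 1 and priority 15, as on machineA: prints 3941
master_down_ms 1 15
```

Because the skew time shrinks as priority grows, the backup with the highest priority times out first when several backups are waiting.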

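Typical things to check when troubleshooting split-brain:

virtual_router_id must be identical on both nodes
a firewall is filtering out the VRRP multicast packets
abnormal machine load prevents the node from sending VRRP packets, or leaves too little CPU time to process the ones it receives; in this case, try increasing advert_int
network interface failures, and so on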
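The first item on this list is easy to verify mechanically. Here is a minimal sketch that extracts virtual_router_id from two keepalived.conf files and compares them. The function names vrid_of and same_vrid are hypothetical, it only reads the first vrrp_instance in each file, and it assumes the peer's configuration has already been copied to a local path.

```shell
#!/bin/bash
# Print the first virtual_router_id value found in a keepalived.conf.
#   $1 = path to the configuration file
vrid_of() {
    awk '$1 == "virtual_router_id" { print $2; exit }' "$1"
}

# Succeed (0) when both files use the same virtual_router_id.
#   $1, $2 = paths to the two configuration files
same_vrid() {
    [ "$(vrid_of "$1")" = "$(vrid_of "$2")" ]
}

# Example (the second path is an assumption, not executed here):
#   same_vrid /etc/keepalived/keepalived.conf /tmp/peer-keepalived.conf \
#       && echo "VRID matches" || echo "VRID mismatch"
```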
