1. Environment
Server A (primary): 192.85.1.175
Server B (backup): 192.85.1.176
MySQL version: 5.1.61
OS: Ubuntu 10.10 x86
2. Installing heartbeat
1) Install the package
sudo apt-get install heartbeat
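2) Configuration notes
heartbeat keeps its configuration in the /etc/ha.d directory.
After installation it needs three configuration files there: ha.cf, haresources and authkeys.
The directory does not contain these files at first, so they have to be created.
Sample copies of ha.cf, haresources and authkeys can be found in /usr/share/doc/heartbeat; copy them into /etc/ha.d.
Samples shipped as *.gz files must be decompressed with gunzip.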
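The copy-and-decompress step can be scripted. The following is only a sketch: the helper name install_ha_samples is made up for illustration, and it assumes each sample exists in the source directory either as a plain file or as a .gz file. It also tightens the permissions on authkeys, which heartbeat requires.

```shell
#!/bin/sh
# install_ha_samples SRC_DIR DEST_DIR   (hypothetical helper, not part of heartbeat)
# Copies ha.cf, haresources and authkeys from SRC_DIR to DEST_DIR,
# decompressing any sample that is shipped as NAME.gz.
install_ha_samples() {
    src=$1
    dest=$2
    for name in ha.cf haresources authkeys; do
        if [ -f "$src/$name.gz" ]; then
            # gunzip -c writes to stdout and leaves the packaged sample untouched
            gunzip -c "$src/$name.gz" > "$dest/$name"
        elif [ -f "$src/$name" ]; then
            cp "$src/$name" "$dest/$name"
        else
            echo "missing sample: $name" >&2
            return 1
        fi
    done
    # authkeys must only be readable and writable by its owner
    chmod 600 "$dest/authkeys"
}

# On the servers described here this would be run as root:
# install_ha_samples /usr/share/doc/heartbeat /etc/ha.d
```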
3. Configuration on server 175:
(1) /etc/hosts contents:
192.85.1.175 primary # Added by NetworkManager
(2) ha.cf contents (main configuration file):
#
# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
#
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
#
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
#
# All will be fine if you keep them ordered as in this example.
#
#
# Note on logging:
# If all of debugfile, logfile and logfacility are not defined,
# logging is the same as use_logd yes. In other case, they are
# respectively effective. if deterring the logging to syslog,
# logfacility must be "none".
#
# File to write debug messages to
debugfile /var/log/ha-debug # debug log file
#
#
# File to write other messages to
#
logfile /var/log/ha-log # runtime log file
#
#
# Facility to use for syslog()/logger
#
logfacility local0 # syslog facility
#
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
#
# keepalive: how long between heartbeats?
#
keepalive 2 # heartbeat interval: 2 means 2 seconds, 200ms means 200 milliseconds
#
# deadtime: how long-to-declare-host-dead?
#
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
#
deadtime 30 # dead time: a node is declared dead if no heartbeat arrives for 30 seconds
#
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
#
warntime 10 # time before a late-heartbeat warning is logged
#
#
# Very first dead time (initdead)
#
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
#
initdead 120 # dead time used right after startup
#
#
# What UDP port to use for bcast/ucast communication?
#
udpport 694 # UDP port that carries the heartbeat messages
#
# What interfaces to broadcast heartbeats over?
#
bcast eth0 # Linux; heartbeats sent by UDP broadcast, recommended when there is more than one backup node
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
#
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
#
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
# 224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
# same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. this affects
# how far the multicast packet will propagate. (0-255)
# Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
# if enabled, an outbound packet will be looped back and
# received by the interface it was sent on. (0 or 1)
# Set this value to zero.
#
#
#mcast eth0 225.0.0.1 694 1 0
#
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
#
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
#
# Note: the address below is this node's own; a ucast entry normally names the peer (192.85.1.176 from this node).
ucast eth0 192.85.1.175
# on: when the primary node recovers it takes the resources back;
# off: the primary only gets them back after the backup node goes down
auto_failback on
watchdog /dev/watchdog # watchdog: reboot this node if it has sent no heartbeat for more than one minute
#
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node primary # primary node name, must match the output of uname -n
node backup # backup node name
#
# Less common options...
#
# Treats 192.85.1.1 as a pseudo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
#
ping 192.85.1.1 # ping the gateway to check that the network path is healthy
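The node names in ha.cf have to match what uname -n prints on each machine, and a mismatch is an easy mistake to make. As a rough sketch, a check along these lines could be run on each node before starting heartbeat. The function name node_listed is hypothetical, and the parsing assumes one or more whitespace-separated names after each node directive.

```shell
#!/bin/sh
# node_listed CONF NAME   (hypothetical helper)
# Succeeds if NAME appears in an active (uncommented) "node" directive of CONF.
node_listed() {
    conf=$1
    name=$2
    awk -v want="$name" '
        { sub(/#.*/, "") }            # strip comments, including trailing ones
        $1 == "node" {
            for (i = 2; i <= NF; i++)
                if ($i == want) found = 1
        }
        END { exit !found }           # exit status 0 only when the name was seen
    ' "$conf"
}

# Example use on a cluster node:
# node_listed /etc/ha.d/ha.cf "$(uname -n)" || echo "uname -n is not listed in ha.cf" >&2
```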
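(3) haresources (resource configuration file)
# Virtual IP and the resources served through it. Resource names are separated by
# whitespace, and each one must match a script in /etc/ha.d/resource.d or /etc/init.d.
primary 192.85.1.177/24 http mysql phpmyadmin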
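Each resource named in haresources has to correspond to a script that heartbeat can run, looked up in /etc/ha.d/resource.d and /etc/init.d. The sketch below lists resource names for which no executable script is found. The function name missing_resources is hypothetical, and the parsing is simplified: it assumes one entry per line with no backslash continuations, and it skips bare IP addresses because those are shorthand for the IPaddr resource.

```shell
#!/bin/sh
# missing_resources HARESOURCES DIR...   (hypothetical helper)
# Prints every resource name in HARESOURCES that has no executable
# script of the same name in any of the given directories.
missing_resources() {
    file=$1
    shift
    # Drop comment lines; field 1 is the node name, the rest are resources.
    grep -v '^[[:space:]]*#' "$file" | while read -r node rest; do
        for res in $rest; do
            name=${res%%::*}              # strip "::argument" suffixes
            case $name in
                [0-9]*) continue ;;       # bare IP address, handled by IPaddr
            esac
            found=no
            for dir in "$@"; do
                if [ -x "$dir/$name" ]; then
                    found=yes
                fi
            done
            if [ "$found" = no ]; then
                echo "$name"
            fi
        done
    done
}

# Example use:
# missing_resources /etc/ha.d/haresources /etc/ha.d/resource.d /etc/init.d
```

Any name this prints would need either a matching init script or a different resource name, for instance whatever the web server's init script is called on the system in question.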
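(4) authkeys (authentication configuration file)
# Communication key; the file contents must be identical on both machines
auth 3
3 md5 Hello
# authkeys must be readable and writable by its owner only: chmod 600 ./authkeys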
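A fixed word such as Hello is easy to guess, so a random key may be preferable. The following is a minimal sketch that keeps the key index 3 and the md5 method used in this setup. The function name write_authkeys is hypothetical, and it assumes /dev/urandom and md5sum are available. The file is generated once and then copied unchanged to the other node.

```shell
#!/bin/sh
# write_authkeys FILE   (hypothetical helper)
# Writes a two-line authkeys file with a random 32-hex-digit key
# and restricts it to owner read/write.
write_authkeys() {
    file=$1
    # 32 random bytes hashed down to a 32-character hex string
    key=$(head -c 32 /dev/urandom | md5sum | cut -d' ' -f1)
    # The subshell keeps the restrictive umask from leaking into the caller
    ( umask 077; printf 'auth 3\n3 md5 %s\n' "$key" > "$file" )
    chmod 600 "$file"
}

# Example use (as root), followed by copying the file to the peer:
# write_authkeys /etc/ha.d/authkeys
```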
4. Configuration on server 176:
(1) /etc/hosts contents:
192.85.1.176 backup # Added by NetworkManager
(2) ha.cf contents:
#
# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
#
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
#
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
#
# All will be fine if you keep them ordered as in this example.
#
#
# Note on logging:
# If all of debugfile, logfile and logfacility are not defined,
# logging is the same as use_logd yes. In other case, they are
# respectively effective. if deterring the logging to syslog,
# logfacility must be "none".
#
# File to write debug messages to
debugfile /var/log/ha-debug # debug log file
#
#
# File to write other messages to
#
logfile /var/log/ha-log # runtime log file
#
#
# Facility to use for syslog()/logger
#
logfacility local0 # syslog facility
#
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
#
# keepalive: how long between heartbeats?
#
keepalive 2 # heartbeat interval: 2 means 2 seconds, 200ms means 200 milliseconds
#
# deadtime: how long-to-declare-host-dead?
#
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
#
deadtime 30 # dead time: a node is declared dead if no heartbeat arrives for 30 seconds
#
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
#
warntime 10 # time before a late-heartbeat warning is logged
#
#
# Very first dead time (initdead)
#
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
#
initdead 120 # dead time used right after startup
#
#
# What UDP port to use for bcast/ucast communication?
#
udpport 694 # UDP port that carries the heartbeat messages
#
# What interfaces to broadcast heartbeats over?
#
bcast eth0 # Linux; heartbeats sent by UDP broadcast, recommended when there is more than one backup node
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
#
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
#
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
# 224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
# same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. this affects
# how far the multicast packet will propagate. (0-255)
# Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
# if enabled, an outbound packet will be looped back and
# received by the interface it was sent on. (0 or 1)
# Set this value to zero.
#
#
#mcast eth0 225.0.0.1 694 1 0
#
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
#
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
#
# Note: the address below is this node's own; a ucast entry normally names the peer (192.85.1.175 from this node).
ucast eth0 192.85.1.176
# on: when the primary node recovers it takes the resources back;
# off: the primary only gets them back after the backup node goes down
auto_failback on
watchdog /dev/watchdog # watchdog: reboot this node if it has sent no heartbeat for more than one minute
#
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node primary # primary node name, must match the output of uname -n
node backup # backup node name
#
# Less common options...
#
# Treats 192.85.1.1 as a pseudo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
#
ping 192.85.1.1 # ping the gateway to check that the network path is healthy
(3) haresources
# Virtual IP and the resources served through it; this file is the same as on server 175
primary 192.85.1.177/24 http mysql phpmyadmin
(4) authkeys
# Communication key; the file contents must be identical on both machines
auth 3
3 md5 Hello
# authkeys must be readable and writable by its owner only: chmod 600 ./authkeys
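5. Starting, stopping and testing the HA service
Start HA: service heartbeat start or /etc/init.d/heartbeat start
Stop HA: service heartbeat stop or /etc/init.d/heartbeat stop
heartbeat is already set up to start automatically when the system boots.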
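Testing heartbeat with the HTTP service
First start the web server (the command below uses the service name httpd; on Ubuntu the Apache service is normally called apache2):
#service httpd start
Create a different test HTML file on each host and put it in /var/www/html/, so that the page shows which node is answering.
Start heartbeat on the primary node and check its state with: service heartbeat status
For example, opening http://192.85.1.177/phpmyadmin directly lets you log in and manage the database.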
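To watch a failover happen, it can help to probe the virtual IP repeatedly from a third machine while heartbeat is stopped on the primary. The sketch below is one way to do that. The function name poll is hypothetical, and the probe command is passed in by the caller, so any check (curl, ping, a MySQL client) can be used.

```shell
#!/bin/sh
# poll COUNT INTERVAL CMD...   (hypothetical helper)
# Runs CMD COUNT times, INTERVAL seconds apart, and prints one
# timestamped "up" or "DOWN" line per attempt.
poll() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "$(date +%T) up"
        else
            echo "$(date +%T) DOWN"
        fi
        i=$((i + 1))
        # No need to wait after the final attempt
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: probe the virtual IP once a second for a minute, then run
# "service heartbeat stop" on the primary and watch the output.
# poll 60 1 curl -fsS --max-time 2 http://192.85.1.177/
```

With the settings used here, a gap of DOWN lines is to be expected before the backup takes over the virtual IP and the probes report up again.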