TITLE    : Heartbeat fail-over of IP addresses
OS LEVEL : Linux (SLES9)
DATE     : 19/05/2005
VERSION  : 1.0
AUTHOR   : Hubertus A. Haniel (hubba@unixcook.com)
-------------------------------------------------------------------------------

This is a quick guide on how to set up IP fail-over between two systems using
the heartbeat subsystem for High-Availability Linux.

As a first step, ensure that the heartbeat package is installed. Only the basic
heartbeat package is required; the yast2-heartbeat package can be installed as
well if you want to set this configuration up through YaST. This document,
however, covers only the manual setup.

The files that need to be modified or created all live in /etc/ha.d. Examples
of the files to be created can be found in /usr/share/doc/packages/heartbeat.

In the following example we will refer to system-a and system-b, with the IP
addresses 192.168.0.1 and 192.168.0.2. The service address we want to fail
over will be 192.168.0.3.

The files to be created are as follows:

Defining the basic cluster configuration with the ha.cf file.
-------------------------------------------------------------

/etc/ha.d/ha.cf on system-a:

    debugfile /var/log/ha-debug   # Debug information gets logged here
    logfile /var/log/ha-log       # Normal informational messages are here.
    logfacility local0            # Also log to syslog using local0
    keepalive 2                   # Send heartbeats every 2 seconds
    deadtime 30                   # Declare node dead after 30 seconds
    warntime 10                   # Issue a late heartbeat warning after 10 seconds
    initdead 120                  # Deadtime after boot (allow for things to come up)
    ucast eth0 192.168.0.2        # Heartbeat to the other node
    auto_failback on              # Fail IP resources back automatically
    node system-a                 # Define the nodes in the cluster.
    node system-b

/etc/ha.d/ha.cf on system-b:

    debugfile /var/log/ha-debug   # Debug information gets logged here
    logfile /var/log/ha-log       # Normal informational messages are here.
    logfacility local0            # Also log to syslog using local0
    keepalive 2                   # Send heartbeats every 2 seconds
    deadtime 30                   # Declare node dead after 30 seconds
    warntime 10                   # Issue a late heartbeat warning after 10 seconds
    initdead 120                  # Deadtime after boot (allow for things to come up)
    ucast eth0 192.168.0.1        # Heartbeat to the other node
    auto_failback on              # Fail IP resources back automatically
    node system-a                 # Define the nodes in the cluster.
    node system-b

Defining the authentication between the nodes and their password.
-----------------------------------------------------------------

On both systems the /etc/ha.d/authkeys file must exist with root ownership and
0600 permissions. This file defines the authentication used between the HA
daemons on the nodes. The file should look something like this:

    auth 1              # Use authentication method 1 towards remote nodes
    1 sha1 password     # Define the method number, signing algorithm and password

Defining the resources
----------------------

The /etc/ha.d/haresources file is used to define the actual resources in the
cluster. This file should be the same on each node in the cluster and has the
format of:

    node-name IP-address/netmask-bits/interface/broadcast-address

For our purposes the following line would bring up eth0:0 with the service
address:

    system-a 192.168.0.3/24/eth0/192.168.0.255

Considerations to take on the above setup
-----------------------------------------

- In an ideal setup one should have multiple separate heartbeat links to avoid
  node isolation.

- It is worth investigating whether the bonding driver can be combined with
  this setup to provide more redundancy on each node before the IP address is
  failed over.
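
Example: creating the authkeys file
-----------------------------------

The ownership and permission requirements for /etc/ha.d/authkeys described
earlier can be met with a small shell helper. This is only a minimal sketch:
the function name write_authkeys and its two arguments are hypothetical and
not part of the heartbeat package, and "password" is the placeholder from the
example file, to be replaced with a real shared secret.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with heartbeat): write an authkeys file
# into the given directory using method 1 with sha1, as in the example above.
#   $1 = target directory (normally /etc/ha.d)
#   $2 = shared secret used by both nodes
write_authkeys() {
    dir="$1"
    secret="$2"
    file="$dir/authkeys"

    # Create the file with a restrictive umask so the secret is never
    # readable by other users, then restore the previous umask.
    old_umask=$(umask)
    umask 077
    printf 'auth 1\n1 sha1 %s\n' "$secret" > "$file"
    umask "$old_umask"

    # Enforce the required 0600 mode even if the file already existed.
    chmod 600 "$file"

    # Root ownership can only be set when running as root.
    if [ "$(id -u)" -eq 0 ]; then
        chown root:root "$file"
    fi
}
```

Run as root on both system-a and system-b with the same secret, for example:
write_authkeys /etc/ha.d password. Passing the directory as an argument makes
it possible to try the helper against a scratch directory before touching
/etc/ha.d.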