After quite a lot of reading and a morning of playing, I managed to get failover Xen hosts working.
The idea was to have 2 physical servers running 2 (or more) Xen hosts between them. If one server were to die or needed some work doing on it, the domU would automatically move to the other node.
I’ve done some testing and all appears to work fine. However, let me stress that this is not live migration, so you would suffer an outage of about a minute or so (not really a big deal in the grand scheme of things).
DRBD is only at version 7 in Debian Etch, so we pull in version 8 from backports.org.
Version 8 comes with some helper scripts for Xen which make the failover a lot easier to configure.
These are the packages we need:
If you are unsure how to install packages from backports, please read the backports.org documentation.
After installing the packages we need to build the DRBD kernel module. The easiest way to achieve this is to do:
module-assistant auto-install drbd8
Then load the module by doing
modprobe drbd
Heartbeat can be installed straight from the Debian repos:
aptitude install heartbeat-2
Next there is some basic configuration to do. First we need to create /etc/heartbeat/ha.cf with the following contents:
# Enable new cluster manager
crm on
# Specify interface to send bcast packets out on
bcast eth0
# Specify nodes in cluster, these must correspond with "uname -n"
node xen1 xen2
Then we create /etc/heartbeat/authkeys with the following contents:
auth 1
1 sha1 SomeLongStringWithRandomChars
Note that SomeLongStringWithRandomChars should be randomly generated and the same on both nodes.
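One way to generate the shared secret and write the file is sketched below. The function name write_authkeys and the use of /dev/urandom with sha1sum are my own choices for illustration, not something Heartbeat mandates. Heartbeat does generally expect authkeys to be readable by root only, hence the mode 600.

```shell
# Hypothetical helper: write a Heartbeat authkeys file with a random secret.
# $1 = path to write (e.g. /etc/heartbeat/authkeys)
write_authkeys() {
    # 32 random bytes hashed down to a 40-character hex string
    key=$(head -c 32 /dev/urandom | sha1sum | cut -d' ' -f1)
    # Write inside a subshell so the umask change stays local
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$1" )
    chmod 600 "$1"
}

# Usage (run on one node, then copy the file to the other node):
#   write_authkeys /etc/heartbeat/authkeys
```

Generating the key on one node and copying the file across keeps the secret identical on both.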
I generally use LVM for my Xen hosts, so first off I create all the partitions required for my 2 hosts:
lvcreate -L10G -n host1-root xenhostfs
lvcreate -L1G -n host1-swap xenhostfs
lvcreate -L10G -n host2-root xenhostfs
lvcreate -L1G -n host2-swap xenhostfs
Next we need to create /etc/drbd.conf and configure our resources (DRBD devices):
#
# Global Parameters
#
global {
  # Participate in http://usage.drbd.org
  usage-count yes;
}
#
# Settings common to all resources
#
common {
  # Set sync rate
  syncer { rate 10M; }
  # Protocol C : Both nodes have to commit before write
  # is considered successful
  protocol C;
  net {
    # Xen tests that it can write to block device
    # before starting up. Not allowing this causes
    # migration to fail.
    allow-two-primaries;
    # Split-brain recovery parameters
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
  }
}
#
# Resource Definitions
#
resource "host1_root" {
  on xen1 {
    # The block device it will appear as
    device /dev/drbd0;
    # The device we are mirroring
    disk /dev/xenhostfs/host1-root;
    # Store DRBD meta data on the above disk
    meta-disk internal;
    # Address of *this* host and port to replicate over
    # You must use a different port for each resource
    address 10.0.0.1:7790;
  }
  on xen2 {
    device /dev/drbd0;
    disk /dev/xenhostfs/host1-root;
    meta-disk internal;
    address 10.0.0.2:7790;
  }
}
resource "host1_swap" {
  on xen1 {
    device /dev/drbd1;
    disk /dev/xenhostfs/host1-swap;
    meta-disk internal;
    address 10.0.0.1:7791;
  }
  on xen2 {
    device /dev/drbd1;
    disk /dev/xenhostfs/host1-swap;
    meta-disk internal;
    address 10.0.0.2:7791;
  }
}
resource "host2_root" {
  on xen1 {
    device /dev/drbd2;
    disk /dev/xenhostfs/host2-root;
    meta-disk internal;
    address 10.0.0.1:7792;
  }
  on xen2 {
    device /dev/drbd2;
    disk /dev/xenhostfs/host2-root;
    meta-disk internal;
    address 10.0.0.2:7792;
  }
}
resource "host2_swap" {
  on xen1 {
    device /dev/drbd3;
    disk /dev/xenhostfs/host2-swap;
    meta-disk internal;
    address 10.0.0.1:7793;
  }
  on xen2 {
    device /dev/drbd3;
    disk /dev/xenhostfs/host2-swap;
    meta-disk internal;
    address 10.0.0.2:7793;
  }
}
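The four resource definitions above differ only in the resource name, the backing volume, the DRBD minor number and the port. As a rough sketch, a small shell function can print one stanza so the pattern is easier to see. The name drbd_resource and its argument order are hypothetical; the node names, volume group and addresses are simply the ones used in this setup.

```shell
# Hypothetical helper: print one drbd.conf resource stanza.
# $1 = resource name, $2 = LV name, $3 = DRBD minor number, $4 = TCP port
drbd_resource() {
    printf 'resource "%s" {\n' "$1"
    n=1
    for host in xen1 xen2; do
        printf '  on %s {\n' "$host"
        printf '    device /dev/drbd%s;\n' "$3"
        printf '    disk /dev/xenhostfs/%s;\n' "$2"
        printf '    meta-disk internal;\n'
        # xen1 replicates from 10.0.0.1, xen2 from 10.0.0.2
        printf '    address 10.0.0.%s:%s;\n' "$n" "$4"
        printf '  }\n'
        n=$((n + 1))
    done
    printf '}\n'
}

# Usage: every resource gets its own minor number and port
#   drbd_resource host1_root host1-root 0 7790
#   drbd_resource host1_swap host1-swap 1 7791
```

Adding a third guest would then just mean picking the next free minor numbers and ports.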
Now we have to initialise the metadata on the DRBD resources. This takes some time so I advise a coffee or lunch at this point ;)
drbdadm create-md host1_root
drbdadm create-md host1_swap
drbdadm create-md host2_root
drbdadm create-md host2_swap
Now that we have done that we can start up drbd:
/etc/init.d/drbd restart
Right, now that is all set up, we need to select one node as the primary (say xen1).
On xen1 only, we now do the following, which makes xen1 the primary for each resource and starts the initial sync to the other node:
drbdadm -- --overwrite-data-of-peer primary host1_root
drbdadm -- --overwrite-data-of-peer primary host1_swap
drbdadm -- --overwrite-data-of-peer primary host2_root
drbdadm -- --overwrite-data-of-peer primary host2_swap
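Since the same command is repeated for every resource, a loop saves some typing. This is only a sketch: promote_all is a hypothetical helper, and the DRBDADM variable exists purely so that the commands can be previewed with echo before running them for real.

```shell
# Hypothetical helper: promote each named resource on this node,
# overwriting the peer's data. Run on the chosen primary only.
promote_all() {
    for res in "$@"; do
        ${DRBDADM:-drbdadm} -- --overwrite-data-of-peer primary "$res" || return 1
    done
}

# Preview:  DRBDADM=echo promote_all host1_root host1_swap
# For real: promote_all host1_root host1_swap host2_root host2_swap
```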
You can check the progress of this by doing
cat /proc/drbd
which will give you output something like
version: 8.0.14 (api:86/proto:86)
GIT-hash: bb447522fc9a87d0069b7e14f0234911ebdab0f7 build by phil@fat-tyre, 2008-11-12 16:40:33
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:52120 nr:1016 dw:503652 dr:203481 al:121 bm:872 lo:0 pe:0 ua:0 ap:0
[>...................] sync'ed: 1.0% (10380903/10485760)K
finish: 0:07:43 speed: 10,836 (10,836) K/sec
resync: used:0/61 hits:2296 misses:10 starving:0 dirty:0 changed:10
act_log: used:0/127 hits:120616 misses:128 starving:0 dirty:7 changed:121
1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:0 nr:0 dw:4 dr:1348 al:1 bm:12 lo:0 pe:0 ua:0 ap:0
...
Note that you do not have to wait for this to finish; you can carry on using your /dev/drbdN devices, but performance will be reduced until the sync has completed.
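If you would rather wait for the sync from a script than watch /proc/drbd by hand, something like the following could work. It is a sketch that assumes the "sync'ed:" line format shown in the sample output; drbd_sync_pct is a hypothetical name, and it reads from standard input so it can be tried on saved output too.

```shell
# Hypothetical helper: print the percentage from every "sync'ed:" line
# on standard input (one value per resource that is still syncing).
drbd_sync_pct() {
    sed -n "s/.*sync'ed: *\([0-9.]*\)%.*/\1/p"
}

# Usage: drbd_sync_pct < /proc/drbd
# Poll until no resource reports a sync in progress:
#   while [ -n "$(drbd_sync_pct < /proc/drbd)" ]; do sleep 30; done
```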
I’m not going to go through the complete details of this, as it’s a reasonable assumption that if you’re reading this then you have used Xen before, or at least are capable of reading around.
However, here are the edited highlights. This is done only on the primary host. Firstly, create the filesystems:
mkfs.ext3 /dev/drbd0
mkswap /dev/drbd1
mkfs.ext3 /dev/drbd2
mkswap /dev/drbd3
Now mount and use debootstrap to create the base installs:
mkdir /mnt/host1 /mnt/host2
mount /dev/drbd0 /mnt/host1
mount /dev/drbd2 /mnt/host2
debootstrap --arch i386 etch /mnt/host1 http://ftp.uk.debian.org/debian/
debootstrap --arch i386 etch /mnt/host2 http://ftp.uk.debian.org/debian/
Create /mnt/host1/etc/fstab and /mnt/host2/etc/fstab with the following content:
proc /proc proc defaults 0 0
/dev/xvda / ext3 defaults,errors=remount-ro 0 1
/dev/xvdb none swap sw 0 0
Then unmount /mnt/host1 & /mnt/host2 and we are ready to configure Xen.
Create /etc/xen/host1.dom with the following contents:
# -*- mode: python; -*-
#----------------------------------------------------------------------------
# Kernel image file.
kernel = "/boot/vmlinuz-2.6.18-6-xen-686"
ramdisk = "/boot/initrd.img-2.6.18-6-xen-686"
# Initial memory allocation (in megabytes) for the new domain.
memory = 1024
# A name for your domain. All domains must have different names.
name = "host1"
# Filesystems
disk = [ 'drbd:host1_root,xvda,w', 'drbd:host1_swap,xvdb,w']
# Network
vif = ['bridge=xenbr0']
# Set root device.
root = "/dev/xvda ro"
# Sets runlevel 4.
extra = "4"
and /etc/xen/host2.dom:
# -*- mode: python; -*-
#----------------------------------------------------------------------------
# Kernel image file.
kernel = "/boot/vmlinuz-2.6.18-6-xen-686"
ramdisk = "/boot/initrd.img-2.6.18-6-xen-686"
# Initial memory allocation (in megabytes) for the new domain.
memory = 1024
# A name for your domain. All domains must have different names.
name = "host2"
# Filesystems
disk = [ 'drbd:host2_root,xvda,w', 'drbd:host2_swap,xvdb,w']
# Network
vif = ['bridge=xenbr0']
# Set root device.
root = "/dev/xvda ro"
# Sets runlevel 4.
extra = "4"
Now on ONE of the nodes ONLY you can start the Xen instances by doing:
xm create /etc/xen/host1.dom
xm create /etc/xen/host2.dom
We need to create 3 XML files: bootstrap.xml, host1.xml and host2.xml.
Now the documentation for these XML files is somewhat thin on the ground. The only real thing to go on is the DTD which, on Debian, lives here: /usr/lib/heartbeat/crm.dtd
You can also refer to the online version here which is always the latest version.
Anyway here are the contents of the 3 files (just save them to your home directory for now).
To be done on one Xen node only.
bootstrap.xml
<cluster_property_set id="bootstrap">
  <attributes>
    <nvpair id="bootstrap01" name="transition_idle_timeout" value="60"/>
    <nvpair id="bootstrap02" name="default_resource_stickiness" value="0"/>
    <nvpair id="bootstrap03" name="default_resource_failure_stickiness" value="-500"/>
    <nvpair id="bootstrap04" name="stonith_enabled" value="false"/>
    <nvpair id="bootstrap05" name="stonith_action" value="reboot"/>
    <nvpair id="bootstrap06" name="symmetric_cluster" value="true"/>
    <nvpair id="bootstrap07" name="no_quorum_policy" value="stop"/>
    <nvpair id="bootstrap08" name="stop_orphan_resources" value="true"/>
    <nvpair id="bootstrap09" name="stop_orphan_actions" value="true"/>
    <nvpair id="bootstrap10" name="is_managed_default" value="true"/>
  </attributes>
</cluster_property_set>
host1.xml
<resources>
  <primitive id="host1" class="ocf" type="Xen" provider="heartbeat">
    <operations>
      <op id="host1-op01" name="monitor" interval="10s" timeout="60s" prereq="nothing"/>
      <op id="host1-op02" name="start" timeout="60s" start_delay="0"/>
      <op id="host1-op03" name="stop" timeout="300s"/>
    </operations>
    <instance_attributes id="host1">
      <attributes>
        <nvpair id="host1-attr01" name="xmfile" value="/etc/xen/host1.dom"/>
        <nvpair id="host1-attr02" name="target_role" value="started"/>
      </attributes>
    </instance_attributes>
    <meta_attributes id="host1-meta01">
      <attributes>
        <nvpair id="host1-meta-attr01" name="allow_migrate" value="true"/>
      </attributes>
    </meta_attributes>
  </primitive>
</resources>
host2.xml
<resources>
  <primitive id="host2" class="ocf" type="Xen" provider="heartbeat">
    <operations>
      <op id="host2-op01" name="monitor" interval="10s" timeout="60s" prereq="nothing"/>
      <op id="host2-op02" name="start" timeout="60s" start_delay="0"/>
      <op id="host2-op03" name="stop" timeout="300s"/>
    </operations>
    <instance_attributes id="host2">
      <attributes>
        <nvpair id="host2-attr01" name="xmfile" value="/etc/xen/host2.dom"/>
        <nvpair id="host2-attr02" name="target_role" value="started"/>
      </attributes>
    </instance_attributes>
    <meta_attributes id="host2-meta01">
      <attributes>
        <nvpair id="host2-meta-attr01" name="allow_migrate" value="true"/>
      </attributes>
    </meta_attributes>
  </primitive>
</resources>
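The two resource files are identical apart from the guest name, so they could also be generated rather than typed. The function below is a sketch of that idea; xen_resource_xml is a hypothetical name, and it simply reproduces the XML above with the name substituted in.

```shell
# Hypothetical helper: print the <resources> XML for one Xen guest.
# $1 = guest name (also the basename of /etc/xen/<name>.dom)
xen_resource_xml() {
    cat <<EOF
<resources>
  <primitive id="$1" class="ocf" type="Xen" provider="heartbeat">
    <operations>
      <op id="$1-op01" name="monitor" interval="10s" timeout="60s" prereq="nothing"/>
      <op id="$1-op02" name="start" timeout="60s" start_delay="0"/>
      <op id="$1-op03" name="stop" timeout="300s"/>
    </operations>
    <instance_attributes id="$1">
      <attributes>
        <nvpair id="$1-attr01" name="xmfile" value="/etc/xen/$1.dom"/>
        <nvpair id="$1-attr02" name="target_role" value="started"/>
      </attributes>
    </instance_attributes>
    <meta_attributes id="$1-meta01">
      <attributes>
        <nvpair id="$1-meta-attr01" name="allow_migrate" value="true"/>
      </attributes>
    </meta_attributes>
  </primitive>
</resources>
EOF
}

# Usage: xen_resource_xml host1 > host1.xml
```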
Then we need to load them:
cibadmin -C -o crm_config -x bootstrap.xml
cibadmin -C -o resources -x host1.xml
cibadmin -C -o resources -x host2.xml
Basic monitoring is done with crm_mon. Issuing this command will result in the following, which is updated every 15 seconds:
============
Last updated: Wed Apr 8 21:08:56 2009
Current DC: xen2 (1b3dbdb3-f9ca-4d16-bf7d-8a57167b85ed)
2 Nodes configured.
2 Resources configured.
============
Node: xen2 (1b3dbdb3-f9ca-4d16-bf7d-8a57167b85ed): online
Node: xen1 (ea043f8c-5afc-4aee-9ffc-d17cb3cfed06): online
host1 (heartbeat::ocf:Xen): Started xen1
host2 (heartbeat::ocf:Xen): Started xen1
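For scripting, it can be handy to pull out just the node a guest is running on. The sketch below assumes the status line format shown in the output above and that your crm_mon has a one-shot mode (-1); resource_node is a hypothetical name, and it reads the status text from standard input.

```shell
# Hypothetical helper: given crm_mon-style status text on standard input,
# print the node on which resource $1 is reported as Started.
resource_node() {
    awk -v r="$1" '$1 == r && $(NF - 1) == "Started" { print $NF }'
}

# Usage: crm_mon -1 | resource_node host2
```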
To move host2 from xen1 to xen2 you do:
crm_resource --migrate --resource host2 --host-uname xen2
Stopping/Starting
To stop host2:
crm_resource --resource host2 --set-parameter target_role \
--property-value stopped
…and to start it again:
crm_resource --resource host2 --set-parameter target_role \
--property-value started
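The stop and start commands differ only in the final value, so they can be wrapped up. This is a sketch rather than anything Heartbeat ships: set_target_role is a hypothetical helper, and the CRM_RESOURCE variable is only there so that the command can be previewed with echo.

```shell
# Hypothetical helper: set target_role for a resource.
# $1 = resource name, $2 = started|stopped
set_target_role() {
    case "$2" in
        started|stopped) ;;
        *) echo "usage: set_target_role RESOURCE started|stopped" >&2
           return 2 ;;
    esac
    ${CRM_RESOURCE:-crm_resource} --resource "$1" \
        --set-parameter target_role --property-value "$2"
}

# Preview:  CRM_RESOURCE=echo set_target_role host2 stopped
# For real: set_target_role host2 stopped
```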
Thanks to Doug for pointing out all my lame mistakes.