Quantum is going to be a core project in the next OpenStack release (Folsom).
If we use Quantum as the Network Manager, we can’t configure nova-network in multi_host mode, which means we lose High Availability for nova-network.
The Quantum Server can be a single point of failure, which is why I started thinking about how to fix that. Like everyone on the list, I saw that Hastexo and Sebastien Han had worked on a Nova RA (Resource Agent) for Pacemaker.
I decided to work on a Quantum Server RA and wrote something very close to the other agents. You can have a look at the RA directly on GitHub or follow my HowTo below.
Note : with this RA, we don’t bring HA to nova-network, only to the Quantum Server service. I think nova-network HA will be available in the near future.
I use two VMs with Ubuntu 12.04 LTS Server installed & configured with one NIC.
Here is the network configuration :
- quantum-server-1: 192.168.2.129/24
- quantum-server-2: 192.168.2.130/24
- Install Quantum-Server, OVS Plugin (optional) and Git :
apt-get update
apt-get install -y quantum-server quantum-plugin-openvswitch git
- Configure Quantum to use the OVS Plugin by editing the /etc/quantum/plugins.ini file :
[PLUGIN]
provider = quantum.plugins.openvswitch.ovs_quantum_plugin.OVSQuantumPlugin
- Stop quantum-server process :
service quantum-server stop
Note : if you’re using Folsom Testing Packages, you should modify /etc/quantum/quantum.conf only :
core_plugin = quantum.plugins.openvswitch.ovs_quantum_plugin.OVSQuantumPlugin
Pacemaker Basic Configuration
- Installation on both servers :
apt-get install -y pacemaker
- Configure /etc/hosts file on both servers :
192.168.2.129 quantum-server-1
192.168.2.130 quantum-server-2
- On first server :
corosync-keygen
scp /etc/corosync/authkey root@quantum-server-2:/root
- On second server :
sudo mv ~/authkey /etc/corosync/authkey
sudo chown root:root /etc/corosync/authkey
sudo chmod 400 /etc/corosync/authkey
On both servers, configure /etc/corosync/corosync.conf file :
secauth: yes
bindnetaddr: 192.168.2.0
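For context, those two settings live in the totem section of corosync.conf; here is a sketch of the relevant fragment, assuming the stock Ubuntu 12.04 defaults for the other values (multicast address and port unchanged):

```
totem {
        version: 2
        secauth: yes
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.2.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}
```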
On both servers, allow corosync to be started. To do that, set START=yes in the /etc/default/corosync file :
START=yes
And start the service :
service corosync start
Resource Agent Installation
- On both servers :
mkdir /usr/lib/ocf/resource.d/openstack
cd /usr/lib/ocf/resource.d/openstack/
wget https://github.com/madkiss/openstack-resource-agents/raw/master/ocf/quantum-server
chmod +x quantum-server
crm ra info ocf:openstack:quantum-server
- On first server :
crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore
crm configure rsc_defaults resource-stickiness=100
crm configure primitive p_vip ocf:heartbeat:IPaddr params ip="192.168.2.150" cidr_netmask="24" nic="eth0" op monitor interval="5s"
crm configure primitive p_quantum_server ocf:openstack:quantum-server params config="/etc/quantum/quantum.conf" op monitor interval="5s" timeout="5s"
crm configure group g_quantum_servers p_quantum_server p_vip
Note : the group keeps the Quantum Server and the virtual IP together on the same node and starts them in that order.
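For the curious, the RA downloaded above follows the standard OCF action-dispatch pattern: Pacemaker calls the script with an action name and interprets the exit code. Here is a minimal, hypothetical sketch (not the real agent; the real one also validates parameters, manages a PID file and emits meta-data XML):

```shell
# Minimal sketch of the OCF action-dispatch pattern that agents such as
# ocf:openstack:quantum-server follow. Hypothetical simplification, not
# the real RA.
OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

quantum_monitor() {
    # The real RA checks the quantum-server process; pgrep stands in here.
    if pgrep -f quantum-server >/dev/null 2>&1; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

ra_dispatch() {
    case "$1" in
        monitor)              quantum_monitor ;;
        start|stop|meta-data) return $OCF_SUCCESS ;;
        *)                    return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}
```

Pacemaker treats exit code 0 as success and 7 as "not running", which is how the 5-second monitor operation configured above detects a dead Quantum Server.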
You can check the cluster :
root@quantum-server-1:~# crm_mon -1
============
Last updated: Tue Jul 17 12:19:52 2012
Last change: Tue Jul 17 11:16:30 2012 via crm_attribute on quantum-server-1
Stack: openais
Current DC: quantum-server-1 - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ quantum-server-1 quantum-server-2 ]

p_vip (ocf::heartbeat:IPaddr): Started quantum-server-1
p_quantum_server (ocf::openstack:quantum-server): Started quantum-server-1
Configure the Quantum Host flag so that Nova reaches the Quantum Server through the cluster’s virtual IP.
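A minimal sketch of what this looks like in nova.conf, assuming the Folsom-era flag names quantum_connection_host and quantum_connection_port (verify the option names against your own release):

```ini
# nova.conf (assumed Folsom-era flag names; check your release)
quantum_connection_host=192.168.2.150
quantum_connection_port=9696
```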
We actually use this IP (p_vip) to access the Quantum Server on TCP port 9696 (the default).
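To quickly verify that the API answers on the virtual IP, here is a small sketch (check_quantum_vip is a hypothetical helper; the IP and port come from the cluster configuration above):

```shell
# Hypothetical helper: test that the Quantum API port answers on the VIP.
# 192.168.2.150 and 9696 come from the cluster configuration above.
check_quantum_vip() {
    vip="${1:-192.168.2.150}"
    port="${2:-9696}"
    # Open a TCP connection with a 2-second timeout; succeeds only if
    # something is listening on the port.
    timeout 2 bash -c "exec 3<>/dev/tcp/$vip/$port" 2>/dev/null
}
```

For example: check_quantum_vip && echo "Quantum API is reachable on the VIP".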
Simulate a failure
We are going to put quantum-server-1 in standby :
crm node standby quantum-server-1
And check if Quantum-Server is working on Server 2 :
ps -ef | grep quantum-server
You should see quantum-server process.
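You can also confirm the failover from the cluster side with crm_mon -1: the resource line should now read "Started quantum-server-2". Here is a small sketch of parsing that line (failover_node is a hypothetical helper; sample_line mimics the crm_mon output shown earlier):

```shell
# Hypothetical helper: extract the node a resource is running on from a
# crm_mon status line (the real check is: crm_mon -1 | grep p_quantum_server).
failover_node() {
    # The node name is the last field of the resource status line.
    printf '%s\n' "$1" | awk '/p_quantum_server/ {print $NF}'
}

# Line as crm_mon would print it after the failover:
sample_line='p_quantum_server (ocf::openstack:quantum-server): Started quantum-server-2'
```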
To enable Server 1 after failure :
crm node online quantum-server-1
Note: depending on your resource-stickiness value, the process can stay on Server-2.
Feel free to share your own experience here; this is not a definitive solution yet, so let me know if something is wrong.
Of course, we don’t have full high availability, since the nova-network process does not support it [yet].
Thanks to Sebastien Han for his help! 😉