How to Set Up a High Availability Pair
A single load balancer is a single point of failure. Tula supports active-passive high availability (HA) using VRRP (Virtual Router Redundancy Protocol) via keepalived and configuration synchronisation via csync2. In an HA pair, one node actively handles traffic while the standby monitors the primary and takes over automatically if it fails. Failover typically completes in under two seconds.
Prerequisites
- Two Tula appliances on the same network segment with unique management IP addresses.
- Shared VLAN -- Both nodes must be on the same broadcast domain for VRRP communication.
- Matching firmware versions for configuration compatibility.
- Available floating IP addresses -- These VIP addresses migrate between nodes during failover.
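Before starting, it is worth sanity-checking that the two management addresses really fall within the same subnet, since VRRP advertisements do not cross broadcast domains. A minimal shell sketch, assuming the example addresses used throughout this guide and a /24 prefix:

```shell
#!/bin/sh
# Example addresses from this guide; adjust for your environment.
PRIMARY=10.0.1.10
SECONDARY=10.0.1.11
PREFIX=24   # assumed subnet size

# Convert a dotted-quad address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

mask=$(( (0xFFFFFFFF << (32 - PREFIX)) & 0xFFFFFFFF ))
a=$(ip_to_int "$PRIMARY")
b=$(ip_to_int "$SECONDARY")

if [ $(( a & mask )) -eq $(( b & mask )) ]; then
  echo "same subnet"
else
  echo "different subnets -- VRRP will not work"
fi
```

With the addresses above, this prints "same subnet". If it does not, fix the addressing before proceeding; none of the later steps will work across subnets.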
Step 1: Configure the Primary Node
- Log in to the first Tula appliance and navigate to System > Network.
- Set the Hostname (e.g., tula-primary), Management IP (e.g., 10.0.1.10 -- this stays with the physical node), Default Gateway, and DNS Servers.
- Click Save and apply.
Step 2: Configure the Secondary Node
- Log in to the second appliance and navigate to System > Network.
- Set a different Hostname (e.g., tula-secondary) and Management IP (e.g., 10.0.1.11). Use the same gateway and DNS.
- Click Save and apply.
Step 3: Enable Cluster Mode on the Primary
- On the primary node, navigate to Cluster > Configuration.
- Click Enable Cluster.
- Configure the cluster settings:
- Cluster Name: A name for this HA pair (e.g., production-cluster).
- This Node Role: Select Primary.
- Peer Address: Enter the management IP of the secondary node (10.0.1.11).
- Sync Interface: Select the network interface used for inter-node communication. This should be the management network interface.
- Authentication Key: Set a shared secret for VRRP authentication. Both nodes must use the same key.
- Click Save.
Step 4: Join the Secondary to the Cluster
- On the secondary node, navigate to Cluster > Configuration.
- Click Join Cluster.
- Configure:
- This Node Role: Select Secondary.
- Peer Address: Enter the management IP of the primary node (10.0.1.10).
- Authentication Key: Enter the same shared secret configured on the primary.
- Click Save.
- The secondary will initiate a configuration sync from the primary. Wait for the sync to complete -- the status indicator will change to Synchronised.
Step 5: Configure Floating IPs (VRRP)
Floating IPs are the addresses clients connect to. They migrate automatically to the standby during failover.
- On the primary node, navigate to Cluster > Floating IPs and click Add Floating IP.
- Configure: IP Address (e.g., 10.0.1.100), Interface, VRRP Group (a numeric ID from 1-255, unique per floating IP), and Priority (the primary should have the higher value, e.g., 150 on the primary and 100 on the secondary).
- Repeat for additional floating IPs.
- Click Save and apply on both nodes.
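Under the hood, each floating IP maps to a VRRP instance in keepalived. The following is an illustrative sketch of what the primary's settings above roughly correspond to -- the interface name eth0 is an assumption, and the appliance manages the real file itself:

```
vrrp_instance VI_1 {
    state MASTER              # primary node; the secondary uses BACKUP
    interface eth0            # assumed interface name
    virtual_router_id 1       # the VRRP Group ID from this step
    priority 150              # the secondary uses the lower value, e.g. 100
    advert_int 1              # advertisement interval in seconds
    authentication {
        auth_type PASS
        auth_pass s3cr3t      # the shared Authentication Key from Step 3
    }
    virtual_ipaddress {
        10.0.1.100/24         # the floating IP clients connect to
    }
}
```

The failover behaviour follows directly from this: if the Master's advertisements stop arriving, the Backup promotes itself and claims the virtual address.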
Step 6: Verify Configuration Sync
Tula uses csync2 to keep configuration identical across nodes.
- Navigate to Cluster > Status on either node.
- Confirm: Sync Status shows Synchronised, Peer Status shows Online, and VRRP Status shows the primary as Master and secondary as Backup.
- Make a test change on the primary and verify it appears on the secondary within seconds.
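csync2 itself is driven by a small configuration file listing the hosts and paths to keep in sync. A hypothetical sketch of what such a file looks like, using the hostnames from this guide (the include path is an assumption; the appliance manages the real file internally):

```
# /etc/csync2/csync2.cfg (illustrative only)
group tula_cluster {
    host tula-primary tula-secondary;
    key /etc/csync2/csync2.key;   # pre-shared key authenticating the peers
    include /etc/tula/;           # assumed configuration directory
    auto younger;                 # on conflict, keep the newer copy
}
```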
Step 7: Test Failover
Testing failover before relying on it in production is essential.
- From a client machine, start a continuous test against a floating IP:
while true; do curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" http://10.0.1.100/; sleep 1; done
- On the primary node, simulate a failure. You can do this by navigating to Cluster > Actions and clicking Trigger Failover, or by shutting down the primary appliance.
- Observe the client output. You should see a brief interruption (1-3 seconds) followed by successful responses as the secondary takes over.
- Check Cluster > Status on the secondary -- it should now show as Master.
- Bring the primary back online. By default, the primary will reclaim the VIP (preemption). Verify that traffic shifts back, again with only a brief interruption.
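Alongside the web UI, you can check directly from a node's shell which one currently holds a floating IP, assuming SSH access and standard Linux iproute2 tooling (eth0 and 10.0.1.100 are the example values from this guide):

```shell
#!/bin/sh
# has_vip IFACE ADDR -- prints "Master" if ADDR is configured on IFACE,
# otherwise "Backup". Relies on the iproute2 `ip` command.
has_vip() {
  if ip -4 addr show dev "$1" 2>/dev/null | grep -q "inet $2/"; then
    echo "Master"
  else
    echo "Backup"
  fi
}

# Example values from this guide:
has_vip eth0 10.0.1.100
```

Running this on both nodes during a failover test should show exactly one "Master" at any moment.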
Recovery Procedures
Primary recovery: VRRP preemption is enabled by default -- the primary automatically reclaims the Master role when it comes back online. Disable preemption in cluster settings if you prefer manual failback.
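In keepalived terms, manual failback corresponds to the nopreempt option. An illustrative fragment (note that nopreempt only takes effect when both nodes are configured with state BACKUP; keepalived ignores it on a node declared MASTER):

```
vrrp_instance VI_1 {
    state BACKUP    # required on both nodes when nopreempt is used
    nopreempt       # a returning higher-priority node stays Backup
    priority 150    # priorities still break ties on a fresh election
}
```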
Split-brain prevention: VRRP's priority mechanism ensures only one node holds the VIP. Use a reliable sync interface and consider a dedicated heartbeat network for critical deployments.
Resync after extended outage: Trigger a manual resync from Cluster > Actions > Force Sync before allowing a long-offline node to accept traffic.