LVS Load Balancing Cluster

Overview

This document describes how to build a fault-tolerant load balancing cluster using LVS and Keepalived on Debian Lenny. LVS routes traffic based on internal firewall marks that iptables adds to incoming packets. The example here uses firewall mark 100 to manage this internal routing for the load balanced IP address 192.168.1.100. Note that LVS refers to the mark in decimal (100), while iptables displays the hex equivalent (0x64).

Prerequisites and Assumptions

  • Debian Lenny - clean install
  • Internal network: 192.168.1.0/24
    • 192.168.1.250 - LVS cluster IP
    • 192.168.1.251 - lvs1
    • 192.168.1.252 - lvs2
  • Crossover network: 172.16.1.0/24
    • 172.16.1.251 - lvs1
    • 172.16.1.252 - lvs2
  • Load balanced IP: 192.168.1.100, across two servers
    • web1 - 192.168.1.10
    • web2 - 192.168.1.20
  • Firewall mark 100 (hex 0x64) in iptables

Web Servers - web1 and web2

The following items need to be completed on web1 and web2 in order for the content on these servers to be load balanced.

Loopback Network Adapter

Traffic from the load balancers to the web servers is forwarded at layer 2, so each packet arrives still addressed to the load balanced IP address. This IP address therefore needs to be bound to the loopback interface with a /32 netmask, and the network stack must be configured to allow packets to route from the primary network interface to the loopback. The method of binding this IP varies by platform, but here are some general tips.

Linux

Add the following to /etc/sysctl.conf to keep the host from answering ARP requests for the load balanced IP on behalf of the loopback. If this is not configured correctly, multiple servers can answer ARP for the same address and traffic will be routed unpredictably.

net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1

Make sure these changes are active with

sysctl -p
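The load balanced IP can then be bound to the loopback with a /32 netmask. On Debian, a persistent binding might look like the following /etc/network/interfaces fragment (a sketch; the `lo:0` alias name is arbitrary):

```shell
# /etc/network/interfaces fragment (sketch): bind the load balanced IP
# to a loopback alias with a /32 mask (255.255.255.255) so the host
# accepts traffic addressed to it without advertising it via ARP
auto lo:0
iface lo:0 inet static
    address 192.168.1.100
    netmask 255.255.255.255
```

The same binding can be applied immediately with `ip addr add 192.168.1.100/32 dev lo`.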

Windows 2003

Windows 2003 does not allow a /32 subnet mask through the network configuration interface (even though it is valid), so add the IP with a /24, then change the mask to 255.255.255.255 by editing the registry key at

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{blah}\SubnetMask

Also, set the DNS server for the loopback by editing the following registry key

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{blah}\NameServer

Windows 2008

Windows 2008 does not allow traffic to be routed between network interfaces by default, so this needs to be configured per-interface.

  1. Find the interface name from the command line
    netsh interface show interface
    
    Admin State    State          Type             Interface Name
    -------------------------------------------------------------------------
    Enabled        Connected      Dedicated        Loopback Adapter
    Disabled       Disconnected   Dedicated        Local Area Connection 2
    Enabled        Connected      Dedicated        Local Area Connection
    
  2. Configure the loopback and the primary network interface to allow inter-interface traffic
    netsh interface ipv4 set interface "Loopback Adapter" weakhostreceive=enabled store=persistent
    netsh interface ipv4 set interface "Loopback Adapter" weakhostsend=enabled store=persistent
    netsh interface ipv4 set interface "Loopback Adapter" forwarding=enabled store=persistent
    netsh interface ipv4 set interface "Local Area Connection" weakhostreceive=enabled store=persistent
    netsh interface ipv4 set interface "Local Area Connection" weakhostsend=enabled store=persistent
    netsh interface ipv4 set interface "Local Area Connection" forwarding=enabled store=persistent
    
  3. The network interfaces may need to be restarted for these settings to take effect

Current network interface settings can be viewed with the following

netsh interface ipv4 show interface "Loopback Adapter"
netsh interface ipv4 show interface "Local Area Connection"

Web Server Configuration

Because packets arrive at the server still addressed to the load balanced IP, the web server software must be configured to listen on that address, which is bound to the loopback interface.

Health Test

A test page of some sort must be configured on the web server so that the load balancers can determine the health of each server before sending traffic. This test page should be a small, lightweight page that exercises any systems required for proper site functionality; database connectivity is a common example. Whatever testing is done must produce byte-identical output on every successful test, and different output on failure. The load balancer computes an MD5 hash of the response body, and if the hash differs from the expected digest, traffic is not sent to that server.

This example assumes that /lb.php is configured as the health test.
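The hash comparison is strict, so a page that emits any varying content (timestamps, request IDs) will fail the check even when the backend is healthy. A quick shell illustration of the constancy requirement (plain md5sum; no assumptions about the page itself):

```shell
# A constant response body always produces the same MD5 digest,
# which is what the load balancer's health check relies on
h1=$(printf 'LB-OK' | md5sum | cut -d' ' -f1)
h2=$(printf 'LB-OK' | md5sum | cut -d' ' -f1)
test "$h1" = "$h2" && echo "constant output: digests match"

# Any variation in the body (e.g. an embedded timestamp) changes the
# digest and would cause the server to be marked unhealthy
h3=$(printf 'LB-OK %s' "$(date)" | md5sum | cut -d' ' -f1)
```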

Ferm - lvs1 and lvs2

Ferm is a wrapper language around iptables that makes creating and maintaining rules much easier. The same rules can also be created with iptables directly.

  1. Install ferm
    aptitude install ferm
  2. Create the fwmark function that will be used later for the load balanced IP addresses in /etc/ferm/ferm.conf
    def &LVS_MARK($pub, $mark) = {
            table mangle chain PREROUTING daddr $pub proto tcp MARK set-mark $mark;
    }
    
  3. Load the required kernel modules
    modprobe iptable_mangle
    modprobe xt_multiport
    modprobe xt_MARK
    modprobe ip_vs
    modprobe ip_vs_rr
    modprobe ip_vs_nq
    modprobe ip_vs_wlc
  4. Force these modules to load automatically at boot by adding the following lines to /etc/modules
    iptable_mangle
    xt_multiport
    xt_MARK
    ip_vs
    ip_vs_rr
    ip_vs_nq
    ip_vs_wlc
  5. Configure ferm to mark incoming packets with firewall mark 100, by adding the following to /etc/ferm/ferm.conf
    &LVS_MARK(192.168.1.100, 0x0064);
  6. Reload ferm to activate this new rule
    /etc/init.d/ferm reload

    The new rule should now be visible in the iptables ruleset

    # iptables -t mangle -L PREROUTING
    Chain PREROUTING (policy ACCEPT)
    target     prot opt source               destination         
    MARK       tcp  --  anywhere             192.168.1.100         tcp MARK xset 0x64/0xffffffff 
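For reference, the &LVS_MARK function call above expands to roughly the following raw iptables command, for anyone skipping ferm:

```shell
# Equivalent rule without ferm: mark TCP packets destined for the
# load balanced IP with firewall mark 100 (0x64) in the mangle table
iptables -t mangle -A PREROUTING -d 192.168.1.100 -p tcp -j MARK --set-mark 100
```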
    

Keepalived and LVS - lvs1 and lvs2

  1. Install keepalived. LVS is a dependency of this package, and will be installed automatically.
    aptitude install keepalived
  2. Edit /etc/keepalived/keepalived.conf. The router_id must be unique on each server; the rest of the configuration should be identical.
    global_defs {
            router_id LVS1
    }
    
    vrrp_instance vi_100 {
            interface eth1
            lvs_sync_daemon_interface eth1
            dont_track_primary
            track_interface {
                    eth0
            }
            state BACKUP
            priority 90
            nopreempt
            virtual_router_id 100
            garp_master_delay 1
            authentication {
                    auth_type PASS
                    auth_pass $up3r$3(r37
            }
            virtual_ipaddress {
                    192.168.1.250 dev eth0
            }
            virtual_ipaddress_excluded {
                    192.168.1.100 dev eth0
            }
    }
    
    include /etc/keepalived/conf.d/*.conf
    
  3. Find the hash of the health test page. This will be used in the next step
    genhash -s 192.168.1.10 -p 80 -u /lb.php
    MD5SUM = 75ad4ab44c3ad9f892776b2487173724
  4. Create a new configuration file for this firewall mark at /etc/keepalived/conf.d/fwm-100.conf. Use the MD5SUM from above for the digest entries here.
    virtual_server fwmark 100 {
            delay_loop 30
            lb_algo wlc
            lb_kind DR
            protocol TCP
    
            ! web1
            real_server 192.168.1.10 0 {
                    weight 50
                    HTTP_GET {
                            url {
                                    path /lb.php
                                    digest 75ad4ab44c3ad9f892776b2487173724
                            }
                            connect_port 80
                            connect_timeout 5
                            nb_get_retry 3
                            delay_before_retry 3
                    }
            }
    
            ! web2
            real_server 192.168.1.20 0 {
                    weight 50
                    HTTP_GET {
                            url {
                                    path /lb.php
                                    digest 75ad4ab44c3ad9f892776b2487173724
                            }
                            connect_port 80
                            connect_timeout 5
                            nb_get_retry 3
                            delay_before_retry 3
                    }
            }
    
    }
    
  5. Start keepalived on both lvs1 and lvs2
    /etc/init.d/keepalived start

    192.168.1.100 and 192.168.1.250 should now be bound to eth0 on one of the servers; check this with

    # ip addr show dev eth0
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:7d:af:a3 brd ff:ff:ff:ff:ff:ff
        inet 192.168.1.251/24 brd 192.168.1.255 scope global eth0
        inet 192.168.1.250/32 scope global eth0
        inet 192.168.1.100/32 scope global eth0
        inet6 fe80::20c:29ff:fe7d:afa3/64 scope link 
           valid_lft forever preferred_lft forever
    

    And both web1 and web2 should be listed in the IPVS table

    # ipvsadm
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    FWM  100 wlc
      -> 192.168.1.10:0                 Route   50     0          0
      -> 192.168.1.20:0                 Route   50     0          0
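Failover can be verified by stopping keepalived on whichever node currently holds the addresses. The commands below are a sketch (run as root, using the node layout from this setup):

```shell
# On the current master, stop keepalived to simulate a failure
/etc/init.d/keepalived stop

# On the other node, the addresses should appear within a few seconds
ip addr show dev eth0   # expect 192.168.1.250/32 and 192.168.1.100/32
ipvsadm -L -n           # the FWM 100 virtual service should be active here
```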

Other Considerations

  • If the application being load balanced does not allow sharing session state between the servers, a persistence timeout is likely required. This forces all traffic from a given source IP address to the same server until that IP has been idle for longer than the configured timeout (in seconds). This setting goes in the virtual_server block in /etc/keepalived/conf.d/fwm-100.conf
    virtual_server fwmark 100 {
            delay_loop 30
            lb_algo wlc
            lb_kind DR
            protocol TCP
            persistence_timeout 3600
    [...snip...]
    
  • If the web servers use name-based virtual hosts and separate load balancer groups are required for each site, a virtualhost entry needs to be configured so that the test page is processed by the correct site. This also goes in the virtual_server block in /etc/keepalived/conf.d/fwm-100.conf
    virtual_server fwmark 100 {
            delay_loop 30
            lb_algo wlc
            lb_kind DR
            protocol TCP
            virtualhost www.example.com
    [...snip...]
    
  • A "sorry server" can be configured to handle traffic in the event that all web servers are failing the health test. This server should have a network configuration similar to the real web servers, and should return a simple "sorry" page for any URL it receives. This also goes in the virtual_server block in /etc/keepalived/conf.d/fwm-100.conf
    virtual_server fwmark 100 {
            delay_loop 30
            lb_algo wlc
            lb_kind DR
            protocol TCP
            sorry_server 192.168.1.30 0
    [...snip...]
    

    This sorry server functionality can be handled by the LVS servers themselves, as long as the content served is extremely lightweight. To do so, bind the load balancer IP to the loopback and suppress ARP just like on web1 and web2. Then, set the sorry server to

    sorry_server 127.0.0.1 0
  • Any changes to the ferm or keepalived configuration need to be propagated to both servers; the only difference should be the router_id in /etc/keepalived/keepalived.conf. A simple script is enough to keep these in sync, but be sure to reload both services after any change.
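Such a sync script might look like the following sketch. It assumes root SSH access from lvs1 to lvs2 over the crossover network, and that keepalived.conf itself is maintained by hand on each node because of the differing router_id:

```shell
#!/bin/sh
# Hypothetical config sync, run on lvs1. Copies the shared configuration
# to lvs2 over the crossover link, then reloads both services on both nodes.
set -e
rsync -a /etc/ferm/ferm.conf 172.16.1.252:/etc/ferm/ferm.conf
rsync -a /etc/keepalived/conf.d/ 172.16.1.252:/etc/keepalived/conf.d/
ssh 172.16.1.252 '/etc/init.d/ferm reload; /etc/init.d/keepalived reload'
/etc/init.d/ferm reload
/etc/init.d/keepalived reload
```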