Simple Stateful Load Balancer with iptables and NAT

NOTE: To demonstrate how iptables can perform network address translation, this how-to uses it to implement an oversimplified load balancer. In practice we would use a daemon such as HAProxy, which lets iptables check packets on the INPUT chain before they are handed on. With the method presented in this tutorial, forwarded packets never pass through the INPUT and OUTPUT chains; the only filter chain they traverse is FORWARD.

iptables is a powerful tool for creating rules that govern how incoming and outgoing packets are handled. It keeps track of a packet’s connection state, which is one of NEW, ESTABLISHED, RELATED, INVALID or UNTRACKED. It can make filtering decisions based on a packet’s header data and on its payload; for the latter, iptables even has a string-matching extension.

On top of that, iptables has extensions that can filter packets based on a packet’s history, so we can keep track of packets and sessions. We can set filters to trigger only at specific times, search the packet contents and header information for specific patterns, and differentiate protocols such as tcp, udp and icmp.

For load balancing behavior we want the incoming packets on one machine to be routed to another machine. iptables has extensions that help us achieve this, but we also need to muck around with the PREROUTING and POSTROUTING chains of its nat table, which is not recommended as it could pose a security risk. Let’s use iptables to take all traffic coming in on interface eth0 with destination port 80 and route it to another IP address:

Allow IP forwarding

(Note: if you’re testing this on the same box you’re configuring, it won’t work. You need at least 3 machines to test this out; virtual ones work nicely.)

First we enable IPv4 forwarding, or none of this will work. Either of the following two commands does it:
# echo "1" > /proc/sys/net/ipv4/ip_forward


# sysctl net.ipv4.ip_forward=1
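
Before going further it can help to confirm that the setting actually took. The snippet below is a minimal sketch that reads the same /proc entry the echo command writes to. The function name forwarding_enabled and its optional path argument are made up for this example and are not part of any tool.

```shell
# Report whether IPv4 forwarding is on by reading the kernel's flag file.
# The optional first argument (a path) is an assumption added so the check
# can be pointed at a different file; by default the real /proc entry is read.
forwarding_enabled() {
    flag_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is disabled"
fi
```

Neither of the two commands above survives a reboot; adding net.ipv4.ip_forward = 1 to /etc/sysctl.conf is the usual way to make the setting stick.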

Next we add a filter that changes the packet’s destination IP, and a second one that lets us masquerade:

# iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination
# iptables -t nat -A POSTROUTING -j MASQUERADE

The first filter above gets added to the PREROUTING chain of the nat table. Packets go through the filters in the PREROUTING chain before the kernel makes its routing decision. The filter says that all packets arriving on eth0 that use the tcp protocol and have destination port 80 will have their destination address changed to the address given to --to-destination. The DNAT target is what rewrites the packet’s destination IP address. Variations of this might include mapping to a different port on the same machine or perhaps to another interface altogether, which is how one could implement a simple stateful vlan (in theory).

The MASQUERADE target acts as a one-to-many NAT: it rewrites the source address of outgoing packets to the address of the interface they leave through, so replies come back via this machine, giving one centralized point of access. This is similar to how many commercial firewalls and network routers function.
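
The POSTROUTING rule shown earlier masquerades everything that leaves the box. One possible tightening, sketched below, is to masquerade only the traffic that the DNAT rule just redirected. The helper nat_rules is hypothetical, and 192.0.2.11 is a placeholder backend from the documentation address range, not an address from this setup. The function only prints the commands so they can be reviewed first.

```shell
# Print (not apply) a DNAT rule plus a MASQUERADE rule scoped to the
# translated traffic. Arguments: inbound interface, backend address.
nat_rules() {
    in_if=$1
    backend=$2
    # Rewrite the destination of inbound HTTP to the backend.
    echo "iptables -t nat -A PREROUTING -i $in_if -p tcp --dport 80 -j DNAT --to-destination $backend"
    # Masquerade only what was just redirected, not every forwarded packet.
    echo "iptables -t nat -A POSTROUTING -p tcp -d $backend --dport 80 -j MASQUERADE"
}

# 192.0.2.11 is a hypothetical backend (documentation address range).
nat_rules eth0 192.0.2.11
```

To apply the printed rules instead of just reading them, the output can be piped to sh as root.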

The above ruleset results in all incoming packets to dport 80 traversing the iptables chains in a straight line from INCOMING to OUTGOING in the image below (PREROUTING, FORWARD, POSTROUTING), effectively bypassing any rules we might have had in our INPUT chain. If we choose to implement NAT like this, we need to put our desired INPUT filter rules either on the machines the traffic is forwarded to, or in the FORWARD chain if we want to block things before they are forwarded. (Note: packets pass through the FORWARD chain in both directions, so direction needs to be considered when writing filters for this chain.)
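
As a rough illustration of direction-aware FORWARD filtering, the sketch below prints a small ruleset: tracked connections are allowed both ways, new connections are allowed only inbound to the backend’s port 80, and everything else is dropped. The helper forward_rules and the backend address 192.0.2.11 are assumptions made for this example. Note that by the time a packet reaches FORWARD its destination has already been rewritten by DNAT, so the rule matches on the backend’s address.

```shell
# Print (not apply) FORWARD-chain rules for one DNAT backend.
# Arguments: public-facing interface, backend address.
forward_rules() {
    wan_if=$1
    backend=$2
    # Replies and follow-up packets of tracked connections, in both directions.
    echo "iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
    # New connections: only inbound on the public interface, to the backend's port 80.
    echo "iptables -A FORWARD -i $wan_if -p tcp -d $backend --dport 80 -m conntrack --ctstate NEW -j ACCEPT"
    # Anything else that would be forwarded is dropped.
    echo "iptables -P FORWARD DROP"
}

# 192.0.2.11 is a hypothetical backend (documentation address range).
forward_rules eth0 192.0.2.11
```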

Path incoming packets take through iptables chains


Balancing the load

So now we have most of our ingredients for a simple stateful load balancer. We can forward incoming tcp traffic to a local computer that serves up a website, but only to one host: if we added more DNAT rules, their hosts would never receive any traffic, because the first rule in the PREROUTING chain would match all http packets. We need a way to demultiplex the incoming packets among several hosts.

Now we’ll implement a way to distribute traffic equally among all the hosts. We’ll use 4 hosts behind our node balancer. We can use the iptables statistic module to create a set of filters that send each new connection to a different host; for that we need to keep track of how many NEW packets come in. Only NEW packets may be counted, not ESTABLISHED or RELATED ones, otherwise a packet from a connection that’s already established could be routed to a different host than the one it was already “chatting” with, which could break the application.


Method 1:

One way to use the statistic module is to count new packets and reset our counter every time it reaches the number of hosts that are able to serve.

# iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -m state --state NEW -m statistic --mode nth --every 4 --packet 0 -j DNAT --to-destination
# iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -m state --state NEW -m statistic --mode nth --every 4 --packet 1 -j DNAT --to-destination
# iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -m state --state NEW -m statistic --mode nth --every 4 --packet 2 -j DNAT --to-destination
# iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -m state --state NEW -m statistic --mode nth --every 4 --packet 3 -j DNAT --to-destination

The above filters do the same as the previous example, except that now a counter is incremented every time a new packet arrives on port 80. Once the counter reaches 4 it resets. The counter’s value is what iptables matches against --packet to route the traffic appropriately. This is the kind of behavior expected from a load balancer, although on current kernels each rule keeps its own counter and only sees the packets that earlier rules did not match, so the --every values need the adjustment described in Danny Sauer’s comment at the end. But what if we didn’t want to distribute traffic equally to all the servers? What if one of them could handle less traffic than the other three?
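
Danny Sauer’s comment at the end of this page reports that these four rules leave almost a third of the traffic unforwarded, and suggests counting --every down from 4 to 1 instead. The sketch below prints rules in that form and then checks the arithmetic with a simplified model of the nth mode (one private counter per rule). The model is an approximation for illustration, not the kernel’s code. The function names and the 192.0.2.x backends are hypothetical.

```shell
# Print (not apply) one DNAT rule per backend, with --every counting down
# from the number of backends to 1, as the comment at the end suggests.
nth_rules() {
    remaining=$#
    for backend in "$@"; do
        echo "iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -m state --state NEW -m statistic --mode nth --every $remaining --packet 0 -j DNAT --to-destination $backend"
        remaining=$((remaining - 1))
    done
}

# Simplified model: each rule has its own counter, a packet is offered to the
# rules in order and stops at the first one that matches.
# Arguments: packet count, then one "every:packet" pair per rule.
simulate_nth() {
    packets=$1
    shift
    awk -v packets="$packets" -v rules="$*" 'BEGIN {
        n = split(rules, rule, " ")
        for (k = 1; k <= n; k++) {
            split(rule[k], f, ":")
            every[k] = f[1] + 0; offset[k] = f[2] + 0
            seen[k] = 0; hits[k] = 0
        }
        unmatched = 0
        for (p = 0; p < packets; p++) {
            matched = 0
            for (k = 1; k <= n && !matched; k++) {
                if (seen[k] % every[k] == offset[k]) { hits[k]++; matched = 1 }
                seen[k]++
            }
            if (!matched) unmatched++
        }
        for (k = 1; k <= n; k++) printf "%d ", hits[k]
        printf "unmatched=%d\n", unmatched
    }'
}

# Hypothetical backends (documentation address range).
nth_rules 192.0.2.11 192.0.2.12 192.0.2.13 192.0.2.14
simulate_nth 256 4:0 4:1 4:2 4:3   # prints: 64 48 36 27 unmatched=81
simulate_nth 256 4:0 3:0 2:0 1:0   # prints: 64 64 64 64 unmatched=0
```

In this model 81 of 256 new connections fall through all four of the original rules, which agrees with the “almost 1/3” figure in the comment, while the countdown form spreads them evenly.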

Method 2:

The statistic module includes another mode, random, which takes a probability. In this mode each rule matches a packet at random with the probability you give it. Say you have 2 servers: you could give each a probability of 50%, in which case the intended outcome would be the same as our counter example above. But if you wanted to split traffic based on some sort of performance criterion, you could specify the distribution:

# iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -m state --state NEW -m statistic --mode random --probability .25 -j DNAT --to-destination
# iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -m state --state NEW -m statistic --mode random --probability .25 -j DNAT --to-destination
# iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -m state --state NEW -m statistic --mode random --probability .25 -j DNAT --to-destination
# iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -m state --state NEW -m statistic --mode random --probability .25 -j DNAT --to-destination

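In this example ruleset the --probability flag sets the probability that a rule matches, and so how likely each host is to be hit with the traffic. The --mode flag has to be random if you’re using --probability. Because the rules are evaluated in order, each probability applies only to the traffic that earlier rules did not match; Danny Sauer’s comment at the end works through the arithmetic.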
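
The comment at the end gives the rule of thumb for these values: take the share of the original traffic you want a host to get and divide it by the share still left when its rule is reached. The sketch below applies that to a list of weights and prints the resulting rules. The helper prob_rules, the address:weight argument format (IPv4 only, since it splits on the colon) and the 192.0.2.x backends are all assumptions made for this example. As a design choice, the last rule carries no statistic match at all, so it catches whatever is left.

```shell
# Print (not apply) DNAT rules whose --probability values are conditional on
# the traffic left over by earlier rules. Arguments: "address:weight" pairs.
prob_rules() {
    total=0
    for pair in "$@"; do
        total=$((total + ${pair#*:}))
    done
    remaining=$total
    count=$#
    index=0
    base="iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -m state --state NEW"
    for pair in "$@"; do
        index=$((index + 1))
        backend=${pair%%:*}
        weight=${pair#*:}
        if [ "$index" -eq "$count" ]; then
            # Last backend: no statistic match, so it takes the rest.
            echo "$base -j DNAT --to-destination $backend"
        else
            # Wanted share divided by the share that is still unmatched.
            p=$(LC_ALL=C awk -v w="$weight" -v r="$remaining" 'BEGIN { printf "%.4f", w / r }')
            echo "$base -m statistic --mode random --probability $p -j DNAT --to-destination $backend"
        fi
        remaining=$((remaining - weight))
    done
}

# Four hypothetical backends with equal weights: .25, .3333, .5, then the rest.
prob_rules 192.0.2.11:1 192.0.2.12:1 192.0.2.13:1 192.0.2.14:1
```

Unequal weights work the same way: prob_rules 192.0.2.11:2 192.0.2.12:2 192.0.2.13:1 gives the third host a fifth of the traffic by using probabilities of 0.4000 and 0.6667 on the first two rules.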

All this having been said, it’s much better to use a daemon like HAProxy. With a daemon the incoming packet actually reaches and traverses the INPUT chain allowing for the load balancing machine to also filter maliciously crafted or incoherent packets before they get sent to the daemon to be forwarded.

  • Danny Sauer

    Have you tried this code? :) Because it doesn’t work on current Linux. In both cases (the nth method and the probability method), the issue is that the rules are independent of each other and terminal. I think that older iptables implementations used a shared counter, so the “nth” thing above did work before – but no longer. I did some testing on RHEL6 over the last couple of weeks, and found the slightly more difficult mechanism that now works.

    For the “nth” mechanism, you apparently need to do “--every 4 --packet 0”, then “--every 3 --packet 0” and so forth. The solution above sends 1 of every 4 packets to the first server. But then the next rule sends 1/4 of the remaining 3 packets which didn’t match the first rule. Then the next rule sends 1/4 of the remainder which made it through the previous rule, and so on. At the end of this scenario with four machines, almost 1/3 of the traffic ends up not forwarded. The way it works now is “one of every four packets match rule 1”, then “one of every 3 packets match rule 2” (because there are 3 which pass through), then “one of every 2 packets match rule 3” and “one of every 1 packets match rule 4”.

    The random distribution needs to work the same way for the same reason, but the math is slightly harder. Say you have four machines that you each want to send 25% of the traffic to. The first one is easy; you set probability to .25. But for the next rule, you want 25% of the original, while the value you put in describes the percentage in terms of the remaining 75%. That’s 25%/75%, or .3333. Then the third rule is only seeing 50% of the original traffic, so it needs to match 25%/50%, or .50. And the fourth one needs to match 100% of what came through (which should be “about” 25% of the original volume), so I just use an ACCEPT rule to make sure it catches “the rest”. So, you use the percentage of the original that you want to match, divided by the percentage of the traffic which remains after subtracting out the amount matched by the previous rules. Match 25% of 100%, 25% of 75%, 25% of 50%, 25% of 25%, or .25, .3333, .5, 1.0.