Contributed by dlg.
A long time ago I started working toward supporting active-active stateful firewalls. I was fortunate enough to be able to do it as part of my studies, which had the unfortunate side effect that I had to write a paper about it rather than actually fixing the bugs in the code. However, I'm happy to say that I finally got it all working, ran it in production, and even committed to the tree.
For those of you who remember the paper I wrote, there were two outstanding problems with the code that stopped it from working.
The first problem was that the deferral of the first packet in a state always hit the timeout before sending the packet on, rather than being released by an acknowledgement from a peer. The reason for this turned out to be pretty easy to fix. pf calls into pfsync twice when a state is created: once when the state is inserted into the state tree, and again soon after to ask whether the packet should be deferred or not. I originally thought these two calls were made in the reverse order (check for deferral before handling the insert), which meant my handling of the state was incorrect. As well as fixing the code to handle the correct order of operations, I also added a knob so packet deferral by pfsync can be turned on and off. By default it is off.
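The knob is just an interface flag, so flipping it looks something like this (a sketch of a session on an OpenBSD -current box, assuming a pfsync interface is already configured):

```shell
# Hold the first packet of a new state until a peer acknowledges the
# state insert (or the timeout fires):
ifconfig pfsync0 defer

# Back to the default: forward the first packet immediately.
ifconfig pfsync0 -defer
```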
The second problem was a lot harder to handle. When pfsync gets an update from its peers, it has to merge their details into the local state tree and figure out whether the local tree holds changes the peers need to know about for that state. This code made my head hurt, but eventually, through some guesswork and a lot of testing, I think I've got it right.
So pfsync now works if you run traffic over both legs of your firewalls. I'm doing this on my firewalls at work, and it works surprisingly well.
In my setup I have 30 vlans trunked over a single em(4) controller in each firewall. 29 of these vlans are considered internal networks. Routing for these internal networks is provided by carp interfaces on the firewalls. At the moment carp is set up so only one of the firewalls is the master on any particular vlan. The 30th vlan is the network I talk to the upstream provider on. We use OSPF on that interface to advertise the networks I host and to learn the default route and so on from our provider. ospfd is configured to announce only the networks whose carp interfaces are master.
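A minimal sketch of what one internal vlan's configuration might look like on fw1 (all interface names, addresses, vhids, and the password are invented for illustration, and the exact syntax may differ between OpenBSD versions; see vlan(4), carp(4), and ospfd.conf(5)):

```
# /etc/hostname.vlan1 -- one of the 29 internal vlans
vlan 1 vlandev em0
inet 192.0.2.2 255.255.255.0

# /etc/hostname.carp1 -- the gateway address for that vlan; the lower
# advskew wins the master election, so fw1 masters this vlan and fw2
# (configured with a higher advskew) backs it up
inet 192.0.2.1 255.255.255.0 vhid 1 carpdev vlan1 pass examplepass advskew 0

# /etc/ospfd.conf fragment -- tie the announcement of the network to
# the carp interface's master state
redistribute 192.0.2.0/24 depend on carp1
```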
Because ospfd only announces the networks that the particular firewall is a carp master for, traffic in and out of the internal network tends to flow in and out over the same firewall. This helps localize the state updates for a significant portion of our traffic, reducing the need for pfsync to exchange information for those particular flows. In fact, if my networks only ever talked to hosts via the upstream routers, the previous version of pfsync would have worked fine for me, simply because the state updates from actual traffic were always on the same firewall.
However, the new pfsync code is necessary when my internal networks talk to each other. The problem occurs if I have two vlans, eg, vlan1 and vlan2, and the carp master for each of these networks is on a different firewall. Let's call the firewalls fw1 and fw2 for the sake of this discussion. If fw1 is the master for vlan1, traffic from vlan1 to vlan2 will flow into fw1 and out of it again onto vlan2, but the replies from vlan2 will come into fw2 and out of it again. This is the split-brain setup that pfsync previously could not cope with. In this situation pfsync will now detect that the traffic for this one state is flowing over both firewalls and will start to exchange updates for it more rapidly.
There are some limits on how fast traffic in a split-brain setup will move because of how pfsync traffic is mitigated now. The TCP windows in a state will only progress as fast as pfsync exchanges updates between your peers, which in turn limits how fast TCP can ramp up. In my particular environment this hasn't been a problem though. We just don't do enough high speed TCP transfers to be affected by this.
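As a rough back-of-the-envelope illustration of that limit (all numbers invented, not measured from the original setup): if a split-brain state's window can only advance once per pfsync exchange, the exchange interval behaves like extra round-trip time in the classic window/RTT throughput bound.

```python
# Illustrative arithmetic only: treat the pfsync update exchange
# interval as added effective round-trip time for a split-brain TCP
# state. All numbers below are made up for the example.

def max_throughput(window_bytes, rtt_s, pfsync_interval_s):
    """Classic window/RTT throughput bound, with the pfsync exchange
    interval added to the effective RTT."""
    return window_bytes / (rtt_s + pfsync_interval_s)

window = 64 * 1024   # 64 KB TCP window
rtt = 0.001          # 1 ms LAN round trip
pfsync = 0.01        # hypothetical 10 ms between pfsync exchanges

# Single-firewall path: bounded only by the real RTT.
print(max_throughput(window, rtt, 0))       # roughly 64 MB/s
# Split-brain path: the pfsync interval dominates the bound.
print(max_throughput(window, rtt, pfsync))  # roughly 6 MB/s
```

The point is only the shape of the effect: the larger the pfsync exchange interval relative to the real RTT, the harder the split-brain path caps TCP throughput.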
Anyway, to prove that I am doing active-active now, here are some graphs showing the traffic seen on the switch ports my 30 vlans are trunked over. The graph starts with the firewalls in active-passive. See if you can pick when I switched the master role, then switched to active-active and then chickened out. I manned up again shortly after though and it's been running active-active since then.
These changes are in -current now, and hopefully in snapshots too. I'm extremely keen for people to try them out (don't forget to go ifconfig pfsync0 defer) and see how their setups behave. I'd love to see how it interacts with carp load balancing, or with setups using routing protocols and multiple routes. I'd especially love to know what performance limits people hit with active-active too.
Again, thanks must go to Ryan McBride for helping me figure this stuff out, and to Stuart Henderson for testing my changes.