OpenBSD Journal

[c2k8]: Hackathon Summary Part 12

Contributed by mtu on from the not-dead-yet dept.

c2k8 General Hackathon (Part 12) - June 7-15, 2008, Edmonton, Alberta, Canada

The hackathons are extremely important for OpenBSD and related Projects. Even if you are not a developer, you can contribute and help by supporting any one of the many mini-hackathons held throughout the year. They are just as important as the big hackathon.

Read on to understand why:

I have always had great respect for Reyk Floeter (reyk@) and Claudio Jeker (claudio@). These guys are brilliant and they consistently keep adding very useful functionality for us to use. Moreover, what Claudio details below at Layer 3 reminds me of when I met Reyk for the first time, doing something similar at Layer 2. It is not quite the same, but the end goal is similar: to provide continued network connectivity.

At AUUG2005, I was watching Reyk give a talk on wireless support in OpenBSD. He topped it off with an incredible demonstration that showed off his new trunk(4) functionality. Essentially, he trunked a wireless interface in failover mode with the wired interface on his laptop and then streamed music across the wire over the network. He then pulled the cable and with a slight pause the music started streaming again but over the air. The audience was very impressed to say the least.

Claudio has just added an extremely useful routing trick of his own that I'm sure everyone will appreciate. Here is what Claudio had to say about his work, which was started at n2k8, worked on at c2k8 and completed at the h2k8 hackathon:

So I'm sitting here at h2k8 writing about c2k8, which was a long time ago. At the same time I'm slowly getting the stuff that piled up during n2k8 and c2k8 out of my forest of source trees. Theo once said that at hackathons stuff gets started or gets finished, but never both. There is just not enough time to write the code, test it, commit it and enjoy some beers.

I came to c2k8 with no real plan, just some initial ideas Bob (beck@) and I came up with in an onsen session at n2k8. So on the flight to Edmonton I started working on a mechanism to track link states in the routing table, plus the ability to run multiple dhclients at the same time without causing any conflicts. In Edmonton, krw@, beck@ and I sat together and figured out the last important bits. I first wanted to change dhclient to make it possible to define a routing priority at which routes are added to the kernel routing table, but Bob did not like that. After tossing some ideas around we came to the conclusion that an interface should get a priority that it passes on to the static routes set by route(8) and dhclient. At the beginning that sounded like a very complex solution, but the implementation ended up being simple: just add a priority field to struct ifnet and a set/get ioctl pair to twiddle the value. Pulling that value from ifnet into the route was just a matter of finding the right spot. Changing three lines in rtsock.c did the trick.

So I started to use wireless and wired network on my laptop at the same time, plugging and unplugging the cable. It seemed to work at first, but netstat caused an infinite loop in the kernel: the routing table got corrupted and the sysctl code freaked out. So I spent most of the week at c2k8 fixing one bug after another. The routing table is insane code around an amazingly twisted data structure.

After c2k8 the resulting monster diff was put into snapshots for a few weeks and three other bugs were found. Then the release came, and until now things went slowly. Here at h2k8 I split the diff into smaller pieces and hopefully everything can be committed during this week, so that running wired and wireless interfaces at the same time is finally not as painful as before.

So when everything is in, you can define an interface priority in ifconfig(8) or hostname.if, and dhclient will add routes with this priority added to the default static priority (currently 8). For example, on my laptop I now have:

em0: flags=8843 mtu 1500
        lladdr 00:0a:e4:23:e9:a9
        priority: 0
pgt0: flags=8802 mtu 1500
        lladdr 00:01:36:0d:d0:cb
        priority: 4

Resulting in a routing table like this:

> route -n show -inet
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            192.168.214.254    UGS        9   402884     -     8 em0
default            10.11.12.1         UGS        0        0     -    12 pgt0
10.11.12/24        link#4             UC         1        0     -     4 pgt0
10.11.12.1         link#4             UHLc       1        0     -     4 pgt0
192.168.214/24     link#1             UC         2        0     -     4 em0
192.168.214.254    link#1             UHLc       1     9920     -     4 em0

If em(4) is disconnected the routing table changes to:

> route -n show -inet
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            10.11.12.1         UGS        0        0     -    12 pgt0
default            192.168.214.254    GS         9   402888     -     8 em0
10.11.12/24        link#4             UC         1        0     -     4 pgt0
10.11.12.1         link#4             UHLc       1        0     -     4 pgt0
192.168.214/24     link#1             C          2        0     -     4 em0
192.168.214.254    link#1             HLc        0     9920     -     4 em0

The routes via em(4) are no longer marked Up, and the pgt(4) default route is now the most preferred route. Plugging em(4) back in brings back the previous table. But remember that unplugging em(4) will disconnect all open TCP sessions, because their source address is still bound to em(4). Making this work when the wired and wireless interfaces are on the same LAN is something for another flight to Canada or Japan.
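The behaviour described above can be modelled in a few lines. What follows is only a toy sketch in Python, not the kernel implementation: the `Route` class and the helpers `route_priority`, `best_route` and `set_link_state` are hypothetical names invented for illustration. It assumes, as the tables above suggest, that a static route gets the base priority of 8 plus the interface priority, and that the usable route with the lowest priority value wins.

```python
from dataclasses import dataclass

# Base priority for static routes, as quoted in the article ("currently 8").
STATIC_PRIORITY = 8


@dataclass
class Route:
    destination: str
    gateway: str
    iface: str
    prio: int
    up: bool = True


def route_priority(iface_priority):
    """Static route priority: base static priority plus interface priority."""
    return STATIC_PRIORITY + iface_priority


def best_route(routes, destination):
    """Return the usable route with the lowest priority value, or None."""
    candidates = [r for r in routes if r.destination == destination and r.up]
    return min(candidates, key=lambda r: r.prio) if candidates else None


def set_link_state(routes, iface, up):
    """Mark every route on an interface up or down, like a link-state change."""
    for r in routes:
        if r.iface == iface:
            r.up = up


# The two default routes from the tables above (em0 priority 0, pgt0 priority 4).
routes = [
    Route("default", "192.168.214.254", "em0", route_priority(0)),
    Route("default", "10.11.12.1", "pgt0", route_priority(4)),
]

print(best_route(routes, "default").iface)  # em0 while the cable is plugged in
set_link_state(routes, "em0", up=False)
print(best_route(routes, "default").iface)  # pgt0 after the cable is pulled
```

Note that the down route stays in the list, just as the em0 routes stay in the table without the U flag, so plugging the cable back in only needs to flip the state again.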

Thanks to all who organise such great events all over the world. Be it n2k8, c2k8 or h2k8, it is always great to meet friends and throw crazy ideas at each other until they crystallise into something even better. Last and least, I would like to thank the airline companies and their planes, with seats so crappy that it is impossible to sleep in them. Without them a lot of code would still be unwritten.

:wq Claudio

Reyk Floeter (reyk@) is someone who has given so much to OpenBSD without asking for much in return. Yet, he'll be the first one to tell you that he has received a lot already. He is definitely one of my favourite OpenBSD developers and not just because of his programming skills but because of his character and attitude.

OpenBSD seems to attract many like-minded developers and it is this collective group that makes OpenBSD so great. Of course, I also believe that a great leader (developer) attracts other great leaders (developers) and Reyk is definitely one of them.

Here is what Reyk had to say about his work and time at the c2k8 hackathon:

Just a month before the c2k8 General Hackathon in Edmonton, there was the n2k8 Network Hackathon in Ito, Japan. There were two things that I totally liked about this hackathon: the great location and the focus on a specific topic. I dedicated my work to relayd(8), to fix some bugs, solve PRs, and to implement fancy new features.

My work during c2k8 was less focused; I was hacking on several different things. I completed porting a new 10G Ethernet driver, hacked a little bit on IPv6 userland, finished some work on relayd(8), implemented a new feature for carp(4), and, strangely enough, worked on boot sector code. I also helped mpf@ with the trunk(4) lacp mode, for example by setting it up on the ProCurve switch and reviewing his diffs.

- The ix(4) driver supports the new Intel 82598EB PCI Express 10 Gigabit Ethernet adapters. The driver is based on the original ixgbe driver from FreeBSD, but with many changes for OpenBSD. I also didn't like their i-x-g-b-e name because it is just too long, so I renamed it to just ix(4) in OpenBSD. Like the em(4) and ixgb(4) drivers for Intel NICs, the driver is split into an OS-specific driver front-end and a portable hardware interface. The driver differs significantly from the FreeBSD driver in the bus DMA interface, the way the card shares memory with the host OS, the interrupt handling, and the driver API, but the hardware interface is mostly identical to the FreeBSD, Linux, and probably even the Windows version. There were two bits that I really didn't like about the original driver. The first is the non-standard integer typedefs like u8, u16, or s32; this is a bad habit from the Linux world and they should really use the portable integer types from stdint.h or sys/types.h. The second is the non-portable vtophys() interface used in their driver to convert virtual to physical memory addresses; I converted it to use the bus_dma(9) interface, which works on all platforms including sparc64. All in all, the ix(4) driver works pretty reliably and fast: I can get a net TCP performance of up to 2.9Gbps with tcpbench(1) or iperf with a single or multiple streams, and even with pf(4) enabled I can still get up to 2.5Gbps on a fast machine (thanks henning@ and mcbride@ for making pf so fast). And this is without any offloading capabilities enabled; the IP checksumming function seems to be a major bottleneck at the moment. It might be possible to get up to 3.5 or even 4.5Gbps, but this will need some more work in our network stack, and I have to fix the checksum offloading in the driver.

- The next thing I worked on was rtsold(8), the client side of IPv6 autoconfiguration that handles ICMPv6 Router Solicitation messages. While it could be compared to DHCP or dhclient(8) in the IPv4 world, it is limited to configuring the IPv6 addresses on non-router nodes. DHCP does a lot more, like providing additional options and the addresses of local DNS servers, NTP time servers, etc. In the IPv6 world this is a little bit more complicated. I synced rtsold(8) with the KAME version so that it understands the "Other Configuration" bit in router advertisements. It basically means that there is other configuration available, and rtsold(8) can launch an external script specified with the new -O flag to get the DNS options etc. In other words: rtsol receives a message with the IPv6 address configuration and the "Other Configuration" bit set and launches something like a DHCPv6 client, which finally gets additional configuration like DNS or NTP addresses. I tested it in conjunction with the WIDE DHCPv6 implementation, and I am thinking about importing it to base or ports at some point.

- relayd(8) is a moving target because I keep on finding new features to add or code to improve. I can also do some advanced tricks thanks to new interfaces like divert-to, divert-reply, SO_BINDANY, or sloppy states in pf(4). I started working on DSR (Direct-Server-Return) and a transparent mode for relays during n2k8 and finished it in Edmonton during c2k8. There was a nice article from Paul de Weerd describing all the details of these changes. In contrast to n2k8, where I had a few real machines to test my load balancing with relayd(8), I used a number of virtual machines running in qemu on my laptop to talk to each other. It is fun to have a network "to go", but it is also very slow, and my neighbours at the table must have heard my shouting and complaints about this...

- The Common Address Redundancy Protocol, also known as carp(4), is a very reliable and proven protocol that has been providing HA functionality in many OpenBSD setups for a long time. But there are a few edge cases where CARP might cause problems: if your network doesn't like the periodic multicast advertisements, or if the network policy doesn't allow multicast traffic at all. I was facing both kinds of problems: very specific networks where the Cizzco L2 WAN router or el-cheapo no-name switch went nuts seeing the CARP multicast packets, and OpenBGPd route servers running in an Internet eXchange point where multicast traffic is strictly verboten. During c2k8 I implemented a new option for CARP that allows sending the advertisements to the unicast address of the remote peer. The new "carppeer" option to ifconfig specifies either the IPv4 address of the remote CARP peer, which only works in 1:1 setups, or a different IPv4 multicast group to use instead of the default group 224.0.0.18. The "carppeer" option does not support IPv6, because multicast is kind of mandatory with IPv6 and unicast wouldn't make sense here.

- Well, I also hacked a diff to support extended partitions in disklabel, the boot loader, and the kernel. It allows the OpenBSD A6 partition to be placed in such an extended partition, which might be required if your hard disk is already stuffed with other operating systems. I started working on this before the hackathon, and had some inspiring discussions with Toby (weingart@) about MBRs, EBRs, LBAs, EDDs, CHSes, broken BIOSes, and the world of legacy low level interfaces.
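To make the "Other Configuration" bit from the rtsold(8) item above a bit more concrete, here is a minimal sketch in Python of how that flag can be read from a Router Advertisement. It is not taken from rtsold(8); it only assumes the fixed 16-byte header layout and the M (0x80) and O (0x40) flag bits defined in RFC 4861. The function name `parse_ra_flags` and the hand-built example packet are made up for illustration.

```python
import struct

ND_ROUTER_ADVERT = 134  # ICMPv6 type of a Router Advertisement (RFC 4861)
RA_FLAG_MANAGED = 0x80  # "M" bit: addresses are available via DHCPv6
RA_FLAG_OTHER = 0x40    # "O" bit: other configuration (DNS, NTP, ...) is available


def parse_ra_flags(packet):
    """Return (managed, other) from the fixed 16-byte RA header."""
    if len(packet) < 16:
        raise ValueError("truncated router advertisement")
    # Type, code, checksum, current hop limit, flags, router lifetime.
    icmp_type, code, _cksum, _hoplimit, flags, _lifetime = struct.unpack_from(
        "!BBHBBH", packet
    )
    if icmp_type != ND_ROUTER_ADVERT or code != 0:
        raise ValueError("not a router advertisement")
    return bool(flags & RA_FLAG_MANAGED), bool(flags & RA_FLAG_OTHER)


# Hand-built example header: hop limit 64, only the O bit set, lifetime 1800 s,
# reachable time and retransmit timer left at 0. The checksum is not computed.
example = struct.pack(
    "!BBHBBHII", ND_ROUTER_ADVERT, 0, 0, 64, RA_FLAG_OTHER, 1800, 0, 0
)

managed, other = parse_ra_flags(example)
if other:
    # This is the point where a client would launch its external script.
    print("other configuration available")
```

A real client would also have to validate the checksum and the sender before acting on the flag; the sketch skips that on purpose.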

I think that's most of it. Of course I also had a very nice time in Edmonton.

Reyk

I hope that you understand how important supporting the mini-hackathons is for the ongoing development of OpenBSD and related Projects. In my experience, it is not just money but rather a consistent effort by many that makes these mini-hackathons as successful as they have been. What can you do to help?

(c2k8 hackathon summary to be continued)


Comments
  1. By Anonymous Coward (194.78.205.247) on

    Speaking of CARP, is there an RFC or some other tech sheet available for it anywhere? Google refuses to point me in the right direction, and the manpage is a bit non-technical...

  2. By Anonymous Coward (89.179.112.231) on

    > What can you do to help?

    Keep on testing stuff and giving feedback.

  3. By jirib (89.176.141.36) to Reyk on

    Thanks a lot for your work.

    Please import dhcpv6 client into base so OpenBSD can work in IPv6 out of box, yeah other option bit looks great!

    Qemu? Did you mean Qemu or Qemu with kqemu? :)

    jirib

    Comments
    1. By jirib (89.176.141.36) on

      and have you heard about vde http://vde.sourceforge.net/ ? it works like virtual ethernet switch which can be of course joined with tun interface... I saw it in FreeBSD ports.

  4. By pkplex (pkplex) on http://127.0.0.1

    With the routing priorities, is it possible to use them in situations where there are two wired internet gateways, and if one becomes unresponsive, it fails over to the other?

    I guess it would be a userland application's job to remove an interface's 'UP' flag?

  5. By Anonymous Coward (62.212.210.34) on

    is claudio's cool stuff in the -current already? i've built the system from the sources but ifconfig does not show any prio's and ifconfig(8) mentions nothing about it.

  6. By Martin Toft (130.225.243.84) on

    Should it not be possible to achieve even higher speeds at some point in the future?
    http://www.myri.com/scs/performance/Myri10GE/

    Thanks for your hard work! Have a nice hackathon.
