OpenBSD Journal

Call for testing: em(4) TX interrupt mitigation

Contributed by Peter N. M. Hansteen from the testing needed, em(4)phatically dept.

Are you an OpenBSD user with a low power device such as a PC Engines APU2, with one or more em(4) network interfaces?

Darren Tucker (dtucker@) has a new diff out that may be of use to you, posted in a message to tech@:

List:       openbsd-tech
Subject:    em(4) TX interrupt mitigation
From:       Darren Tucker <dtucker () dtucker ! net>
Date:       2025-05-19 8:52:13

Hi.

TL;DR: if you use em(4), particularly on a low-power device such as a
pcengines APU2, please try this diff.

The em(4) driver has 5 interrupt mitigation timers[0].
In each direction there's a "Packet Timer" that is reset each time a
packet is processed, and an "Absolute Timer" that is reset each time
an interrupt happens.  The Packet Timer lets it wait a little while
for another packet, but the Absolute Timer makes sure it doesn't wait
too long.

In OpenBSD's em(4), these values are (in approximately microseconds):

Transmit Packet Timer (EM_TIDV) = 64
Transmit Absolute Timer (EM_TADV) = 64
Receive Packet Timer (EM_RDTR) = 0
Receive Absolute Timer (EM_RADV) = 64

You will note that the Receive Packet Timer is set to zero, so it
will generate an interrupt for each packet.  This also means that the
corresponding Absolute Timer is also effectively disabled.  There's a
comment that says "CAUTION: When setting EM_RDTR to a value other than 0,
adapters may hang (stop transmitting) under certain network conditions."
We'll examine that one later.

There's also an "Interrupt Throttle Timer" (ITR), which is set
(DEFAULT_ITR) to only allow a maximum of ~8000 interrupts per second,
which is consistent with what you see in "systat vm 1" on a fully loaded
interface. Since it's at that limit, it would seem that interrupt rates
are a limiting factor.  The interrupt handler processes both TX and RX
regardless of the source of the interrupt.

Looking at the TX interrupt mitigation, the value of 64 seems to have
come from the FreeBSD driver in 2002[1] where TIDV was reduced from
128 to 64 and TADV was added.  How many packets can happen in 64 usec?
At 1Gb, a 1500 byte packet plus its overhead takes (1538*8)/1e9 seconds
= 12.3 usec, so about 5.  But wait, em(4) supports jumbo packets, which
would take (9254*8)/1e9 = 74 usec!  Since this is more than the maximum
holdoff timer, it means we're taking a TX completion interrupt for every
jumbo frame sent.  The TX ring holds 256 or 512 packets depending on
NIC model, so we're not making very effective use of it.

What can we increase this to?  Well the worst case would seem to be
back-to-back transmission of minimum size (64byte) packets at 1Gb/s while
also receiving nothing.  Each packet takes about 0.8 usec, so if we want
to make sure the interface never runs out of packets to transmit we
need to refill the ring before it's completely empty.  220 should just fit
3 jumbo packets while still leaving a little headroom.  Note that actually
sending traffic while receiving absolutely nothing is difficult to achieve
in practice, since there will likely be replies and various other traffic.

In my testing with iperf on an APU2 with TSO disabled and hw.setperf=0, I see
RX go up ~10% (334Mb/s -> 362Mb/s), TX go up ~25% (600Mb/s -> 750Mb/s),
and CPU usage go down by ~60% (nearly 100% of 1 core down to ~40%).

With hw.setperf=100 the speed doesn't change much, but the CPU goes down
by about the same amount.

Comments and test reports welcome.

[0] https://www.intel.com/content/dam/doc/application-note/gbe-controllers-interrupt-moderation-appl-note.pdf
[1] https://github.com/freebsd/freebsd-src/commit/a58e485d

and the rest of the message has the diff (against -current) that Darren would like your feedback on.
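
For readers who want to check the timing arithmetic in the message before testing, here is a minimal Python sketch. It is not part of Darren's diff. The frame sizes (1538 and 9254 bytes on the wire), the 1Gb/s link speed, the timer values 64 and 220, and the ~8000 interrupts per second limit all come from the message. The 1.024 usec timer tick is an assumption taken from Intel's register documentation (the message only says "approximately microseconds"), and the function names are made up for illustration.

```python
# Illustrative arithmetic only; not taken from the em(4) driver or the diff.

LINK_BPS = 1e9       # 1Gb/s link, as in the message
TICK_USEC = 1.024    # ASSUMPTION: one timer unit is 1.024 usec (Intel docs)

STANDARD_FRAME = 1538  # 1500-byte packet plus overhead (figure from the message)
JUMBO_FRAME = 9254     # jumbo packet plus overhead (figure from the message)


def wire_time_usec(bytes_on_wire, link_bps=LINK_BPS):
    """Time to put one frame on the wire, in microseconds."""
    return bytes_on_wire * 8 * 1e6 / link_bps


def frames_per_holdoff(timer_ticks, bytes_on_wire):
    """Whole frames that can be sent back to back before the timer expires."""
    holdoff_usec = timer_ticks * TICK_USEC
    return int(holdoff_usec // wire_time_usec(bytes_on_wire))


def min_interrupt_gap_usec(max_ints_per_sec):
    """Minimum spacing between interrupts implied by a rate limit."""
    return 1e6 / max_ints_per_sec


if __name__ == "__main__":
    for ticks in (64, 220):  # current TX timer value and the value discussed
        std = frames_per_holdoff(ticks, STANDARD_FRAME)
        jumbo = frames_per_holdoff(ticks, JUMBO_FRAME)
        print(f"{ticks} ticks: {std} standard, {jumbo} jumbo")
    print(f"ITR gap at 8000/s: {min_interrupt_gap_usec(8000)} usec")
```

Under these assumptions a timer value of 64 covers 5 standard frames but no complete jumbo frame, while 220 covers 3 jumbo frames, which matches the figures in the message. The 8000 interrupts per second limit works out to one interrupt every 125 usec at most.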

So here's your chance to contribute back to our favorite operating system. If you are able to test this, please go ahead!

