OpenBSD Journal

s2k15 Hackathon Report: tedu@ on UVM SMP

Contributed by tj on from the not-my-uvm-fault dept.

Our fourth report from the s2k15 hackathon comes from Ted Unangst:

Since s2k15 was, at least for some people, the SMP hackathon, I started my first project in that area. We currently have a few system calls that work without requiring the kernel lock because they only touch isolated parts of the data, but they aren't very exciting. getpid(), for example. I wanted to speed up a system call that may have some noticable results in a workload I use every day: compiling.

Compiling code taxes two kernel subsystems, the filesystem and the virtual memory system. The filesystem code is pretty complex, and to some extent, already serialized by the disk, so I directed my attention to UVM. Specifically, I was looking at the mmap() system call. mmap() is the function used by malloc() to allocate memory, and allocating lots of memory is something compilers are quite fond of. If we could allow multiple compiler invocations to allocate memory concurrently, that would be a great way to speed up some obvious embarassingly parallel tasks, like kernel builds.

Pulling a system call out from the big lock provides two benefits in one. First, the system call isn't blocked from making progress because another process has the lock. Second, some other process won't be blocked because this process has it. And so, for example, even though we focused only on making memory allocation faster, that will at the same time mean that file system access experiences less contention for the lock as well.

Allocating memory actually happens in two parts. First, the process calls mmap() to request an anonymous memory region. Later, when the process reads or writes to that memory, it will trigger a page fault and the kernel will allocate some newly zeroed physical memory for it, then update the CPU's page tables and continue execution. The first of these operations in particular is quite isolated. We merely need to note that a memory address is valid in the process's address space. This requires synchronization with any other potential threads in the same process, but not any other processes.

Sounds easy, and it almost was. The hard part is that mmap() is also used for mapping files into the process. Now we're back to the filesystem code again. The way the UVM map code is structed is that both cases, for anonymous mappings and file mappings, were interleaved through shared functions. There was one sys_mmap() entry point, which called one uvm_mmap() function, which called one uvm_map() function. The uvm_map() function was quite complicated, in addition, because it doesn't just handle userspace address spaces, but is also used to allocate memory in the kernel address space. Trying to make all of this code SMP safe would require adding piles of locks in ways that are difficult to test all at once, leading to a long tail of latent bugs.

Instead the approach I took was to start with uvm_mmap() and turn it into two functions, uvm_mmapanon() and uvm_mmapfile(). Then I deleted all the duplicated code from each function. Then I did the same to uvm_map() and created a special uvm_mapanon() function. Where before we started with code uvm_mmap() that looked like this:

some_function();
if (file) {
     do_file_thing();
} else {
     do_anon_thing();
}
another_function();


We ended with two functions. uvm_mmapanon() now looks like this:

some_function();
do_anon_thing();
another_function();


And uvm_mmapfile() now looks like this:

some_function();
do_file_thing();
another_function();


We're sharing less code, but that's good. It was previously impossible to tell what parts of the function were safe to call without the kernel lock, and what parts were unsafe. Now one can follow the anon mapping code path from syscall entry point all the way down to the lowest layers of UVM through to completion and verify that all the functions called have proper mutexes in place.

At about this time, I handed the diff over to Bob. He and Mark actually did the hard work of adding and removing locks. And testing and testing and testing. Hopefully they'll explain the process, but the short version is that it involved a lot of "Nope, not done yet." followed by a hard reset.

Three was about one more cook than fit in the mmap() kitchen. I turned my attention to uvm_fault() at this time, which is the second half of memory allocation. This is a bit trickier because a fault can happen at almost any time, for many reasons. It's not always even possible to tell exactly what kind of fault one is dealing with until quite a bit of processing has been done, and occasionally faults need to be restarted entirely. Nevertheless, the principle is the same. Extract the code path that probably matters most, for dealing with anonymous memory faults, and simplify and massage it until we are confident it can be run without the lock. This work depends on some other things that Mark and Bob were cooking up, meaning it's not nearly as complete.

UVM seemed safe in their capable hands, so I drifted along to some other projects. The tzcode in libc, which handles timezones, is written to accomodate just about every computer and compiler in the known universe, and as a result can be hard to read. The project has taken a rather dimmer view of such worst common denominator code recently. I spent some time deleting various unnecessary ifdefs and other workarounds from the code so that it blends in better with other, more modern, code in OpenBSD.

I was seated at the same table as Reyk and Joel, which proved helpful for some libtls discussions. We want to keep growing the API so that it is more useful for a broader set of applications, but at the same time we do not want to accumulate too much baggage. It can prove difficult to pare back or revise interfaces after they are adopted, even if they seem promising today, which required some tough calls. Does this function really need to be in the library, or can applications work around its absence?

I also spent a little time poking at the games directory. Notable improvements include the ability to search offensive fortunes and a shapelier b for banner.

(Comments are closed)


Comments
  1. By Russell (65.203.141.66) rustyl@outband.net on

    Thank you tedu. I for one will sleep much better knowing I will no longer have to deal with a unsightly b.

    But all snark aside, My faith in humanity is restored a little every time I read about bcd getting long card support or tetris converted to poll(3). Somebody still cares, even about the odder unloved corners of the system.

    So salutes, and keep fighting the good fight.

  2. By Adam P (190.65.40.204) on

    Thanks Ted! You are awesome!

  3. By Dan (50.184.58.48) danshtr@gmail.com on

    Interesting read. Thanks for the writeup.

    For the uneducated people, what is the overall goal for SMP in the next few years? and what is roughly the roadmap to get there?

Latest Articles

Credits

Copyright © - Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]