OpenBSD Journal

clang -fret-clean: cleaning return addresses off stack (by deraadt@)

Contributed by Peter N. M. Hansteen from the Puffy cleans your stack dept.

Future versions of OpenBSD may include core system libraries and binaries built with logic to clear return addresses from the stack once they are no longer needed. With this in place, whole classes of bugs would be harder to exploit.

In a message to the tech@ mailing list titled clang -fret-clean: cleaning return addresses off stack, Theo de Raadt (deraadt@) explains how this would work and includes code to implement the feature for the X86 architecture only:

List:       openbsd-tech
Subject:    clang -fret-clean: cleaning return addresses off stack
From:       "Theo de Raadt" <deraadt () openbsd ! org>
Date:       2024-05-25 6:18:59

There are many address space mitigations in play now which make standard
control-flow methods and ROP-style methods more difficult than ever before.
None of them are a silver bullet; added up they are a big deal, but no one
is saying they are a comprehensive solution.

One thing I've worried about for a while is that program bugs being
exercised tend to happen in the main program, or in some large library.
But many types of attack methodology require reaching system calls via
libc, in as direct and simple fashion as possible.  ASLR location of
libc has made that a bit harder, boot-time random relinking of libc
makes it even more difficult.  But there's a few things which do hint at
where libc is mapped.
The GOT/PLT point at functions in it, and indeed specific libc methods are
reached via them.  The introduction of Xonly (now on most of our architectures)
has made that a bit harder to play with.  But anyways... GOT/PLT changes is
not where I'm going in this mail.

The other thing is that the dead-part of the stack contains specific
addresses to libc code.  Imagine if just before the bug being tickled,
previous code has done a fairly deep and local-variable heavy
call-stack, which made it into some libc interface -- it could be a
system call, or something higher level like stdio.  Upon return closer
to the bug, the dead stack will contain return addresses (the address
after those "callq" inside libc).  An attacker can know the absolute
offset relative to the stack where these values are, and in their early
attack code load those values, and find a way to use them.

Random relinking of libc, and a historical pattern of turning libc into
tons of little .o files, means the attacker can't necessarily call
other functions, but they can play with functions in the call graph.
They can't read the relevant callq instruction to inspect the called
address, because the text segment isn't readable (hello Xonly), so they
can't directly call what was being called.

So there are two possible outcomes.

1) they have something like printf -> fprintf -> __vfprintf, and hooks to
the actual output functions, a bit difficult to play with, but it's "not nothing".

2) or imagine they have the actual libc.so address space known (because this
is not a remote attack), they now know the offsets of all the system call stubs,
relative to this callq return.

So here's a demo / proposal, for x86 only at first.  This is a new
-fret-clean option in clang, which is used to compile the following:

   libc, libcrypto, ld.so, all the ssh binaries, and the kernel

It is a compiler pass that changes the following:

   callq	 something

into

   callq	 something
   movq		 $0, -8(%rsp)

The callq instruction puts the location of the movq instruction onto the
stack for returning to.  The change is that now, upon return to caller,
the caller itself cleans that value off the stack.  If I got it right :)

A review of the stack after this shows substantial sanitization.  This
doesn't fix all pointers you can see on the stack, just this one type of
'pointer'.  Local variable pointers inside call frames will remain, but
that's harder for an attacker to exercise.

I'm not proposing that we compile many main programs with this.  The retval
data hiding being proposed is most useful near the syscall boundary in libc.so
(and ld.so).  In dynamic libc.so, pinsyscalls(2) prevents one syscall stub
from being a chimera for other operations, but you will still find all the
stubs in libc.so lying around in some random place.

I haven't written a version for any other architectures yet, but I have
a vague idea of what to do.  Some of them use link-register calling
convention, and the retval cleaning will be in the callee, not the
caller.  That's a little bit better because it also hides the retval one
frame higher.

Anyways, food for thought.
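
As an illustration that is not part of the message: the second outcome described above comes down to plain pointer arithmetic. The following is a minimal sketch in C, assuming two made-up offsets (RET_SITE_OFFSET and STUB_OFFSET are hypothetical and do not come from any real libc.so), of how one return address recovered from the dead stack, combined with local knowledge of the library layout, would yield the address of a system call stub:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offsets inside one particular libc.so image. A local
 * attacker would read the real values from their own copy of the
 * library; these numbers are invented for illustration only. */
#define RET_SITE_OFFSET UINT64_C(0x4a2b7)  /* address right after some callq */
#define STUB_OFFSET     UINT64_C(0x91c40)  /* start of some system call stub */

/* Given a return address recovered from the dead stack, compute where
 * the stub is mapped in this process. */
static uint64_t
stub_from_leak(uint64_t leaked_ret)
{
    uint64_t libc_base;

    /* The leaked value must lie at or above the call site offset. */
    assert(leaked_ret >= RET_SITE_OFFSET);
    libc_base = leaked_ret - RET_SITE_OFFSET;
    return libc_base + STUB_OFFSET;
}
```

Whatever base address the library was mapped at, one leaked return address is enough to cancel it out. That is why zeroing these values matters even when ASLR and random relinking are in effect.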

The message also includes the actual patch against -current code to implement the functionality, on X86 only for now.
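
To make the effect of the callq/movq pair concrete, here is a toy model in C. It is a sketch under stated assumptions, not the compiler pass from the patch: the struct toy machine, its 16-slot stack and the addresses 0x1111 and 0x2222 are all hypothetical. It emulates a call that pushes a return address, a return that pops it, and the optional store that wipes the slot just below the stack pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 16

/* Toy machine: a downward-growing stack of 64-bit slots. Slots at
 * indexes below sp are "dead" (popped but possibly still holding data). */
struct toy {
    uint64_t stack[SLOTS];
    int sp;         /* index of the live top of stack; SLOTS means empty */
    int ret_clean;  /* nonzero: call sites behave as if built with -fret-clean */
};

static void
toy_init(struct toy *t, int ret_clean)
{
    memset(t->stack, 0, sizeof(t->stack));
    t->sp = SLOTS;
    t->ret_clean = ret_clean;
}

/* Emulates one call site: "callq" pushes ret_addr, the callee runs,
 * "retq" pops it. The popped value stays in memory unless the call
 * site wipes it afterwards. */
static void
toy_call(struct toy *t, uint64_t ret_addr, void (*callee)(struct toy *))
{
    assert(t->sp > 0);
    t->stack[--t->sp] = ret_addr;   /* callq: push return address */
    if (callee != NULL)
        callee(t);
    t->sp++;                        /* retq: pop, memory left untouched */
    if (t->ret_clean)
        t->stack[t->sp - 1] = 0;    /* movq $0, -8(%rsp) */
}

/* A callee that makes one nested call of its own. */
static void
toy_nested(struct toy *t)
{
    toy_call(t, 0x2222, NULL);
}

/* Count dead slots that still hold a nonzero value. */
static int
toy_stale(const struct toy *t)
{
    int i, n = 0;

    for (i = 0; i < t->sp; i++)
        if (t->stack[i] != 0)
            n++;
    return n;
}
```

Running toy_call(&t, 0x1111, toy_nested) on a machine initialised with ret_clean set to 0 leaves both return addresses in dead slots. With ret_clean set to 1, toy_stale() reports none. The model deliberately ignores local variables, which, as the message notes, the real option does not touch either.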

It is worth noting that it's early days yet, and code for other architectures is not yet available. But this points ahead to what could happen to make bugs in programs running on our favorite operating system even harder to exploit.

