Contributed by dlg on from the newbs-and-compilers-dont-mix dept.
Every now and then I am totally surprised by an unknown feature or caveat in something I thought I knew everything about. I recently had one of those moments with the printf family of functions when I was trying to get marco's iogen running on linux.
Everyone who programs in C knows about printf. It is the first function anyone ever uses. How else are you supposed to get "Hello World!" onto the screen (and no, please don't tell me about write or puts or such)? I've spent a lot of time in its manpage as well, trying to figure out how to align crazy values and print bitmasks out properly, so I was fairly confident that I could use it without issue. However, it seems I suck.
There are three things that have to be fixed in iogen to get it to run on linux. The first two are functions that are on obsd, but not in linux. The third thing is a little more subtle and is what caused me to scratch my head. With the function issues fixed, iogen was segfaulting on the following chunk:
err_log(0, "file size: %llu io size: %llu read percentage: %i random: %s "
    "target: %s result: %s update interval: %i",
    file_size, io_size, read_perc, randomize ? "yes" : "no",
    target_dir, result_dir, interval);
err_log is basically a wrapper around a printf function, so it deals with arguments the same way. After spending some time experimenting with the arguments and commenting bits out, I found that the file_size and io_size arguments and their format specifiers were causing the problem. These two variables are of type off_t, and according to the format string they are supposed to be unsigned long long values. On OpenBSD this is almost true: if you go poking around in the headers you'll discover that off_t is a typedef of long long, which is the right width (64 bits) even though it is signed rather than unsigned. On i386 Linux, however, off_t is a long int by default and only 32 bits wide. The problem with the chunk above is the mismatch between the format string and the size and type of the arguments that are supposed to correspond to it.
If you tell printf that you're going to print a long long value, it will take a long long (64-bit) sized chunk off the stack (on i386 anyway) and try to print it. This is a problem if you've only put a long (32-bit) sized variable there. In the best case there will be zeros next to it on the stack and your number will print fine, but this is very unlikely. If you're lucky you'll just get a garbage value printed out, but in the worst case (as I experienced with iogen) every later argument is read from the wrong place, a %s ends up with a bogus pointer, and you get a segfault. The following demonstrates:
$ cat test.c
#include <stdio.h>

int
main(int argc, char *argv[])
{
	int i = 1, j = 2;

	printf("%lld %lld\n", i, j);
	return (0);
}
$ make test
cc test.c -o test
$ ./test
8589934593 9664997444
$
As you can see we're not getting what we expect. Fortunately, there is a way to do this properly: know your types!
Most of the time you know the types of the variables you want to print, and can therefore match the format string to them. Sometimes you are unsure or unable (or too lazy) to check what is really behind a variable's type. In other cases you get it right on one platform only to have it blow up when you move to another operating system (e.g., off_t). In those cases you should proactively cast the argument to the type your format string expects. For example, assume we aren't sure what type an int is:
$ cat test2.c
#include <stdio.h>

int
main(int argc, char *argv[])
{
	int i = 1, j = 2;

	printf("%lld %lld\n", (long long)i, (long long)j);
	return (0);
}
$ make test2
cc test2.c -o test2
$ ./test2
1 2
$
Such a change would help fix iogen on linux.
I always assumed that printf coerced its arguments to the appropriate types when you passed them, like normal functions with prototypes do. But no: variadic arguments only get the default promotions, and printf figures out what sized chunk of memory it wants to read based purely on the format string. This totally blew my mind when it was explained to me.
Of course, if I had been compiling with CFLAGS=-Wformat I would have got warnings about this problem in iogen and discovered it earlier.
The moral of my story: know your types, and cast the ones you don't know so that your arguments match your format string.
By Anonymous Coward (80.65.225.229) on
Or is your awesome reworked lint intended to be portable (ported?) to other platforms?