Hidden features of libumem – firewalls and protecting against overflows

Jonathan Adams’s excellent libumem(3LIB), as discussed by Adam Leventhal, is superb at finding user-land memory leaks. One thing I thought was missing, though, was immediately trapping accesses outside of the allocated memory.

An example of this feature is “Electric Fence”, a Linux-based memory allocator, which creates an inaccessible page above (or below) the allocated memory.

We already have this feature in watchmalloc(3MALLOC), where freed memory and the headers are protected using watchpoints. The problem with watchpoints is the performance impact, as every memory access needs checking, though page-sized memory protection limits the impact somewhat.

To my surprise, libumem does have this feature though it’s currently undocumented. There are two things that need setting in your environment:

  • UMEM_OPTIONS
    • backend=mmap
  • UMEM_DEBUG
    • firewall=1

The first option sets the backend memory allocator to use anonymous memory and mmap(2) rather than the usual heap space and sbrk(2). When libumem uses mmap, it makes sure that freed allocations and the pages following an allocation are protected.

The second option sets the minimum firewall size to 1. The main effect is to set the UMF_FIREWALL flag internally which ensures that allocations are always at the end of a page. The size, technically, is used to enforce firewalling above a certain size, but AFAICT from the code it doesn’t really matter what this is set to as merely setting any size turns the UMF_FIREWALL flag on.
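To make the idea concrete, here is a minimal sketch of the general guard-page technique: map some pages, make the last one inaccessible, and hand out a buffer that ends right against it. This is not libumem's code. The names guarded_alloc() and write_faults() and the 8-byte CHUNK are my own assumptions for illustration, and the sketch never frees anything.

```c
#define	_DEFAULT_SOURCE	1	/* for MAP_ANON on glibc; harmless elsewhere */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define	CHUNK	8	/* assumed minimum allocation chunk, in bytes */

static sigjmp_buf probe_env;

static void
on_fault(int sig)
{
	(void) sig;
	siglongjmp(probe_env, 1);
}

/*
 * Hypothetical helper: return a len-byte buffer whose rounded-up end
 * sits directly against a PROT_NONE page. Never freed in this sketch.
 */
char *
guarded_alloc(size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t rounded = (len + CHUNK - 1) & ~(size_t)(CHUNK - 1);
	size_t data = (rounded + page - 1) & ~(page - 1);
	char *base;

	assert(len > 0);
	/* Map the data pages plus one extra page to act as the firewall */
	base = mmap(NULL, data + page, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANON, -1, 0);
	if (base == MAP_FAILED)
		return (NULL);
	if (mprotect(base + data, page, PROT_NONE) != 0) {
		(void) munmap(base, data + page);
		return (NULL);
	}
	return (base + data - rounded);
}

/* Hypothetical probe: return 1 if writing one byte at p traps, else 0 */
int
write_faults(volatile char *p)
{
	struct sigaction sa, old_segv, old_bus;
	int faulted;

	sa.sa_handler = on_fault;
	(void) sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	(void) sigaction(SIGSEGV, &sa, &old_segv);
	(void) sigaction(SIGBUS, &sa, &old_bus);
	if (sigsetjmp(probe_env, 1) == 0) {
		*p = 0;		/* traps if p lies on the protected page */
		faulted = 0;
	} else {
		faulted = 1;	/* we arrived here from on_fault() */
	}
	(void) sigaction(SIGSEGV, &old_segv, NULL);
	(void) sigaction(SIGBUS, &old_bus, NULL);
	return (faulted);
}
```

With a buffer from guarded_alloc(9), write_faults(buf + 15) should return 0 and write_faults(buf + 16) should return 1, because the 9-byte request is rounded up to 16 bytes before being placed at the end of the page.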

The only documentation for this is to run strings(1) on the library. I kid you not:

$ strings /usr/lib/libumem.so
...
-- UMEM_OPTIONS --
backend
Evolving
=sbrk for sbrk(2), =mmap for mmap(2)
...
-- end of UMEM_OPTIONS --
-- UMEM_DEBUG --
...
firewall
Private
=minbytes.  Every object >= minbytes in size will have its end against
an unmapped page
...
$

By way of example, here’s some simple code to test this:

/*
 * Copyright  Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/*
 * Very simple program to write beyond the end of a malloc()
 * area. Used to test 'firewall' features of memory allocators such as
 * libumem(3LIB).
 */

#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
	char *buf;	/* Buffer to overflow */
	int over = 1;	/* How far to overflow (optional 1st arg) */
	int len = 9;	/* How large an array (optional 2nd arg) */
	int i;		/* Used to index into the buffer */

	if (argc >= 2) {
		over = atoi(argv[1]);
	}
	if (argc >= 3) {
		len = atoi(argv[2]);
	}

	(void) printf("Calling malloc for %d bytes\n", len);
	buf = malloc(len);
	(void) printf("\tbuf pointer is %lx\n", (unsigned long)buf);

	(void) printf("Overwriting single bytes\n");
	for (i = len - 1; i < (len + over); i++) {
		(void) printf("\tbuf[%d]\t%lx\n", i, (unsigned long)&buf[i]);
		buf[i] = 0;
	}

	(void) printf("Completed without error\n");
	return (0);
}

This takes two optional arguments: the number of bytes to overflow (default 1) and the size of the allocation (default 9). Running it with normal libumem doesn’t spot the brokenness:

$ LD_PRELOAD=libumem.so UMEM_DEBUG=default ./malloc-overflow 8
Calling malloc for 9 bytes
buf pointer is 45fc8
Overwriting single bytes
buf[8]  45fd0
buf[9]  45fd1
buf[10] 45fd2
buf[11] 45fd3
buf[12] 45fd4
buf[13] 45fd5
buf[14] 45fd6
buf[15] 45fd7
buf[16] 45fd8
Completed without error
$

So now we use our newly found firewalling feature:

$ LD_PRELOAD=libumem.so UMEM_OPTIONS=backend=mmap \
UMEM_DEBUG=default,firewall=1 ./malloc-overflow 8
Calling malloc for 9 bytes
buf pointer is ff1e9ff0
Overwriting single bytes
buf[8]  ff1e9ff8
buf[9]  ff1e9ff9
buf[10] ff1e9ffa
buf[11] ff1e9ffb
buf[12] ff1e9ffc
buf[13] ff1e9ffd
buf[14] ff1e9ffe
buf[15] ff1e9fff
buf[16] ff1ea000
Segmentation Fault(coredump)
$

Neat. Just to show that watchmalloc can do it too:

$ LD_PRELOAD=watchmalloc.so.1 MALLOC_DEBUG=WATCH,RW ./malloc-overflow 8
Calling malloc for 9 bytes
buf pointer is 20bd8
Overwriting single bytes
buf[8]  20be0
buf[9]  20be1
buf[10] 20be2
buf[11] 20be3
buf[12] 20be4
buf[13] 20be5
buf[14] 20be6
buf[15] 20be7
buf[16] 20be8
Trace/Breakpoint Trap(coredump)
$

If you’ve been paying attention, you’ll notice that you have to go a number of bytes beyond the allocation before it’s caught. This is because memory allocators tend to hand out memory in minimum-sized chunks (8 bytes in the above examples), so the 9-byte request is rounded up to 16 bytes and the first access to be trapped is buf[16].
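The rounding can be worked through in a few lines of C. This is only a sketch of the arithmetic: chunk_round_up() and silent_slack() are names I made up for illustration, and the 8-byte chunk is taken from the output above rather than from any allocator's source.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of chunk (chunk must be a power of two) */
size_t
chunk_round_up(size_t len, size_t chunk)
{
	assert(chunk != 0 && (chunk & (chunk - 1)) == 0);
	return ((len + chunk - 1) & ~(chunk - 1));
}

/*
 * Number of bytes past the end of a len-byte request that can be
 * overwritten before the first byte of the protected page is reached.
 */
size_t
silent_slack(size_t len, size_t chunk)
{
	return (chunk_round_up(len, chunk) - len);
}
```

For the default 9-byte request, chunk_round_up(9, 8) is 16 and silent_slack(9, 8) is 7: buf[9] through buf[15] go unnoticed, and buf[16] is the first byte to land on the protected page, which matches the addresses in the firewall run above (ff1e9ff0 plus 16 is ff1ea000).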

My colleague Tim Uglow has been tinkering with his own allocator, which can even catch single-byte overflows. He uses the mmap trick but then uses watchpoints to catch the few bytes at the end. What’s even more subtle is that certain internal functions, like strlen(), have some safe optimisations which read beyond the end of the allocated area, knowing that it’s at least (say) 8 bytes. Tim sets a write watchpoint precisely at the end of the allocated area but sets the read watchpoint at the next alignment. But then Tim is very clever 🙂

2 Comments

  1. Anonymous

     /  March 21, 2006

    I simply don’t get this program:
    for (i = len - 1; i < (len + over); i++) {
    (void) printf("\tbuf[%d]\t%lx\n", i, (unsigned long)&buf[i]);
    buf[i] = 0;
    }
    With i = len - 1 = 8 and a maximum of len + over = 9, how does it print buf[8] through buf[16] for you?
    From the program above, it looks like it should only print buf[8]. What am I missing?

  2. Check how I invoked it …

    $ LD_PRELOAD=libumem.so UMEM_DEBUG=default ./malloc-overflow 8
    

    Notice the 8 on the end? I wrote the program to take optional arguments.
