SIGBUS versus SIGSEGV – according to siginfo.h(3head)

Having asked a number of colleagues, I failed to find a consistent answer on the differences between SIGBUS and SIGSEGV. According to the Solaris signal(3head) man page, we have:

Name             Value   Default     Event
...
SIGBUS           10      Core        Bus Error
SIGSEGV          11      Core        Segmentation Fault

So I dug a bit further and found that siginfo_t can tell you more about the origin of the signal. In particular, the siginfo.h(3head) man page gives:

Signal         Code                 Reason
_________________________________________________________________________
...
_________________________________________________________________________
SIGSEGV        SEGV_MAPERR          address not mapped to object
               SEGV_ACCERR          invalid permissions for mapped object
_________________________________________________________________________
SIGBUS         BUS_ADRALN           invalid address alignment
               BUS_ADRERR           non-existent physical address
               BUS_OBJERR           object specific hardware error
_________________________________________________________________________

Obviously this may be open to interpretation but that clarifies a few things for me.

For the techies, take a look at the OpenSolaris source code for the trap() function, which handles various types of trap, including page faults. For example, there’s a section that decides whether to deliver SIGBUS or SIGSEGV:

case T_WIN_OVERFLOW + T_USER:    /* window overflow in ??? */
case T_WIN_UNDERFLOW + T_USER:   /* window underflow in ??? */
case T_SYS_RTT_PAGE + T_USER:    /* window underflow in user_rtt */
case T_INSTR_MMU_MISS + T_USER:  /* user instruction mmu miss */
case T_DATA_MMU_MISS + T_USER:   /* user data mmu miss */
case T_DATA_PROT + T_USER:       /* user data protection fault */
        switch (type) {
        ...
        /*
         * In the case where both pagefault and grow fail,
         * set the code to the value provided by pagefault.
         */
        (void) instr_size(rp, &addr, rw);
        bzero(&siginfo, sizeof (siginfo));
        siginfo.si_addr = addr;
        if (FC_CODE(res) == FC_OBJERR) {
                siginfo.si_errno = FC_ERRNO(res);
                if (siginfo.si_errno != EINTR) {
                        siginfo.si_signo = SIGBUS;
                        siginfo.si_code = BUS_OBJERR;
                        fault = FLTACCESS;
                }
        } else { /* FC_NOMAP || FC_PROT */
                siginfo.si_signo = SIGSEGV;
                siginfo.si_code = (res == FC_NOMAP) ?
                    SEGV_MAPERR : SEGV_ACCERR;
                fault = FLTBOUNDS;
        }

I was digging around this following a discussion of bug 6466257 (mmap file writing fails on nfs3 client with EMC nas device) and the signals delivered when accessing memory mapped with mmap(2). The man page suggests that either SIGBUS or SIGSEGV can be delivered for a number of error conditions but doesn’t seem sure which. The answer is, “it depends”.

So my conclusion is sadly another question – can an application developer infer anything from a SIGBUS versus a SIGSEGV? The answer, I believe, is yes – but quite often the result is the same, which is to fix the code 🙂


3 Comments

  1. Kent Wilson

     /  December 8, 2006

    Based on experience from many years ago (and not having access to source code and the time), I learned that SIGSEGV tended to mean that you either dereferenced a NULL pointer, which would mean you were trying to access a non-existent segment, or that you generated and were trying to use an address which was pointing into the “text” segment. In contrast, SIGBUS basically meant that you were trying to use an address which was illegal (i.e. outside the ability of the machine to address).

  2. alex

     /  March 17, 2007

    In my personal opinion, this is very helpful.
    Guys, I have also posted some more relevant info on this, not sure if you will find it useful:
    http://www.bidmaxhost.com/forum/

  3. Peter Harvey

     /  March 30, 2007

    On a recent SGR course I was teaching I was asked by one of the participants about this entry. IIRC he was questioning whether a SIGBUS could ever be generated by a programming problem.
    The short answer is yes.
    Check this code, which has a misaligned pointer dereference:

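    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int testvar = 0x12345678;
        int *testvarp;

        testvarp = &testvar;
        /* Cast explicitly so each argument matches its format specifier. */
        printf("testvarp was %lx\n", (unsigned long)testvarp);
        printf("testvar is %x\n", (unsigned int)*testvarp);

        /* Advance the pointer by one byte so it is no longer int-aligned. */
        testvarp = (int *)(((char *)testvarp) + 1);
        printf("testvarp is %lx\n", (unsigned long)testvarp);
        /* Misaligned load: this is the access that can raise SIGBUS. */
        printf("testvar is %x\n", (unsigned int)*testvarp);
        return (0);
    }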
    

    Compiling for SPARC v8 and for v9 gives different results:

    $ cc -o sigbus-demo sigbus-demo.c
    $ ./sigbus-demo
    testvarp was ffbfebb4
    testvar is 12345678
    testvarp is ffbfebb5
    testvar is 34567800
    $ cc -xarch=v9 -o sigbus-demo sigbus-demo.c
    $ ./sigbus-demo
    testvarp was ffffffff7fffe98c
    testvar is 12345678
    testvarp is ffffffff7fffe98d
    zsh: bus error (core dumped)  ./sigbus-demo
    $
    

    So, a programming problem can cause a SIGBUS. How this all works in practice is an exercise I’m happy (for now) to leave to the reader.

