
Error Handling in C without using goto

What is error handling?

In a language like C, which has neither exception handling nor garbage collection, the programmer has to make sure that a program comes out of an error condition gracefully. On encountering an error, the program must roll back and free the resources it allocated during execution. Ideally, it should return to the consistent state it was in before the error happened.

 

Why is error handling important?

Absent or buggy error handling can leave a program, or the system as a whole, in an inconsistent state. Many serious bugs, such as memory leaks, deadlocks and data races, are the result of improper error handling. In environments where resources are scarce and the margin for error is thin, adopting a good error handling technique becomes even more important.

 

Common error handling techniques

There are two main techniques used by professional C programmers to handle errors. The first is to undo the changes and return right at the point where the error was detected. As a sample, the following C function inserts a string into a linked list if the string is not already present. It returns a pointer to the head of the linked list on success and NULL otherwise. Notice the error handling technique used in the program.

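#include <stdint.h>     // uint8_t, used by the last version of insert()
#include <stdlib.h>     // malloc, free
#include <string.h>     // memcpy, strcmp

struct lnode {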
        char *str;
        struct lnode *next;
};

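// Appends a copy of data to the list unless an equal string is already present.
// len is the number of bytes to copy and must include the terminating NUL,
// because the stored strings are compared with strcmp().
struct lnode *insert(char *data, int len, struct lnode *list) {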
        struct lnode *p, *q;

        p = (struct lnode *)malloc(sizeof(struct lnode));
        if ( NULL == p ) {
                return NULL;
        }

        p->str = (char *)malloc(sizeof(char)*len);
        if ( NULL == p->str ) {
                // free node before returning.
                free ( p );
                return NULL;
        }

        memcpy ( p->str, data, len );

        if(NULL == list) {
                p->next = NULL;
                list = p;
        } else {
                q = list;
                for (;;) {
                        // check every node, including the last one, for duplicates
                        if (0 == strcmp(q->str,p->str)) {
                                // free string and node
                                free(p->str);
                                free(p);
                                return NULL;
                        }
                        if (NULL == q->next) {
                                break;
                        }
                        q = q->next;
                }
                p->next = q->next;
                q->next = p;
        }
        return list;
}

In this case the error is handled by freeing the memory allocated so far and returning from the point of failure. It is a simple way of handling errors gracefully, but it has several drawbacks. The function looks complicated because it has multiple exit points (return statements), and cleanup code is repeated at each of them. If the error handling part grows large, the code can become unmanageable. This technique is therefore best kept to small functions.
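To see how this style scales, here is a minimal sketch with three resources instead of two. The struct record type and the record_create, record_destroy and copy_string functions are hypothetical names invented for this illustration; they are not part of the linked list example. Notice that each later failure point has to repeat every free that came before it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type, used only for this illustration. */
struct record {
        char *name;
        char *value;
};

/* Copies a NUL-terminated string into freshly allocated memory. */
static char *copy_string(const char *s)
{
        size_t n = strlen(s) + 1;       /* include the terminator */
        char *copy = malloc(n);

        if (copy != NULL) {
                memcpy(copy, s, n);
        }
        return copy;
}

struct record *record_create(const char *name, const char *value)
{
        struct record *r = malloc(sizeof(*r));

        if (NULL == r) {
                return NULL;
        }

        r->name = copy_string(name);
        if (NULL == r->name) {
                free(r);                /* undo step 1 */
                return NULL;
        }

        r->value = copy_string(value);
        if (NULL == r->value) {
                free(r->name);          /* undo step 2 */
                free(r);                /* undo step 1, written a second time */
                return NULL;
        }

        return r;
}

/* Releases a record in the reverse order of its construction. */
void record_destroy(struct record *r)
{
        if (r != NULL) {
                free(r->value);
                free(r->name);
                free(r);
        }
}
```

With n resources the last failure point carries n-1 cleanup calls, so the amount of repeated cleanup code grows quickly as the function grows.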

The second technique, which is very popular among Linux kernel developers, uses the C goto statement. On an error, control jumps to a suitable error handling block in the function, which does the necessary cleanup and returns. The same list insert function, modified to use goto-based error handling, is shown below.

struct lnode *insert(char *data, int len, struct lnode *list) {
        struct lnode *p, *q;

        p = (struct lnode *)malloc(sizeof(struct lnode));
        if ( NULL == p ) {
                // unable to allocate
                goto out;
        }

        p->str = (char *)malloc(sizeof(char)*len);
        if ( NULL == p->str ) {
                // unable to allocate
                goto out_free_p;
        }

        memcpy ( p->str, data, len );

        if(NULL == list) {
                p->next = NULL;
                list = p;
        } else {
                q = list;
                for (;;) {
                        // check every node, including the last one, for duplicates
                        if (0 == strcmp(q->str,p->str)) {
                                //duplicate entry found
                                goto out_free_str;
                        }
                        if (NULL == q->next) {
                                break;
                        }
                        q = q->next;
                }
                p->next = q->next;
                q->next = p;
        }

        // reaching this point means success
        return list;

out_free_str:
        free(p->str);
out_free_p:
        free(p);
out:
        return NULL;
}

The biggest advantage of this technique over the previous one is that all error handling code lives in a separate block with a single exit point for the error case, so each cleanup step is written only once. This makes the code quite readable as well (with the help of descriptive label names, of course). When writing an error handling block, the order in which cleanup is done must be taken into account (this is true of all error handling techniques). An incorrect order can lead to problems such as memory leaks or even deadlocks. The rule of thumb is that cleanup should happen in exactly the reverse order of acquisition in the normal flow of the program. A programmer who adds a new block of code that allocates some resource therefore has to add a corresponding error handling block in the right position. In large functions this ordering is often gotten wrong, which introduces nasty bugs.
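The reverse-order rule is easier to see with three allocations. The following is only a sketch under stated assumptions: struct record, record_create, the failure-injecting test_alloc and test_free wrappers, and leaked_blocks_when_failing are all hypothetical helpers made up for this example. The wrappers count live blocks so that each error path can be exercised on purpose.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical test helpers: count live blocks and make the
 * fail_at-th allocation fail (0 means never fail). */
static int live_blocks = 0;
static int alloc_calls = 0;
static int fail_at = 0;

static void *test_alloc(size_t size)
{
        void *p;

        alloc_calls++;
        if (alloc_calls == fail_at) {
                return NULL;            /* simulated failure */
        }
        p = malloc(size);
        if (p != NULL) {
                live_blocks++;
        }
        return p;
}

static void test_free(void *p)
{
        if (p != NULL) {
                live_blocks--;
                free(p);
        }
}

/* Hypothetical type, used only for this illustration. */
struct record {
        char *name;
        char *value;
};

struct record *record_create(const char *name, const char *value)
{
        struct record *r;
        size_t name_len = strlen(name) + 1;
        size_t value_len = strlen(value) + 1;

        r = test_alloc(sizeof(*r));
        if (NULL == r) {
                goto out;
        }

        r->name = test_alloc(name_len);
        if (NULL == r->name) {
                goto out_free_record;
        }

        r->value = test_alloc(value_len);
        if (NULL == r->value) {
                goto out_free_name;
        }

        memcpy(r->name, name, name_len);
        memcpy(r->value, value, value_len);
        return r;

        /* labels appear in the reverse order of allocation */
out_free_name:
        test_free(r->name);
out_free_record:
        test_free(r);
out:
        return NULL;
}

/* Fails the n-th allocation (0 = none) and reports leftover blocks. */
int leaked_blocks_when_failing(int n)
{
        struct record *r;

        live_blocks = 0;
        alloc_calls = 0;
        fail_at = n;
        r = record_create("key", "val");
        if (r != NULL) {
                test_free(r->value);
                test_free(r->name);
                test_free(r);
        }
        return live_blocks;
}
```

Calling leaked_blocks_when_failing(n) for n from 1 to 3 exercises every label in turn; a non-zero result would point to a missing cleanup step.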

Error handling in C without goto

It is possible to write error handling code without the goto statement while preserving the advantages of the previous techniques. Let us rewrite our sample C function using this method. To start with, let us have a local variable, say uint8_t good, to keep track of the current state of the function. Also, let us have a structure variable, say cleanup, that keeps track of all the resources allocated in the function.

        uint8_t good;
        struct {
                uint8_t alloc_node : 1;
                uint8_t alloc_str : 1;
        } cleanup = { 0, 0 };

All the fields of the cleanup structure are cleared at the beginning. Each field is set as soon as the corresponding resource is allocated. At the same time the state variable good is set or cleared according to success or failure in acquiring that resource. In our linked list example the code to allocate memory for the linked list node looks like this:

       p = (struct lnode *)malloc(sizeof(struct lnode));
       good = cleanup.alloc_node = (p != NULL);

As you can see, the variable good is set or cleared automatically depending on the return value of malloc. It thus stores the current state of the program (i.e. good or not good). The program should proceed only if the current state is good. Hence the code that allocates memory for the string should look like this:

        if (good) {
                p->str = (char *)malloc(sizeof(char)*len);
                good = cleanup.alloc_str = (p->str != NULL);
        }
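A side note on the chained assignment used in these snippets: the value that reaches good is the value stored in the one-bit field, not the value of the original expression. That is harmless here because a comparison such as (p != NULL) is always 0 or 1, but a raw count or status code would be truncated to its lowest bit. The sketch below illustrates this; struct flags, track_raw and track_normalized are hypothetical names, and unsigned int is used for the field because the C standard guarantees it as a bit-field type, while support for uint8_t bit-fields is left to the implementation.

```c
/* One-bit flag, similar to a field of the cleanup structure. */
struct flags {
        unsigned int acquired : 1;
};

/* Stores a raw count in the one-bit field: only the lowest bit survives,
 * and that truncated value is what good receives. */
int track_raw(int count)
{
        struct flags f = { 0 };
        int good;

        good = f.acquired = count;
        return good;
}

/* Normalises to 0 or 1 first, as the comparison (p != NULL) does. */
int track_normalized(int count)
{
        struct flags f = { 0 };
        int good;

        good = f.acquired = (count != 0);
        return good;
}
```

So when a step reports success as something other than 0 or 1, it is safer to turn it into a comparison before assigning it to the flag.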

At the end, just before returning from the function, there has to be an error handling block that cleans up if the program is not in a good state.

        if(!good) {
                if(cleanup.alloc_str)   free(p->str);
                if(cleanup.alloc_node)  free(p);
        }

        // good? return list or else return NULL
        return (good? list: NULL);

The order in which cleanup is done in the error handling block is still important. This technique has one advantage, though: a programmer can verify the order of resource allocation by looking at the fields of the cleanup structure (provided they are declared in the correct order). The final rewritten code looks like this:

struct lnode *insert(char *data, int len, struct lnode *list) {
        struct lnode *p, *q;
        uint8_t good;
        struct {
                uint8_t alloc_node : 1;
                uint8_t alloc_str : 1;
        } cleanup = { 0, 0 };

        // allocate node.
        p = (struct lnode *)malloc(sizeof(struct lnode));
        good = cleanup.alloc_node = (p != NULL);

        // good? then allocate str
        if (good) {
                p->str = (char *)malloc(sizeof(char)*len);
                good = cleanup.alloc_str = (p->str != NULL);
        }

        // good? copy data
        if(good) {
                memcpy ( p->str, data, len );
        }

        // still good? insert in list
        if(good) {
                if(NULL == list) {
                        p->next = NULL;
                        list = p;
                } else {
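                        q = list;
                        // duplicate found--not good (the head is checked first)
                        good = (strcmp(q->str,p->str) != 0);
                        while(good && q->next != NULL) {
                                // move on and check the next node, up to the last one
                                q = q->next;
                                good = (strcmp(q->str,p->str) != 0);
                        }
                        if (good) {
                                p->next = q->next;
                                q->next = p;
                        }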
                }
        }

        // not-good? cleanup.
        if(!good) {
                if(cleanup.alloc_str)   free(p->str);
                if(cleanup.alloc_node)  free(p);
        }

        // good? return list or else return NULL
        return (good? list: NULL);
}
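The same pattern carries over to functions with more resources. As a sketch only, here it is applied to a function with three allocations and a late validation step that stands in for the duplicate check. Everything in it is an assumption made for illustration: struct record, record_create, the failure-injecting test_alloc and test_free wrappers, and leaked_blocks are hypothetical names, and the flags use unsigned int bit-fields instead of uint8_t.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical test helpers: count live blocks and make the
 * fail_at-th allocation fail (0 means never fail). */
static int live_blocks = 0;
static int alloc_calls = 0;
static int fail_at = 0;

static void *test_alloc(size_t size)
{
        void *p;

        alloc_calls++;
        if (alloc_calls == fail_at) {
                return NULL;            /* simulated failure */
        }
        p = malloc(size);
        if (p != NULL) {
                live_blocks++;
        }
        return p;
}

static void test_free(void *p)
{
        if (p != NULL) {
                live_blocks--;
                free(p);
        }
}

/* Hypothetical type, used only for this illustration. */
struct record {
        char *name;
        char *value;
};

struct record *record_create(const char *name, const char *value)
{
        struct record *r;
        size_t name_len = strlen(name) + 1;
        size_t value_len = strlen(value) + 1;
        int good;
        struct {
                unsigned int alloc_record : 1;
                unsigned int alloc_name : 1;
                unsigned int alloc_value : 1;
        } cleanup = { 0, 0, 0 };

        r = test_alloc(sizeof(*r));
        good = cleanup.alloc_record = (r != NULL);

        if (good) {
                r->name = test_alloc(name_len);
                good = cleanup.alloc_name = (r->name != NULL);
        }
        if (good) {
                r->value = test_alloc(value_len);
                good = cleanup.alloc_value = (r->value != NULL);
        }
        /* late check, standing in for the duplicate test: no empty names */
        if (good) {
                good = (name_len > 1);
        }
        if (good) {
                memcpy(r->name, name, name_len);
                memcpy(r->value, value, value_len);
        }

        /* not good? release in the reverse order of the cleanup fields */
        if (!good) {
                if (cleanup.alloc_value)  test_free(r->value);
                if (cleanup.alloc_name)   test_free(r->name);
                if (cleanup.alloc_record) test_free(r);
        }
        return (good ? r : NULL);
}

/* Fails the n-th allocation (0 = none) and reports leftover blocks. */
int leaked_blocks(int n, const char *name)
{
        struct record *r;

        live_blocks = 0;
        alloc_calls = 0;
        fail_at = n;
        r = record_create(name, "val");
        if (r != NULL) {
                test_free(r->value);
                test_free(r->name);
                test_free(r);
        }
        return live_blocks;
}
```

Adding a fourth resource means adding one field, one guarded step and one line in the cleanup block, each in the obvious position. Calling leaked_blocks with n from 1 to 3, or with an empty name, walks through every failure path.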

 

Conclusion

To summarize, the error handling technique described above has several advantages. The important ones are:

  • It eliminates the goto statement from the program.
  • Determining order of cleanup becomes easier.
  • There is only one exit point in the function, even if there is an error.
  • It makes the code more readable and intuitive.

Categorised as: technology


2 Comments

  1. Dmitry says:

    There is another way to avoid gotos in construction/rollback code.
    We created a set of macros that hides the details of initialization failure handling and rollbacks and allows writing straightforward code like:

    CONSTRUCT_STEP(step1, init_step_func1(…))
    CONSTRUCT_STEP(step1, init_step_func2(…))
    CONSTRUCT_STEP(step1, init_step_func3(…))
    .
    .
    .
    ON_FAILURE(destruct_func())

    The macros track which steps succeeded and perform proper rollback without touching uninitialized parts.
    They also catch forgotten cleanups and improper order.
    Moreover, they allow simulating failure of each step to verify rollback correctness.
    There are a lot of macros, but the code is very readable and error-proof.

    Check CStart library (free and open source):
    http://www.daynix.com/index.php?option=com_content&view=article&id=62&Itemid=71
    https://github.com/daynix/bricklets/wiki/CSteps-bricklet-documentation

  2. Keeping code style #2 from rotting can be automated fairly easily using the tooling infrastructure of the upcoming Clang release. I’ve blogged about it at
    http://blog.zoom.nu/2012/09/goto-checker.html
