I use this blog as a soap box to preach (ahem... to talk :-) about subjects that interest me.

Friday, December 24, 2010

A programmer's Taboo

This post is about GOTOs.


Machine and assembly languages have jump instructions. No CPU could work without them. When the first high-level languages were developed, it was obvious that there would be an instruction to transfer control from one point to another. That’s why all procedural languages had it: Fortran, Cobol, Basic, C...

Then, with the advent first of structured programming and then of object-oriented programming, GOTOs came into disrepute and eventually vanished altogether. Most notably, Java doesn’t have them.

Even in languages that still have GOTOs, like C and C++, programmers consider them taboo. The teachers of programming courses mark down assignments containing GOTOs and programmers are appalled when one of their colleagues uses them.

And yet, by rejecting the irrational reactions due to cultural brainwashing, one must come to the conclusion that there are situations where GOTOs make for more readable and more maintainable code. Look for example at the following piece of code in C:

if (condition1) {
       whatever1
        if (condition2) {
               whatever2
               if (condition3) {
                      whatever3
                      if (condition4) {
                              whatever4
                              if (condition5) {
                                      whatever5
                                      if (condition6) {
                                              whatever6
                                              if (condition7) {
                                                      whatever7
                                                      }
                                              else {
                                                      rest-of-the-program
                                                      }
                                              }
                                      }
                              }
                      }
               }
       }

This occurs when you need to ensure that a series of conditions is satisfied before performing a certain operation. If you want to be able to print your code in portrait mode, you are left with little space for your actual program, squeezed against the right margin of the page. I actually limit my statements to a maximum length of 80 characters. Perhaps it is because that was the number of characters that could fit on a punched card, but that doesn’t really matter in this context.

Now compare the code above with the same kind of logic written with GOTOs, where each test checks for a condition that makes you bail out:

if (condition1) {
         whatever1
         goto get_out;                                                    //==>
         }
if (condition2) {
         whatever2
         goto get_out;                                                    //==>
         }
if (condition3) {
         whatever3
         goto get_out;                                                    //==>
         }
if (condition4) {
         whatever4
         goto get_out;                                                    //==>
         }
if (condition5) {
         whatever5
         goto get_out;                                                    //==>
         }
if (condition6) {
         whatever6
         goto get_out;                                                    //==>
         }
if (condition7) {
         whatever7
         goto get_out;                                                    //==>
         }

rest-of-the-program

get_out:
return-here

It clearly is a better piece of code. At least, that’s what I think, and that’s what IMNSHO everybody capable of thinking with hir own head should think. Where are the ‘dangers’ of using GOTOs? The little comments at the end of each line with a GOTO statement clearly identify the jumps and remove any risk of overlooking them.

If you are scared by GOTOs but would like to be scared more, keep reading. I will introduce you to the dark world of long jumps.

C allows you to save the environment in a location within your program and return to it from anywhere else within the same program ignoring normal function calls and returns. You understood correctly: choose a statement and then jump to it from wherever you are within your program. Don’t worry if you feel that your hair is beginning to rise at the back of your neck. It’s a natural reaction. This is one of the few really scary things I have encountered, the other one that comes to mind being self-modifying code.

This is how you set up the long jump:

#include <setjmp.h>
jmp_buf env;
...
if (setjmp(env) == 0) {
       // get here on direct call
       }
else {
       // get here by calling longjmp
       }

Then, from somewhere else in the program, you can do the following:

#include <setjmp.h>
extern jmp_buf env;
...
int retval = 3;
longjmp(env, retval);
...

When you execute longjmp(), the program environment saved by setjmp() is restored and execution resumes as if setjmp() had just returned, this time with the value 3 instead of the 0 it returned when you called it directly. Obviously, by passing different values to longjmp(), you can ‘remember’ where the long jump is coming from. The one value you cannot use is 0: if you pass it, setjmp() returns 1 instead.

Nothing prevents you from saving several contexts with different names and jumping to them from all sorts of places, provided the function that saved each context has not yet returned, but a criss-crossing of long jumps wouldn’t result in an easily maintainable program.

And yet, in my decades of programming experience, I encountered one situation where I would have been in trouble without long jumps. It was in the early ’90s, when I was developing MacDOS, a program that provided a DOS-like command line interface for the Macintosh. setjmp/longjmp was the only way I found to implement a control-C mechanism.

Imagine: wherever you are within the program, if the user presses Ctrl-C, control must hop out of all nested functions and return to the top-level prompt. This is an intrinsically unstructured situation that requires a long jump. No way around it. Without the mechanism provided by C, I would have had to create my own long jump in assembly language and then embed it into C (incidentally, this is another thing I find really scary).

The moral of all this (yes, there is a moral) is that a savvy programmer should not be scared by some constructs to the point of dismissing them out of hand. Happy programming!

2020-04-02: I had included a link to a description of my MacDOS project that included the full source code, but a few years ago I closed that web site. I intend to extend my current web site zambon.com.au to include my programming, but I haven't done it yet. In the meantime, here is the piece of code within MacDOS.c where I set up the long jump:
    if (setjmp(env_longjmp) != 0) {

      //
      // We have returned here from a user break within io_fprintf.
      // io_fprintf has already reported the error.
      //
      err = DOSERR_NO;
      io_pageOpt = false;
      //
      // close the files possibly opened by the aborted command
      //
      if (nFreeFiles > 0) io_closeList(nFreeFiles, freeFiles);
      goto FREE_CHECK_LBL;                                  // 2-->
      }
    env_longjmpSet = true;
    }

Notice that I closed all the files that might have been open when Ctrl-C was pressed!
Other cleanup operations were done after FREE_CHECK_LBL.
