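I have always been under the impression that you should never use goto if you can avoid it.

However, while reading through libavcodec (which is written in C) the other day, I was surprised to see it used in several places.

Is there ever an advantage to using goto in a language that supports loops and functions? If so, why? Please provide a concrete example that clearly justifies the use of a goto.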


Current answer

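I have written quite a bit of assembly language over the years. Ultimately, every high-level language compiles down to gotos. Okay, call them "branches" or "jumps" or whatever else, but they are gotos. Can anyone write goto-less assembler?

Sure, you can point out to a Fortran, C or BASIC programmer that running riot with gotos is a recipe for spaghetti bolognese. The answer, however, is not to avoid them but to use them carefully.

A knife can be used to prepare food, to free someone, or to kill someone. Do we do without knives out of fear of the latter? Likewise the goto: used carelessly it hinders, used carefully it helps.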

Other answers

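The rule we use for goto is that it is okay for jumping to a single exit cleanup point in a function. In really complex functions we relax that rule and allow other jumps. In both cases we avoid the deeply nested if statements that often come with error code checking, which helps readability and maintenance.

I find the do{} while(false) usage utterly revolting. I could conceivably be convinced that it is necessary in some odd case, but never that it is clean, sensible code.

If you must write a loop like that, why not make the dependence on the flag variable explicit?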

for (stepfailed=0 ; ! stepfailed ; /*empty*/)
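To make the one-liner above concrete, here is a minimal sketch of how such a flag-driven loop might be filled in. The steps step_a and step_b and the wrapper run_steps are hypothetical names invented for illustration. The body runs at most once: a failing step sets the flag and continues to the loop test, and the final break is what stops a successful pass from repeating.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical steps for illustration: each returns 0 on success. */
static int step_a(int x) { return x < 0 ? -1 : 0; }
static int step_b(int x) { return x > 100 ? -1 : 0; }

/* Returns the number of steps completed (0, 1 or 2). */
int run_steps(int x, int *stepfailed_out)
{
    int stepfailed;
    int done = 0;

    assert(stepfailed_out != NULL);

    for (stepfailed = 0; !stepfailed; /*empty*/) {
        if (step_a(x) != 0) { stepfailed = 1; continue; }
        done++;
        if (step_b(x) != 0) { stepfailed = 1; continue; }
        done++;
        break;  /* all steps succeeded: leave without setting the flag */
    }

    *stepfailed_out = stepfailed;
    return done;
}
```

Whether this reads better than a goto to a cleanup label is a matter of taste; the sketch only shows that the flag makes the exit condition visible in the loop header.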

It can sometimes come in handy for character-wise string processing.

Imagine something like this printf-esque example:

for cur_char, next_char in sliding_window(input_string) {
    if cur_char == '%' {
        if next_char == '%' {
            cur_char_index += 1
            goto handle_literal
        }
        # Some additional logic
        if chars_should_be_handled_literally() {
            goto handle_literal
        }
        # Handle the format
    }
    # some other control characters
    else {
      handle_literal:
        # Complicated logic here
        # Maybe it's writing to an array for some OpenGL calls later or something,
        # all while modifying a bunch of local variables declared outside the loop
    }
}

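You could refactor the goto handle_literal into a function call, but if it modifies several different local variables you would have to pass a reference to each of them, unless your language supports mutable closures. And if your logic means an else case won't work, you would still need a continue statement (arguably a form of goto) after the call to get the same semantics.

I have also used gotos judiciously in lexers, typically for similar cases. Most of the time you don't need them, but they are nice to have for those odd cases.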
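For anyone who wants to try the idea in real C, here is a rough sketch of the same shape. The function expand and its rules are made up for illustration: "%%" collapses to "%", "%d" is replaced by a '#' placeholder, and any other '%' is copied literally. The literal path shares the local n with the rest of the loop, which is the situation where the goto saves passing state around.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical mini-formatter: copies `in` to `out`, collapsing "%%" to "%"
 * and replacing "%d" with '#'. Any other '%' is emitted literally.
 * Returns the number of characters written (excluding the terminator).
 */
size_t expand(const char *in, char *out, size_t cap)
{
    size_t n = 0;   /* shared state touched by the literal path */

    assert(in != NULL && out != NULL && cap > 0);

    for (size_t i = 0; in[i] != '\0'; i++) {
        char c = in[i];
        if (c == '%') {
            char next = in[i + 1];
            if (next == '%') {
                i++;                 /* consume the second '%' */
                goto handle_literal;
            }
            if (next != 'd')         /* unknown or missing conversion */
                goto handle_literal;
            i++;                     /* consume 'd' */
            if (n + 1 < cap)
                out[n++] = '#';
        } else {
handle_literal:
            if (n + 1 < cap)
                out[n++] = c;
        }
    }
    out[n] = '\0';
    return n;
}
```

Jumping into the else block like this is legal C as long as the jump does not enter the scope of a variable-length array.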

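I use goto in the following case: when a function needs to return from several different places, and some cleanup has to be done before returning:

non-goto version: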

int doSomething (struct my_complicated_stuff *ctx)    
{
    db_conn *conn;
    RSA *key;
    char *temp_data = NULL;  /* stays NULL when no allocation is needed, so free() is safe */
    conn = db_connect();  


    if (ctx->smth->needs_alloc) {
        temp_data = malloc(ctx->some_size);
        if (!temp_data) {
            db_disconnect(conn);
            return -1;
        }
    }

    ...

    if (!ctx->smth->needs_to_be_processed) {
        free(temp_data);    
        db_disconnect(conn);    
        return -2;
    }

    pthread_mutex_lock(ctx->mutex);

    if (ctx->some_other_thing->error) {
        pthread_mutex_unlock(ctx->mutex);
        free(temp_data);
        db_disconnect(conn);        
        return -3;  
    }

    ...

    key=rsa_load_key(....);

    ...

    if (ctx->something_else->error) {
         rsa_free(key); 
         pthread_mutex_unlock(ctx->mutex);
         free(temp_data);
         db_disconnect(conn);       
         return -4;  
    }

    if (ctx->something_else->additional_check) {
         rsa_free(key); 
         pthread_mutex_unlock(ctx->mutex);
         free(temp_data);
         db_disconnect(conn);       
         return -5;  
    }


    rsa_free(key);  /* success path releases the key too, matching the goto version */
    pthread_mutex_unlock(ctx->mutex);
    free(temp_data);    
    db_disconnect(conn);    
    return 0;     
}

goto version:

int doSomething_goto (struct my_complicated_stuff *ctx)
{
    int ret=0;
    db_conn *conn;
    RSA *key;
    char *temp_data = NULL;  /* stays NULL when no allocation is needed, so free() is safe */
    conn = db_connect();  


    if (ctx->smth->needs_alloc) {
        temp_data = malloc(ctx->some_size);
        if (!temp_data) {
            ret = -1;
            goto exit_db;
        }
    }

    ...

    if (!ctx->smth->needs_to_be_processed) {
        ret=-2;
        goto exit_freetmp;      
    }

    pthread_mutex_lock(ctx->mutex);

    if (ctx->some_other_thing->error) {
        ret=-3;
        goto exit;  
    }

    ...

    key=rsa_load_key(....);

    ...

    if (ctx->something_else->error) {
        ret=-4;
        goto exit_freekey; 
    }

    if (ctx->something_else->additional_check) {
        ret=-5;
        goto exit_freekey;  
    }

exit_freekey:
    rsa_free(key);
exit:    
    pthread_mutex_unlock(ctx->mutex);
exit_freetmp:
    free(temp_data);        
exit_db:
    db_disconnect(conn);    
    return ret;     
}

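The second version makes life easier when you need to change something in the deallocation statements (each one appears only once in the code), and it reduces the chance of skipping one of them when you add a new branch. Moving them into a function would not help here, because the deallocation can happen at different "levels".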

1) The most common use of goto that I know of is emulating exception handling in languages that don't offer it, namely in C. (The code given by Nuclear above is just that.) Look at the Linux source code and you'll see a bazillion gotos used that way; there were about 100,000 gotos in Linux code according to a quick survey conducted in 2013: http://blog.regehr.org/archives/894. Goto usage is even mentioned in the Linux coding style guide: https://www.kernel.org/doc/Documentation/CodingStyle. Just like object-oriented programming is emulated using structs populated with function pointers, goto has its place in C programming. So who is right: Dijkstra or Linus (and all Linux kernel coders)? It's theory vs. practice basically.

There is, however, the usual gotcha of not having compiler-level support and checks for common constructs/patterns: it is easier to use them wrong and introduce bugs without compile-time checks. Visual C++ on Windows, even in C mode, offers exception handling via SEH/VEH for this very reason: exceptions are useful even outside OOP languages, i.e. in a procedural language. But the compiler can't always save your bacon, even if it offers syntactic support for exceptions in the language. As an example of the latter case, consider the famous Apple SSL "goto fail" bug, which just duplicated one goto with disastrous consequences (https://www.imperialviolet.org/2014/02/22/applebug.html):

if (something())
  goto fail;
  goto fail; // copypasta bug
printf("Never reached\n");
fail:
  // control jumps here

You can have exactly the same bug with compiler-supported exceptions, e.g. in C++:

struct Fail {};

try {
  if (something())
    throw Fail();
    throw Fail(); // copypasta bug
  printf("Never reached\n");
}
catch (Fail&) {
  // control jumps here
}

But both variants of the bug can be avoided if the compiler analyzes and warns you about unreachable code. For example compiling with Visual C++ at the /W4 warning level finds the bug in both cases. Java for instance forbids unreachable code (where it can find it!) for a pretty good reason: it's likely to be a bug in the average Joe's code. As long as the goto construct doesn't allow targets that the compiler can't easily figure out, like gotos to computed addresses(**), it's not any harder for the compiler to find unreachable code inside a function with gotos than using Dijkstra-approved code.

(**) Footnote: Gotos to computed line numbers are possible in some versions of Basic, e.g. GOTO 10*x where x is a variable. Rather confusingly, in Fortran "computed goto" refers to a construct that is equivalent to a switch statement in C. Standard C doesn't allow computed gotos in the language, but only gotos to statically/syntactically declared labels. GNU C however has an extension to get the address of a label (the unary, prefix && operator) and also allows a goto to a variable of type void*. See https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html for more on this obscure sub-topic. The rest of this post isn't concerned with that obscure GNU C feature.

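Standard C (i.e. not computed) gotos are usually not the reason why unreachable code can't be found at compile time. The usual reason is logic code like the following. Given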

int computation1() {
  return 1;
}

int computation2() {
  return computation1();
}

It is just as hard for a compiler to find the unreachable code in any of the following 3 constructs:

void tough1() {
  if (computation1() != computation2())
    printf("Unreachable\n");
}

void tough2() {
  if (computation1() == computation2())
    goto out;
  printf("Unreachable\n");
out:;
}

struct Out{};

void tough3() {
  try {
    if (computation1() == computation2())
      throw Out();
    printf("Unreachable\n");
  }
  catch (Out&) {
  }
}

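(Excuse my brace-related coding style, but I tried to keep the examples as compact as possible.)

Visual C++ /W4 (even with /Ox) fails to find the unreachable code in any of these, and as you probably know, the problem of finding unreachable code is undecidable in general. (If you don't believe me about that: https://www.cl.cam.ac.uk/teaching/2006/OptComp/slides/lecture02.pdf)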

As a related issue, the C goto can be used to emulate exceptions only inside the body of a function. The standard C library offers the setjmp() and longjmp() pair of functions for emulating non-local exits/exceptions, but those have some serious drawbacks compared to what other languages offer. The Wikipedia article http://en.wikipedia.org/wiki/Setjmp.h explains this latter issue fairly well. This function pair also works on Windows (http://msdn.microsoft.com/en-us/library/yz2ez4as.aspx), but hardly anyone uses it there because SEH/VEH is superior. Even on Unix, I think setjmp and longjmp are very seldom used.
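For readers who have never seen the pair in action, here is a small sketch of a non-local exit with setjmp()/longjmp(). The function names (checked_half, middle_layer, try_compute) are invented for illustration. Note what is missing compared to real exceptions: nothing runs on the way out of middle_layer, so any resource it held would simply leak.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf on_error;   /* where "thrown" errors land */

/* Hypothetical deep helper: bails out non-locally on a negative input. */
static int checked_half(int x)
{
    if (x < 0)
        longjmp(on_error, 1);   /* never returns */
    return x / 2;
}

static int middle_layer(int x)
{
    return checked_half(x) + 1; /* no error plumbing needed here */
}

/* Returns 0 and stores the result on success, -1 if an error was "thrown". */
int try_compute(int x, int *result)
{
    assert(result != NULL);

    if (setjmp(on_error) != 0)
        return -1;              /* arrived here via longjmp */

    *result = middle_layer(x);
    return 0;
}
```

The single static jmp_buf also makes this sketch neither reentrant nor thread-safe, which is one of the practical drawbacks of the approach.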

2) I think the second most common use of goto in C is implementing multi-level break or multi-level continue, which is also a fairly uncontroversial use case. Recall that Java doesn't allow goto label, but allows break label or continue label. According to http://www.oracle.com/technetwork/java/simple-142616.html, this is actually the most common use case of gotos in C (90% they say), but in my subjective experience, system code tends to use gotos for error handling more often. Perhaps in scientific code or where the OS offers exception handling (Windows) then multi-level exits are the dominant use case. They don't really give any details as to the context of their survey.
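A minimal sketch of the multi-level break, assuming a made-up helper find_in_grid that searches a row-major grid. The goto plays the role that a labeled break would play in Java: it leaves both loops at once, and it only ever jumps forward.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Searches a rows x cols grid (stored row-major) for `target`.
 * Returns 1 and sets *row / *col if found, 0 otherwise.
 */
int find_in_grid(const int *grid, size_t rows, size_t cols, int target,
                 size_t *row, size_t *col)
{
    int found = 0;
    size_t r = 0, c = 0;

    assert(grid != NULL && row != NULL && col != NULL);

    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (grid[r * cols + c] == target) {
                found = 1;
                goto done;      /* leaves both loops at once */
            }
        }
    }
done:
    if (found) {
        *row = r;
        *col = c;
    }
    return found;
}
```

The goto-free alternatives are an extra flag tested in both loop conditions or moving the loops into their own function and using return.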

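Edited to add: Both of these usage patterns appear in Kernighan and Ritchie's C book, around page 60 (depending on the edition). Another thing worth noting is that both use cases involve only forward gotos. And MISRA C 2012 (unlike the 2004 edition) now permits gotos, as long as they only jump forward.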