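First off, I should probably ask whether this is browser dependent.

I have read that if an invalid token is found, but the code is valid up to that invalid token, a semicolon is inserted before the token if it is preceded by a line break.

However, the common example cited for bugs caused by semicolon insertion is: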

return
  _a+b;

...which does not seem to follow this rule, since _a would be a valid token.

On the other hand, breaking up call chains works as expected:

$('#myButton')
  .click(function(){alert("Hello!")});

Does anyone have a more in-depth description of the rules?


Current answer

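I could not understand those 3 rules in the spec very well (I was hoping for something in plainer English), but here is what I gathered from JavaScript: The Definitive Guide, 6th Edition, David Flanagan, O'Reilly, 2011:

Quote:

JavaScript does not treat every line break as a semicolon: it usually treats line breaks as semicolons only if it can't parse the code without the semicolons.

Another quote, for the code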

var a
a
=
3
console.log(a)

JavaScript does not treat the second line break as a semicolon, because it can continue parsing the longer statement a = 3;

and:

two exceptions to the general rule that JavaScript interprets line breaks as semicolons when it cannot parse the second line as a continuation of the statement on the first line. The first exception involves the return, break, and continue statements ... If a line break appears after any of these words ... JavaScript will always interpret that line break as a semicolon. ... The second exception involves the ++ and -- operators ... If you want to use either of these operators as postfix operators, they must appear on the same line as the expression they apply to. Otherwise, the line break will be treated as a semicolon, and the ++ or -- will be parsed as a prefix operator applied to the code that follows. Consider this code, for example:

x 
++ 
y

It is parsed as x; ++y; and not as x++; y
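
A minimal sketch of how this could be checked in a JavaScript engine. The snippet is placed inside `new Function` so that its line breaks are preserved exactly; the name `prefixDemo` and the starting values are assumptions made for illustration and are not from the book.

```javascript
// Build a function whose body contains the three-line snippet verbatim.
// `prefixDemo` is a hypothetical name; the initial values are arbitrary.
const prefixDemo = new Function(`
  var x = 1, y = 10;
  x
  ++
  y
  return [x, y];
`);

const prefixResult = prefixDemo();

// x stays 1 and y becomes 11: the ++ was applied to y as a prefix operator.
console.log(prefixResult);
```

If the snippet were parsed as x++; y instead, x would end up as 2 and y would stay 10.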

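So, to simplify, I think this means:

In general, JavaScript will treat a new line as a continuation of the code as long as that makes sense, except in two cases: (1) after certain keywords such as return, break, and continue, and (2) if it sees ++ or -- on a new line, in which case it adds the ; at the end of the previous line.

The part about "treat it as a continuation of the code as long as it makes sense" makes it feel like the greedy matching of regular expressions.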
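
To illustrate the "continuation as long as it makes sense" behavior, here is a small sketch. The names `id` and `list` are made up for this example; the point is only that a line beginning with ( or [ can be absorbed into the previous line when the result still parses.

```javascript
// Hypothetical helpers used only for this illustration.
function id(value) {
  return value;
}
const list = [10, 20, 30];

// The line break after `id` is not turned into a semicolon,
// because `id (5)` parses as a function call.
const a = id
(5);

// Likewise, `list [1]` parses as an index expression.
const b = list
[1];

console.log(a, b); // 5 20
```

This is the same mechanism that lets the call chain in the question work: the line starting with .click continues the expression on the line before it.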

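With the above said, for a return followed by a line break, the JavaScript interpreter will insert a ;

(quoted again: If a line break appears after any of these words [such as return] ... JavaScript will always interpret that line break as a semicolon)

For this reason, the classic example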

return
{ 
  foo: 1
}

will not work as expected, because the JavaScript interpreter will treat it as:

return;   // returning nothing
{
  foo: 1
}

There must be no line break immediately after the return:

return { 
  foo: 1
}

for it to work properly. You may also add a ; yourself if you follow the rule of putting a ; after every statement:

return { 
  foo: 1
};
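
The difference can be observed directly with two small functions. This is only a sketch; the function names `broken` and `fixed` are made up here.

```javascript
// The line break after `return` ends the statement, so the braces that
// follow are parsed as a block containing the labeled statement `foo: 1`.
function broken() {
  return
  {
    foo: 1
  }
}

// With the opening brace on the same line, an object literal is returned.
function fixed() {
  return {
    foo: 1
  };
}

console.log(broken()); // undefined
console.log(fixed());  // { foo: 1 }
```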

Other answers

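First of all, you should know which statements are affected by automatic semicolon insertion (also known as ASI for brevity):

empty statement, var statement, expression statement, do-while statement, continue statement, break statement, return statement, throw statement

The concrete rules of ASI are described in the specification, §11.9.1 Rules of Automatic Semicolon Insertion

Three cases are described:

When an offending token is encountered that is not allowed by the grammar, a semicolon is inserted before it if either of the following holds:

The token is separated from the previous token by at least one LineTerminator.
The token is }

For example: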

    { 1
    2 } 3

is transformed to

    { 1
    ;2 ;} 3;

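After the NumericLiteral 1, the offending token 2 is separated from it by a line terminator, so the first condition applies. After the 2, the offending token is }, so the second condition applies.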
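
A quick way to see that the line break matters here is to run the text through `eval`, once with the line break and once without it. This is a sketch for experimentation only; the variable names are made up.

```javascript
// With the line break, semicolons are inserted and the text parses.
// eval returns the value of the last statement that was evaluated.
const withNewline = eval("{ 1\n2 } 3");
console.log(withNewline); // 3

// On a single line, the token 2 is neither preceded by a line break
// nor a closing brace, so no semicolon is inserted before it.
let sameLineError = null;
try {
  eval("{ 1 2 } 3");
} catch (e) {
  sameLineError = e;
}
console.log(sameLineError instanceof SyntaxError); // true
```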

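When the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete program, a semicolon is automatically inserted at the end of the input stream.

For example: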

    a = b
    ++c

is transformed to:

    a = b;
    ++c;
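
Here is a sketch of the same example in runnable form. To keep the snippet at the very end of the input with no trailing semicolon, it is adapted to use properties of a hypothetical `state` object instead of the bare variables a, b and c.

```javascript
// `state` and `assignDemo` are names made up for this illustration.
const state = { a: 0, b: 1, c: 5 };

// The function body is exactly two lines and ends without a semicolon.
const assignDemo = new Function("s", "s.a = s.b\n++s.c");
assignDemo(state);

// a received the old value of b, b is unchanged, and c was incremented.
console.log(state); // { a: 1, b: 1, c: 6 }
```

If the ++ had been applied to b as a postfix operator, b would be 2 and the remaining c would have been a syntax error.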

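The third case occurs when a token is allowed by some production of the grammar, but the production is a restricted production and a line terminator appears where the production says [no LineTerminator here]. A semicolon is then automatically inserted before the restricted token.

Restricted productions: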

    UpdateExpression :
        LeftHandSideExpression [no LineTerminator here] ++
        LeftHandSideExpression [no LineTerminator here] --
    
    ContinueStatement :
        continue ;
        continue [no LineTerminator here] LabelIdentifier ;
    
    BreakStatement :
        break ;
        break [no LineTerminator here] LabelIdentifier ;
    
    ReturnStatement :
        return ;
        return [no LineTerminator here] Expression ;
    
    ThrowStatement :
        throw [no LineTerminator here] Expression ; 

    ArrowFunction :
        ArrowParameters [no LineTerminator here] => ConciseBody

    YieldExpression :
        yield [no LineTerminator here] * AssignmentExpression
        yield [no LineTerminator here] AssignmentExpression
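
For throw and for arrow functions, the inserted semicolon does not produce a valid program, so a line break in the restricted position ends in a syntax error rather than a silent change of meaning. A small sketch, where `parseError` is a hypothetical helper that only parses the text and never runs it:

```javascript
// Returns the name of the error raised while parsing `source`,
// or null if the text parses as a function body.
function parseError(source) {
  try {
    new Function(source); // parse only, the function is never called
    return null;
  } catch (e) {
    return e.name;
  }
}

console.log(parseError("throw new Error('x');"));  // null
console.log(parseError("throw\nnew Error('x');")); // SyntaxError
console.log(parseError("var f = (x) => x;"));      // null
console.log(parseError("var f = (x)\n=> x;"));     // SyntaxError
```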

The classic example, with the ReturnStatement:

    return 
      "something";

is transformed to

    return;
      "something";

Straight from the ECMA-262, Fifth Edition ECMAScript Specification:

7.9.1 Rules of Automatic Semicolon Insertion

There are three basic rules of semicolon insertion:

1. When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
   - The offending token is separated from the previous token by at least one LineTerminator.
   - The offending token is }.

2. When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.

3. When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation "[no LineTerminator here]" within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.

However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).
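
The overriding condition at the end of the quote can be checked with a short sketch. The helper name `parses` is made up; it only parses the text with `new Function` and never runs it.

```javascript
// Returns true if `source` parses as a function body, false on a SyntaxError.
function parses(source) {
  try {
    new Function(source); // parse only, the function is never called
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// A semicolon is never inserted into the header of a for statement.
console.log(parses("for (var i = 0; i < 3\n) {}")); // false

// A semicolon is never inserted if it would become an empty statement.
console.log(parses("if (a > b)\nelse c = d")); // false

// Writing the empty statement explicitly is fine.
console.log(parses("if (a > b);\nelse c = d")); // true
```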

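Regarding semicolon insertion and the var statement, beware of forgetting the comma when a var declaration spans multiple lines. Somebody found this in my code yesterday: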

    var srcRecords = src.records
        srcIds = [];

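It ran, but the effect was that the srcIds declaration/assignment was global, because the local var declaration on the previous line no longer applied: that statement was considered finished due to automatic semicolon insertion.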
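
A sketch that reproduces the leak, assuming non-strict code as in the situation described. The two lines are placed inside `new Function`, and the `src` object passed in is a made-up stand-in for the real data.

```javascript
// Non-strict function body containing the two lines with the missing comma.
const leaky = new Function("src", `
  var srcRecords = src.records
      srcIds = [];
  return srcRecords;
`);

leaky({ records: ["r1"] }); // hypothetical input
const leaked = Array.isArray(globalThis.srcIds);
console.log(leaked); // true: srcIds became a property of the global object

delete globalThis.srcIds; // clean up the accidental global

// The same lines in strict mode fail loudly instead of leaking.
const strictVariant = new Function("src", `
  "use strict";
  var srcRecords = src.records
      srcIds = [];
`);
let strictErrorName = null;
try {
  strictVariant({ records: [] });
} catch (e) {
  strictErrorName = e.name;
}
console.log(strictErrorName); // ReferenceError
```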

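The most apt description of JavaScript's automatic semicolon insertion I have found comes from a book about crafting interpreters.

JavaScript's "automatic semicolon insertion" rule is the odd one. Where other languages assume most newlines are meaningful and only a few should be ignored in multi-line statements, JS assumes the opposite. It treats all of your newlines as meaningless whitespace unless it encounters a parse error. If it does, it goes back and tries turning the previous newline into a semicolon to get something grammatically valid.

He goes on to describe it as you would a code smell.

This design note would turn into a design diatribe if I went into complete detail about how that even works, much less all the various ways that this is a bad idea. It's a mess. JavaScript is the only language I know where many style guides demand explicit semicolons after every statement even though the language theoretically lets you elide them.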