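What are "sequence points"?

What is the relation between undefined behaviour and sequence points?

I often use funny and convoluted expressions like a[++i] = i;, to make myself feel better. Why should I stop using them?

If you've read this, be sure to visit the follow-up question Undefined Behavior and Sequence Points Reloaded.

(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)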


Current answer

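In C99 (ISO/IEC 9899:TC3), the following statements about order of evaluation seem to have been absent from this discussion so far.

[...] the order of evaluation of subexpressions and the order in which side effects take place are both unspecified. (Section 6.5 pp 67)

The order of evaluation of the operands is unspecified. If an attempt is made to modify the result of an assignment operator or to access it after the next sequence point, the behavior[sic] is undefined. (Section 6.5.16 pp 91)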

Other answers

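C++98 and C++03

This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain "sequence points"; operations are "sequenced before" or "unsequenced" or "indeterminately sequenced" instead. The net effect is essentially the same, but the terminology is different.


Disclaimer: Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.

Pre-requisites: An elementary knowledge of the C++ Standard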


What are Sequence Points?

The Standard says

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)

Side effects? What are side effects?

Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).

For example:

int x = y++; //where y is also an int

In addition to the initialization operation the value of y gets changed due to the side effect of ++ operator.
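
As a minimal sketch of the two parts of that evaluation (the function name post_increment_demo is made up for illustration), the snippet below returns both the value produced by y++ and the state of y afterwards:

```cpp
#include <utility>

// Returns {x, y} after `int x = y++;` with y starting at 5.
std::pair<int, int> post_increment_demo() {
    int y = 5;
    int x = y++;  // value computation yields the old y (5); the side effect stores 6 into y
    return std::make_pair(x, y);
}
```

Calling post_increment_demo() gives a pair whose first member is the value used for the initialization and whose second member reflects the side effect.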

So far so good. Moving on to sequence points. An alternative definition of sequence points given by the comp.lang.c author Steve Summit:

Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.


What are the common sequence points listed in the C++ Standard?

Those are:

  • at the end of the evaluation of full expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.)1

    Example :

    int a = 5; // ; is a sequence point here
    
  • in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18) 2

    • a && b (§5.14)
    • a || b (§5.15)
    • a ? b : c (§5.16)
    • a , b (§5.18) (here a , b is a comma operator; in func(a,a++) , is not a comma operator, it's merely a separator between the arguments a and a++. Thus the behaviour is undefined in that case (if a is considered to be a primitive type))
  • at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body (§1.9/17).

1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument

2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
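
To make the list above concrete, here is a small sketch (function names are illustrative only) that relies on the sequence point after the first operand of the built-in && and comma operators. Because of that sequence point, the increment is complete before the second operand reads i:

```cpp
// Built-in && introduces a sequence point after its first operand.
bool logical_and_demo() {
    int i = 0;
    // i++ is complete (i == 1) before the right operand is evaluated.
    return (i++ == 0) && (i == 1);
}

// The built-in comma operator does the same.
int comma_demo() {
    int i = 0;
    // i++ completes first, then i + 10 is evaluated with i == 1.
    int j = (i++, i + 10);
    return j;
}
```

As footnote 2 says, this guarantee applies to the built-in operators only; an overloaded operator&& or operator, is an ordinary function call and does not get it.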


What is Undefined Behaviour?

The Standard defines Undefined Behaviour in Section §1.3.12 as

behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements 3.

Undefined behavior may also be expected when this International Standard omits the description of any explicit definition of behavior.

3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.


What is the relation between Undefined Behaviour and Sequence Points?

Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.

You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.

For example:

int x = 5, y = 6;

int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.

Another example here.
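
A hedged sketch of how unspecified order can be observed without any undefined behaviour: the helpers first and second and the global call_log are hypothetical names introduced here. Each helper records that it ran, so the log shows which operand of + the implementation chose to evaluate first.

```cpp
#include <string>

// Hypothetical helpers that record the order in which they run.
static std::string call_log;

int first()  { call_log += 'a'; return 1; }
int second() { call_log += 'b'; return 2; }

int sum_in_unspecified_order() {
    call_log.clear();
    // The result is always 3, but whether the log reads "ab" or "ba"
    // is up to the implementation.
    return first() + second();
}
```

The program is valid either way; it simply must not depend on which of the two orders it gets.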


Now the Standard in §5/4 says

    1. Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

What does it mean?

Informally it means that between two sequence points a variable must not be modified more than once. In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.

From the above sentence the following expressions invoke Undefined Behaviour:

i++ * ++i;   // UB, i is modified more than once btw two SPs
i = ++i;     // UB, same as above
++i = 2;     // UB, same as above
i = ++i + 1; // UB, same as above
++++++i;     // UB, parsed as (++(++(++i)))

i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once btw two SPs)

But the following expressions are fine:

i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i);   // well defined 
int j = i;
j = (++i, i++, j*i); // well defined
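
When an expression would modify a variable twice between two sequence points, the usual fix is to split it into separate statements so each modification ends at its own sequence point. The sketch below shows one plausible intent for two of the undefined expressions above; the function names and the chosen meaning are assumptions, since the undefined originals have no meaning to preserve.

```cpp
// Instead of `i = ++i + 1;` (undefined under the C++03 rules):
int increment_then_add_one(int i) {
    ++i;        // first modification, complete at the end of this statement
    i = i + 1;  // second modification, in a separate full-expression
    return i;
}

// Instead of `++++++i;`:
int triple_increment(int i) {
    ++i;
    ++i;
    ++i;        // or simply: i += 3;
    return i;
}
```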

    2. Furthermore, the prior value shall be accessed only to determine the value to be stored.

What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.

For example, in i = i + 1 all the accesses of i (on the L.H.S. and on the R.H.S.) are directly involved in the computation of the value to be written. So it is fine.

This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.

Example 1:

std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2

Example 2:

a[i] = i++; // or a[++i] = i or a[i++] = ++i etc

is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.

Example 3 :

int x = i + i++ ;// Similar to above
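
A sketch of how Examples 2 and 3 can be rewritten so that every read of i clearly precedes the modification. The function names are hypothetical, and "store the old index, then advance" is only one of the meanings the original author might have intended.

```cpp
// One possible intent of `a[i] = i++;`: store the old index, then advance.
void store_index_then_advance(int* a, int& i) {
    a[i] = i;  // only reads of i in this full-expression
    ++i;       // the modification happens in its own full-expression
}

// One possible intent of `int x = i + i++;`: add the old value to itself, then advance.
int sum_then_advance(int& i) {
    int x = i + i;
    ++i;
    return x;
}
```

The same approach works for Example 1: compute the values into named variables first, then pass those variables to std::printf.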

Follow up answer for C++11 here.

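C++17 (N4659) includes a proposal, Refining Expression Evaluation Order for Idiomatic C++, which defines a stricter order of expression evaluation.

In particular, the following sentence

8.18 Assignment and compound assignment operators: .... In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand.

together with the following clarification

An expression X is said to be sequenced before an expression Y if every value computation and every side effect associated with the expression X is sequenced before every value computation and every side effect associated with the expression Y.

make several cases of previously undefined behavior valid, including the one in question: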

a[++i] = i;

However, several other similar cases still lead to undefined behavior.

In N4140:

i = i++ + 1; // the behavior is undefined

But in N4659:

i = i++ + 1; // the value of i is incremented
i = i++ + i; // the behavior is undefined

Of course, using a C++17 compliant compiler does not necessarily mean that one should start writing such expressions.
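
For code that has to behave the same under every standard version, the order that C++17 prescribes for a[++i] = i; can be written out by hand. This is a sketch under that reading of the wording quoted above (right operand first, then left operand, then the assignment); the function name is made up for illustration.

```cpp
// Version-independent spelling of what C++17 specifies for `a[++i] = i;`.
void assign_old_index_to_next_slot(int* a, int& i) {
    const int rhs = i;  // right operand: the value of i before the increment
    ++i;                // side effect of the left operand
    a[i] = rhs;         // the assignment itself comes last
}
```

Starting from i == 1, this leaves i == 2 and stores 1 into a[2].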

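This is a follow-up to my previous answer and contains C++11-related material.


Pre-requisites: An elementary knowledge of Relations (Mathematics).


Is it true that there are no Sequence Points in C++11?

Yes! This is very true.

Sequence Points have been replaced by Sequenced Before and Sequenced After (and Unsequenced and Indeterminately Sequenced) relations in C++11.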


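What exactly is this "Sequenced before" thing?

Sequenced Before (§1.9/13) is a relation which is:

  • Asymmetric
  • Transitive

between evaluations executed by a single thread and induces a strict partial order1

Formally it means given any two evaluations (see below) A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced 2.

Evaluations A and B are indeterminately sequenced when either A is sequenced before B or B is sequenced before A, but it is unspecified which 3.

[NOTES]

1 : A strict partial order is a binary relation "<" over a set P which is asymmetric and transitive, i.e., for all a, b, and c in P, we have that:
    (i). if a < b then ¬ (b < a) (asymmetry);
    (ii). if a < b and b < c then a < c (transitivity).

2 : The execution of unsequenced evaluations can overlap.

3 : Indeterminately sequenced evaluations cannot overlap, but either could be executed first.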


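What is the meaning of the word "evaluation" in the context of C++11?

In C++11, evaluation of an expression (or a sub-expression) in general includes:

  • value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and
  • initiation of side effects.

Now (§1.9/14) says:

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

Trivial example:

int x;
x = 10;
++x;

The value computation and side effect associated with ++x are sequenced after the value computation and side effect of x = 10;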


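So there must be some relation between Undefined Behaviour and the above-mentioned things, right?

Yes! Right.

In (§1.9/15) it has been mentioned that

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced4.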

For example:

int main()
{
     int num = 19 ;
     num = (num << 3) + (num >> 3);
} 

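Evaluations of the operands of the + operator are unsequenced relative to each other. Evaluations of the operands of the << and >> operators are unsequenced relative to each other.

4 : In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.

(§1.9/15) The value computations of the operands of an operator are sequenced before the value computation of the result of the operator.

That means in x + y the value computations of x and y are sequenced before the value computation of (x + y).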
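
The num example above stays well defined even though its operands are unsequenced, because both operands only read num and the single write happens in the assignment, after both value computations. A minimal sketch wrapping it in a function (the name shift_and_add is an assumption for illustration):

```cpp
int shift_and_add(int num) {
    // Both operands of + only read num. The one write (the assignment) is
    // sequenced after the value computations of both operands.
    num = (num << 3) + (num >> 3);
    return num;
}
```

With num == 19 the operands are 152 and 2, so the result is 154 regardless of which operand is evaluated first.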

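More importantly

(§1.9/15) If a side effect on a scalar object is unsequenced relative to either

(a) another side effect on the same scalar object

or

(b) a value computation using the value of the same scalar object,

the behaviour is undefined.

Examples: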

int i = 5, v[10] = { };
void  f(int,  int);

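i = i++ * ++i;      // (1) Undefined Behaviour
i = ++i + i++;      // (2) Undefined Behaviour
i = ++i + ++i;      // (3) Undefined Behaviour
i = v[i++];         // (4) Undefined Behaviour
i = v[++i];         // (5) Well-defined Behaviour
i = i++ + 1;        // (6) Undefined Behaviour
i = ++i + 1;        // (7) Well-defined Behaviour
++++i;              // (8) Well-defined Behaviour
f(i = -1, i = -1);  // (9) Undefined Behaviour (see below)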

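When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced. -end note]

Expressions (5), (7) and (8) do not invoke undefined behaviour. Check out the following answers for a more detailed explanation.

  • Multiple preincrement operations on a variable in C++0x
  • Unsequenced Value Computations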
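
Since side effects in different argument expressions are unsequenced under these rules, a call such as f(i = -1, i = -1) or f(i++, i++) is best avoided by computing each argument in its own statement. A sketch, with add standing in as a hypothetical callee:

```cpp
// Hypothetical callee used only for illustration.
int add(int a, int b) { return a + b; }

int call_with_explicit_order() {
    int i = 5;
    // Instead of add(i++, i++), give each side effect its own full-expression.
    int first_arg = i++;   // first_arg is 5, i becomes 6
    int second_arg = i++;  // second_arg is 6, i becomes 7
    return add(first_arg, second_arg);
}
```

Naming the temporaries also documents which order the author actually wanted.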


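Final Note:

If you find any flaw in the post please leave a comment. Power-users (with rep >20000) please do not hesitate to edit the post to correct typos and other mistakes.

I am guessing there is a fundamental reason for the change, and that it isn't merely cosmetic to make the old interpretation clearer: that reason is concurrency. An unspecified order of elaboration is merely a selection of one of several possible serial orderings. This is quite different from before and after orderings, because if there is no specified ordering, concurrent evaluation is possible: not so with the old rules. For example in: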

f (a,b)

previously either a then b, or b then a. Now, a and b can be evaluated with instructions interleaved or even on different cores.