让我尝试说明将指针传递给对象的不同可行模式,这些对象的内存由std::unique_ptr类模板的实例管理;它也适用于旧的std::auto_ptr类模板(我相信它允许所有人使用唯一指针,但是在需要右值的地方可以接受可修改的左值,而不必调用std::move),在某种程度上也适用于std::shared_ptr。
作为讨论的一个具体示例,我将考虑以下简单的列表类型
struct node;
typedef std::unique_ptr<node> list;
struct node { int entry; list next; }
Instances of such list (which cannot be allowed to share parts with other instances or be circular) are entirely owned by whoever holds the initial list pointer. If client code knows that the list it stores will never be empty, it may also choose to store the first node directly rather than a list.
No destructor for node needs to be defined: since the destructors for its fields are automatically called, the whole list will be recursively deleted by the smart pointer destructor once the lifetime of initial pointer or node ends.
This recursive type gives the occasion to discuss some cases that are less visible in the case of a smart pointer to plain data. Also the functions themselves occasionally provide (recursively) an example of client code as well. The typedef for list is of course biased towards unique_ptr, but the definition could be changed to use auto_ptr or shared_ptr instead without much need to change to what is said below (notably concerning exception safety being assured without the need to write destructors).
传递智能指针的模式
模式0:传递一个指针或引用参数,而不是智能指针
如果你的函数与所有权无关,这是最好的方法:完全不要让它使用智能指针。在这种情况下,你的函数不需要担心谁拥有所指向的对象,或者通过什么方式管理所有权,所以传递原始指针是非常安全的,也是最灵活的形式,因为不管所有权如何,客户端总是可以产生一个原始指针(通过调用get方法或从address-of操作符&)。
例如,计算这样一个列表长度的函数,不应该给出一个列表参数,而是一个原始指针:
size_t length(const node* p)
{ size_t l=0; for ( ; p!=nullptr; p=p->next.get()) ++l; return l; }
持有变量list head的客户端可以调用这个函数length(head.get()),
而选择存储节点n表示非空列表的客户端可以调用length(&n)。
如果指针保证为非空(这里不是这样,因为列表可能是空的),则可能更倾向于传递引用而不是指针。如果函数需要更新节点的内容,它可能是指向非const的指针/引用,而不需要添加或删除任何节点(后者将涉及所有权)。
一个属于模式0类别的有趣案例是创建列表的(深度)副本;虽然这样做的函数当然必须转移它所创建的副本的所有权,但它并不关心它所复制的列表的所有权。所以可以定义如下:
list copy(const node* p)
{ return list( p==nullptr ? nullptr : new node{p->entry,copy(p->next.get())} ); }
This code merits a close look, both for the question as to why it compiles at all (the result of the recursive call to copy in the initialiser list binds to the rvalue reference argument in the move constructor of unique_ptr<node>, a.k.a. list, when initialising the next field of the generated node), and for the question as to why it is exception-safe (if during the recursive allocation process memory runs out and some call of new throws std::bad_alloc, then at that time a pointer to the partly constructed list is held anonymously in a temporary of type list created for the initialiser list, and its destructor will clean up that partial list). By the way one should resist the temptation to replace (as I initially did) the second nullptr by p, which after all is known to be null at that point: one cannot construct a smart pointer from a (raw) pointer to constant, even when it is known to be null.
模式1:按值传递智能指针
A function that takes a smart pointer value as argument takes possession of the object pointed to right away: the smart pointer that the caller held (whether in a named variable or an anonymous temporary) is copied into the argument value at function entrance and the caller's pointer has become null (in the case of a temporary the copy might have been elided, but in any case the caller has lost access to the pointed to object). I would like to call this mode call by cash: caller pays up front for the service called, and can have no illusions about ownership after the call. To make this clear, the language rules require the caller to wrap the argument in std::move if the smart pointer is held in a variable (technically, if the argument is an lvalue); in this case (but not for mode 3 below) this function does what its name suggests, namely move the value from the variable to a temporary, leaving the variable null.
For cases where the called function unconditionally takes ownership of (pilfers) the pointed-to object, this mode used with std::unique_ptr or std::auto_ptr is a good way of passing a pointer together with its ownership, which avoids any risk of memory leaks. Nonetheless I think that there are only very few situations where mode 3 below is not to be preferred (ever so slightly) over mode 1. For this reason I shall provide no usage examples of this mode. (But see the reversed example of mode 3 below, where it is remarked that mode 1 would do at least as well.) If the function takes more arguments than just this pointer, it may happen that there is in addition a technical reason to avoid mode 1 (with std::unique_ptr or std::auto_ptr): since an actual move operation takes place while passing a pointer variable p by the expression std::move(p), it cannot be assumed that p holds a useful value while evaluating the other arguments (the order of evaluation being unspecified), which could lead to subtle errors; by contrast, using mode 3 assures that no move from p takes place before the function call, so other arguments can safely access a value through p.
When used with std::shared_ptr, this mode is interesting in that with a single function definition it allows the caller to choose whether to keep a sharing copy of the pointer for itself while creating a new sharing copy to be used by the function (this happens when an lvalue argument is provided; the copy constructor for shared pointers used at the call increases the reference count), or to just give the function a copy of the pointer without retaining one or touching the reference count (this happens when a rvalue argument is provided, possibly an lvalue wrapped in a call of std::move). For instance
void f(std::shared_ptr<X> x) // call by shared cash
{ container.insert(std::move(x)); } // store shared pointer in container
void client()
{ std::shared_ptr<X> p = std::make_shared<X>(args);
f(p); // lvalue argument; store pointer in container but keep a copy
f(std::make_shared<X>(args)); // prvalue argument; fresh pointer is just stored away
f(std::move(p)); // xvalue argument; p is transferred to container and left null
}
同样可以通过分别定义void f(const std::shared_ptr<X>& X)(用于左值情况)和void f(std::shared_ptr<X>&& X)(用于右值情况)来实现,函数体的区别仅在于第一个版本调用复制语义(在使用X时使用复制构造/赋值),而第二个版本则调用移动语义(如示例代码所示,写入std::move(X))。因此,对于共享指针,模式1可以避免一些代码重复。
模式2:通过(可修改的)左值引用传递智能指针
Here the function just requires having a modifiable reference to the smart pointer, but gives no indication of what it will do with it. I would like to call this method call by card: caller ensures payment by giving a credit card number. The reference can be used to take ownership of the pointed-to object, but it does not have to. This mode requires providing a modifiable lvalue argument, corresponding to the fact that the desired effect of the function may include leaving a useful value in the argument variable. A caller with an rvalue expression that it wishes to pass to such a function would be forced to store it in a named variable to be able to make the call, since the language only provides implicit conversion to a constant lvalue reference (referring to a temporary) from an rvalue. (Unlike the opposite situation handled by std::move, a cast from Y&& to Y&, with Y the smart pointer type, is not possible; nonetheless this conversion could be obtained by a simple template function if really desired; see https://stackoverflow.com/a/24868376/1436796). For the case where the called function intends to unconditionally take ownership of the object, stealing from the argument, the obligation to provide an lvalue argument is giving the wrong signal: the variable will have no useful value after the call. Therefore mode 3, which gives identical possibilities inside our function but asks callers to provide an rvalue, should be preferred for such usage.
然而,模式2有一个有效的用例,即函数可以修改指针,或以涉及所有权的方式修改所指向的对象。例如,将节点作为列表前缀的函数提供了这样的用法:
void prepend (int x, list& l) { l = list( new node{ x, std::move(l)} ); }
显然,在这里强制调用者使用std::move是不可取的,因为他们的智能指针在调用之后仍然拥有一个定义良好的非空列表,尽管与之前不同。
Again it is interesting to observe what happens if the prepend call fails for lack of free memory. Then the new call will throw std::bad_alloc; at this point in time, since no node could be allocated, it is certain that the passed rvalue reference (mode 3) from std::move(l) cannot yet have been pilfered, as that would be done to construct the next field of the node that failed to be allocated. So the original smart pointer l still holds the original list when the error is thrown; that list will either be properly destroyed by the smart pointer destructor, or in case l should survive thanks to a sufficiently early catch clause, it will still hold the original list.
这是一个有建设性的例子;对于这个问题,我们还可以给出一个更具破坏性的例子,即删除包含给定值的第一个节点(如果有的话):
void remove_first(int x, list& l)
{ list* p = &l;
while ((*p).get()!=nullptr and (*p)->entry!=x)
p = &(*p)->next;
if ((*p).get()!=nullptr)
(*p).reset((*p)->next.release()); // or equivalent: *p = std::move((*p)->next);
}
Again the correctness is quite subtle here. Notably, in the final statement the pointer (*p)->next held inside the node to be removed is unlinked (by release, which returns the pointer but makes the original null) before reset (implicitly) destroys that node (when it destroys the old value held by p), ensuring that one and only one node is destroyed at that time. (In the alternative form mentioned in the comment, this timing would be left to the internals of the implementation of the move-assignment operator of the std::unique_ptr instance list; the standard says 20.7.1.2.3;2 that this operator should act "as if by calling reset(u.release())", whence the timing should be safe here too.)
请注意,对于存储一个总是非空列表的本地节点变量的客户端,不能调用prepend和remove_first,这是正确的,因为给出的实现不能在这种情况下工作。
模式3:通过(可修改的)右值引用传递智能指针
This is the preferred mode to use when simply taking ownership of the pointer. I would like to call this method call by check: caller must accept relinquishing ownership, as if providing cash, by signing the check, but the actual withdrawal is postponed until the called function actually pilfers the pointer (exactly as it would when using mode 2). The "signing of the check" concretely means callers have to wrap an argument in std::move (as in mode 1) if it is an lvalue (if it is an rvalue, the "giving up ownership" part is obvious and requires no separate code).
Note that technically mode 3 behaves exactly as mode 2, so the called function does not have to assume ownership; however I would insist that if there is any uncertainty about ownership transfer (in normal usage), mode 2 should be preferred to mode 3, so that using mode 3 is implicitly a signal to callers that they are giving up ownership. One might retort that only mode 1 argument passing really signals forced loss of ownership to callers. But if a client has any doubts about intentions of the called function, she is supposed to know the specifications of the function being called, which should remove any doubt.
要找到一个使用模式3参数传递的列表类型的典型例子是非常困难的。将一个列表b移动到另一个列表a的末尾是一个典型的例子;但是,使用模式2更好地传递a(保存操作的结果):
void append (list& a, list&& b)
{ list* p=&a;
while ((*p).get()!=nullptr) // find end of list a
p=&(*p)->next;
*p = std::move(b); // attach b; the variable b relinquishes ownership here
}
模式3参数传递的一个纯示例如下,它接受一个列表(及其所有权),并返回一个以相反顺序包含相同节点的列表。
list reversed (list&& l) noexcept // pilfering reversal of list
{ list p(l.release()); // move list into temporary for traversal
list result(nullptr);
while (p.get()!=nullptr)
{ // permute: result --> p->next --> p --> (cycle to result)
result.swap(p->next);
result.swap(p);
}
return result;
}
此函数的调用方法如下:l = reversed(std::move(l));将列表反转为列表本身,但反转的列表也可以以不同的方式使用。
Here the argument is immediately moved to a local variable for efficiency (one could have used the parameter l directly in the place of p, but then accessing it each time would involve an extra level of indirection); hence the difference with mode 1 argument passing is minimal. In fact using that mode, the argument could have served directly as local variable, thus avoiding that initial move; this is just an instance of the general principle that if an argument passed by reference only serves to initialise a local variable, one might just as well pass it by value instead and use the parameter as local variable.
Using mode 3 appears to be advocated by the standard, as witnessed by the fact that all provided library functions that transfer ownership of smart pointers using mode 3. A particular convincing case in point is the constructor std::shared_ptr<T>(auto_ptr<T>&& p). That constructor used (in std::tr1) to take a modifiable lvalue reference (just like the auto_ptr<T>& copy constructor), and could therefore be called with an auto_ptr<T> lvalue p as in std::shared_ptr<T> q(p), after which p has been reset to null. Due to the change from mode 2 to 3 in argument passing, this old code must now be rewritten to std::shared_ptr<T> q(std::move(p)) and will then continue to work. I understand that the committee did not like the mode 2 here, but they had the option of changing to mode 1, by defining std::shared_ptr<T>(auto_ptr<T> p) instead, they could have ensured that old code works without modification, because (unlike unique-pointers) auto-pointers can be silently dereferenced to a value (the pointer object itself being reset to null in the process). Apparently the committee so much preferred advocating mode 3 over mode 1, that they chose to actively break existing code rather than to use mode 1 even for an already deprecated usage.
什么时候更喜欢模式3而不是模式1
模式1在许多情况下都是完全可用的,在假设所有权以将智能指针移动到局部变量的形式(如上面的反向示例)的情况下,它可能比模式3更受欢迎。然而,我可以看到在更一般的情况下更喜欢模式3的两个原因:
It is slightly more efficient to pass a reference than to create a temporary and nix the old pointer (handling cash is somewhat laborious); in some scenarios the pointer may be passed several times unchanged to another function before it is actually pilfered. Such passing will generally require writing std::move (unless mode 2 is used), but note that this is just a cast that does not actually do anything (in particular no dereferencing), so it has zero cost attached.
Should it be conceivable that anything throws an exception between the start of the function call and the point where it (or some contained call) actually moves the pointed-to object into another data structure (and this exception is not already caught inside the function itself), then when using mode 1, the object referred to by the smart pointer will be destroyed before a catch clause can handle the exception (because the function parameter was destructed during stack unwinding), but not so when using mode 3. The latter gives the caller has the option to recover the data of the object in such cases (by catching the exception). Note that mode 1 here does not cause a memory leak, but may lead to an unrecoverable loss of data for the program, which might be undesirable as well.
返回一个智能指针:总是按值
总结一下关于返回智能指针的内容,假定它指向一个创建供调用者使用的对象。这与将指针传递给函数是不一样的,但是为了完整起见,我想坚持在这种情况下总是按值返回(并且不要在return语句中使用std::move)。没有人希望得到一个可能刚刚被禁用的指针的引用。