这是我第一次使用映射,我意识到插入元素的方法有很多种。您可以使用emplace(),操作符[]或insert(),加上使用value_type或make_pair等变量。虽然有很多关于他们所有人的信息和关于特定案例的问题,但我仍然不能理解大局。 我的两个问题是:

它们各自的优势是什么? 是否有必要在标准中增加放置位置?在没有它之前,有什么事情是不可能的吗?


当前回答

还有一个没有在其他答案中讨论的附加问题,它适用于std::map以及std::unordered_map, std::set和std::unordered_set:

Insert使用键对象,这意味着如果键已经存在于容器中,则不需要分配节点。 Emplace需要首先构造键,这通常需要在每次调用它时分配一个节点。

从这个角度来看,如果键已经存在于容器中,那么放置可能比插入效率低。(这对于多线程应用程序来说可能很重要,例如在线程本地字典中,需要同步分配。)

Live demo: https://godbolt.org/z/ornYcTqW9. Note that with libstdc++, emplace allocates 10 times, while insert only once. With libc++, there is only one allocation with emplace as well; it seems there is some optimization that copies/moves keys*. I got the same outcome with Microsoft STL, so it actually seems that there is some missing optimization in libstdc++. However, the whole problem may not be related only to standard containers. For instance, concurrent_unordered_map from Intel/oneAPI TBB behaves the same as libstdc++ in this regard.


*注意,此优化不能应用于密钥既不可复制又不可移动的情况。在这个现场演示中,即使使用了emplace和libc++: https://godbolt.org/z/1b6b331qf,我们也有10个分配。(当然,对于不可复制和不可移动的键,我们不能使用insert或try_emplace,因此没有其他选项。)

其他回答

除了优化机会和更简单的语法之外,插入和插入之间的一个重要区别是,后者允许显式转换。(这适用于整个标准库,而不仅仅是地图。)

下面是一个例子:

#include <vector>

struct foo
{
    explicit foo(int);
};

int main()
{
    std::vector<foo> v;

    v.emplace(v.end(), 10);      // Works
    //v.insert(v.end(), 10);     // Error, not explicit
    v.insert(v.end(), foo(10));  // Also works
}

诚然,这是一个非常具体的细节,但是当您处理用户定义的转换链时,有必要记住这一点。

假设你正在将Foo对象添加到set<Foo>对象中,Foo(int)是一个构造函数,并且Foo具有复制/移动构造函数。那么主要的“大局”区别在于:

emplace(0) - which calls set::emplace(int && my_args) - forwards the given arguments (i.e. the int 0) to a Foo constructor somewhere in the definition of the set::emplace method (e.g. there is a call like Foo(0) somewhere in that method's code). Importantly: NO Foo copy or move constructor is called. insert(0) - which calls set::insert(Foo && value) - first (1) creates the Foo object Foo(0) (because 0 has type int but insert needs an object of type Foo as input) that is used as the method's argument (i.e. as value). And then (2) this Foo object (in value) is used as an argument to a Foo (copy or move) constructor somewhere in the definition of the set::insert method.

下面的代码通过跟踪每个构造函数调用并在它们发生时告诉您有关它们的信息,展示了insert()与emplace()的不同之处的“大局思想”。将输出与代码进行比较可以明确insert()和emplace()之间的区别。

代码很简单,但有点长,所以为了节省时间,我建议您先阅读摘要,然后快速浏览一下代码(这应该足以理解代码及其输出)。

代码摘要:

The Foo class: uses static int foo_counter to keep track of the total number of Foo objects that have been constructed (or moved, copied, etc.) thus far. Each Foo object stores the (unique) value of foo_counter at the time of its creation in its local variable val. The unique object with val == 8 is called "foo8" or "Foo 8" (ditto for 1, 2, etc.). Every constructor/destructor call prints info about the call (e.g. calling Foo(11) will output "Foo(int) with val: 11"). main()'s body: insert()s and emplace()s Foo objects into an unordered_map<Foo,int> object umap with calls such as "umap.emplace(Foo(11), 0);" and "umap.insert({12, 0})" (0 is just some arbitrary int, it can be any value). Every line of code is printed to cout before it is executed.

Code

#include <iostream>
#include <unordered_map>
#include <utility>
using namespace std;

//Foo simply outputs what constructor is called with what value.
struct Foo {
  static int foo_counter; //Track how many Foo objects have been created.
  int val; //This Foo object was the val-th Foo object to be created.
  Foo() { val = foo_counter++;
    cout << "Foo() with val:                " << val << '\n';
  }
  Foo(int value) : val(value) { foo_counter++;
    cout << "Foo(int) with val:             " << val << '\n';
  }
  Foo(Foo& f2) { val = foo_counter++;
    cout << "Foo(Foo &) with val:           " << val
         << " \tcreated from:      \t" << f2.val << '\n';
  }
  Foo(const Foo& f2) { val = foo_counter++;
    cout << "Foo(const Foo &) with val:     " << val
         << " \tcreated from:      \t" << f2.val << '\n';
  }
  Foo(Foo&& f2) { val = foo_counter++;
    cout << "Foo(Foo&&) moving:             " << f2.val
         << " \tand changing it to:\t" << val << '\n';
  }
  ~Foo() { cout << "~Foo() destroying:             " << val << '\n'; }
  Foo& operator=(const Foo& rhs) {
    cout << "Foo& operator=(const Foo& rhs) with rhs.val: " << rhs.val
         << " \tcalled with lhs.val = \t" << val
         << " \tChanging lhs.val to: \t" << rhs.val << '\n';
    val = rhs.val; return *this;
  }
  bool operator==(const Foo &rhs) const { return val == rhs.val; }
  bool operator<(const Foo &rhs)  const { return val <  rhs.val; }
};
int Foo::foo_counter = 0;

//Create a hash function for Foo in order to use Foo with unordered_map
template<> struct std::hash<Foo> {
   size_t operator()(const Foo &f) const { return hash<int>{}(f.val); }
};

int main() {
    unordered_map<Foo, int> umap;
    int d; //Some int that will be umap's value. It is not important.

    //Print the statement to be executed and then execute it.

    cout << "\nFoo foo0, foo1, foo2, foo3;\n";
    Foo foo0, foo1, foo2, foo3;

    cout << "\numap.insert(pair<Foo, int>(foo0, d))\n";
    umap.insert(pair<Foo, int>(foo0, d));
    //Side note: equivalent to: umap.insert(make_pair(foo0, d));

    cout << "\numap.insert(move(pair<Foo, int>(foo1, d)))\n";
    umap.insert(move(pair<Foo, int>(foo1, d)));
    //Side note: equiv. to: umap.insert(make_pair(foo1, d));
    
    cout << "\npair<Foo, int> pair(foo2, d)\n";
    pair<Foo, int> pair(foo2, d);

    cout << "\numap.insert(pair)\n";
    umap.insert(pair);

    cout << "\numap.emplace(foo3, d)\n";
    umap.emplace(foo3, d);
    
    cout << "\numap.emplace(11, d)\n";
    umap.emplace(11, d);

    cout << "\numap.insert({12, d})\n";
    umap.insert({12, d});

    cout.flush();
}

输出

Foo foo0, foo1, foo2, foo3;
Foo() with val:                0
Foo() with val:                1
Foo() with val:                2
Foo() with val:                3

umap.insert(pair<Foo, int>(foo0, d))
Foo(Foo &) with val:           4    created from:       0
Foo(Foo&&) moving:             4    and changing it to: 5
~Foo() destroying:             4

umap.insert(move(pair<Foo, int>(foo1, d)))
Foo(Foo &) with val:           6    created from:       1
Foo(Foo&&) moving:             6    and changing it to: 7
~Foo() destroying:             6

pair<Foo, int> pair(foo2, d)
Foo(Foo &) with val:           8    created from:       2

umap.insert(pair)
Foo(const Foo &) with val:     9    created from:       8

umap.emplace(foo3, d)
Foo(Foo &) with val:           10   created from:       3

umap.emplace(11, d)
Foo(int) with val:             11

umap.insert({12, d})
Foo(int) with val:             12
Foo(const Foo &) with val:     13   created from:       12
~Foo() destroying:             12

~Foo() destroying:             8
~Foo() destroying:             3
~Foo() destroying:             2
~Foo() destroying:             1
~Foo() destroying:             0
~Foo() destroying:             13
~Foo() destroying:             11
~Foo() destroying:             5
~Foo() destroying:             10
~Foo() destroying:             7
~Foo() destroying:             9

大局

insert()和emplace()的主要区别是:

Whereas using insert() almost† always requires the construction or pre-existence of some Foo object in main()'s scope (followed by a copy or move), if using emplace() then any call to a Foo constructor is done entirely internally in the unordered_map (i.e. inside the scope of the emplace() method's definition). The argument(s) for the key that you pass to emplace() are directly forwarded to a Foo constructor call within unordered_map::emplace()'s definition (optional additional details: where this newly constructed object is immediately incorporated into one of unordered_map's member variables so that no destructor is called when execution leaves emplace() and no move or copy constructors are called).

†上面的“almost always”中出现“almost”的原因是因为insert()的一次重载实际上相当于emplace()。如cppreference.com页面所述,重载模板<class P> pair<iterator, bool> insert(p&&value)(该页上insert()的重载(2))等效于emplace(forward<P>(value))。由于我们对差异感兴趣,我将忽略这个过载,不再提及这个特殊的技术细节。

逐步执行代码

现在我将详细介绍代码及其输出。

首先,请注意,unordered_map总是在内部将Foo对象(而不是Foo *s)存储为键,当unordered_map被销毁时,这些键都将被销毁。在这里,unordered_map的内部键是foos 13、11、5、10、7和9。

So technically, our unordered_map actually stores pair<const Foo, int> objects, which in turn store the Foo objects. But to understand the "big picture idea" of how emplace() differs from insert() (see highlighted box above), it's okay to temporarily imagine this pair object as being entirely passive. Once you understand this "big picture idea," it's important to then back up and understand how the use of this intermediary pair object by unordered_map introduces subtle, but important, technicalities.

insert()ing each of foo0, foo1, and foo2 required 2 calls to one of Foo's copy/move constructors and 2 calls to Foo's destructor (as I now describe): insert()ing each of foo0 and foo1 created a temporary object (foo4 and foo6, respectively) whose destructor was then immediately called after the insertion completed. In addition, the unordered_map's internal Foos (which are foos 5 and 7) also had their destructors called when the unordered_map was destroyed once execution reached the end of main(). To insert() foo2, we instead first explicitly created a non-temporary pair object (called pair), which called Foo's copy constructor on foo2 (creating foo8 as an internal member of pair). We then insert()ed this pair, which resulted in unordered_map calling the copy constructor again (on foo8) to create its own internal copy (foo9). As with foos 0 and 1, the end result was two destructor calls for this insert()ion with the only difference being that foo8's destructor was called only when we reached the end of main() rather than being called immediately after insert() finished. emplace()ing foo3 resulted in only 1 copy/move constructor call (creating foo10 internally in the unordered_map ) and only 1 call to Foo's destructor. The reason why calling umap.emplace(foo3, d) called Foo's non-const copy constructor is the following: Since we're using emplace(), the compiler knows that foo3 (a non-const Foo object) is meant to be an argument to some Foo constructor. In this case, the most fitting Foo constructor is the non-const copy constructor Foo(Foo& f2). This is why umap.emplace(foo3, d) called a copy constructor while umap.emplace(11, d) did not. For foo11, we directly passed the integer 11 to emplace(11, d) so that unordered_map would call the Foo(int) constructor while execution is within its emplace() method. Unlike in (2) and (3), we didn't even need some pre-exiting foo object to do this. Importantly, notice that only 1 call to a Foo constructor occurred (which created foo11). We then directly passed the integer 12 to insert({12, d}). Unlike with emplace(11, d) (which recall resulted in only 1 call to a Foo constructor), this call to insert({12, d}) resulted in two calls to Foo's constructor (creating foo12 and foo13).


后记

从这里往哪里走?

a.玩转上面的源代码,并研究网上找到的insert()(例如这里)和emplace()(例如这里)的文档。如果您正在使用eclipse或NetBeans等IDE,那么您可以很容易地让IDE告诉您正在调用insert()或emplace()的哪个重载(在eclipse中,只需将鼠标光标稳定地停留在函数调用上一秒钟)。下面是一些可以尝试的代码:

cout << "\numap.insert({{" << Foo::foo_counter << ", d}})\n";
umap.insert({{Foo::foo_counter, d}});
//but umap.emplace({{Foo::foo_counter, d}}); results in a compile error!

cout << "\numap.insert(pair<const Foo, int>({" << Foo::foo_counter << ", d}))\n";
umap.insert(pair<const Foo, int>({Foo::foo_counter, d}));
//The above uses Foo(int) and then Foo(const Foo &), as expected. but the
// below call uses Foo(int) and the move constructor Foo(Foo&&). 
//Do you see why?
cout << "\numap.insert(pair<Foo, int>({" << Foo::foo_counter << ", d}))\n";
umap.insert(pair<Foo, int>({Foo::foo_counter, d}));
//Not only that, but even more interesting is how the call below uses all 
// three of Foo(int) and the Foo(Foo&&) move and Foo(const Foo &) copy 
// constructors, despite the below call's only difference from the call above 
// being the additional { }.
cout << "\numap.insert({pair<Foo, int>({" << Foo::foo_counter << ", d})})\n";
umap.insert({pair<Foo, int>({Foo::foo_counter, d})});


//Pay close attention to the subtle difference in the effects of the next 
// two calls.
int cur_foo_counter = Foo::foo_counter;
cout << "\numap.insert({{cur_foo_counter, d}, {cur_foo_counter+1, d}}) where " 
  << "cur_foo_counter = " << cur_foo_counter << "\n";
umap.insert({{cur_foo_counter, d}, {cur_foo_counter+1, d}});

cout << "\numap.insert({{Foo::foo_counter, d}, {Foo::foo_counter+1, d}}) where "
  << "Foo::foo_counter = " << Foo::foo_counter << "\n";
umap.insert({{Foo::foo_counter, d}, {Foo::foo_counter+1, d}});


//umap.insert(initializer_list<pair<Foo, int>>({{Foo::foo_counter, d}}));
//The call below works fine, but the commented out line above gives a 
// compiler error. It's instructive to find out why. The two calls
// differ by a "const".
cout << "\numap.insert(initializer_list<pair<const Foo, int>>({{" << Foo::foo_counter << ", d}}))\n";
umap.insert(initializer_list<pair<const Foo, int>>({{Foo::foo_counter, d}}));

您很快就会看到unordered_map最终使用了对构造函数的哪个重载(参见参考资料),这对复制、移动、创建和/或销毁对象的数量以及发生这些情况的时间有重要影响。

b.看看当你使用其他容器类(例如set或unordered_multiset)而不是unordered_map时会发生什么。

c.现在使用Goo对象(只是一个重命名的Foo副本)而不是int作为unordered_map中的范围类型(即使用unordered_map<Foo, Goo>而不是unordered_map<Foo, int>),并查看调用了多少Goo构造函数以及调用了哪些Goo构造函数。(剧透:确实有影响,但不是很戏剧化。)

还有一个没有在其他答案中讨论的附加问题,它适用于std::map以及std::unordered_map, std::set和std::unordered_set:

Insert使用键对象,这意味着如果键已经存在于容器中,则不需要分配节点。 Emplace需要首先构造键,这通常需要在每次调用它时分配一个节点。

从这个角度来看,如果键已经存在于容器中,那么放置可能比插入效率低。(这对于多线程应用程序来说可能很重要,例如在线程本地字典中,需要同步分配。)

Live demo: https://godbolt.org/z/ornYcTqW9. Note that with libstdc++, emplace allocates 10 times, while insert only once. With libc++, there is only one allocation with emplace as well; it seems there is some optimization that copies/moves keys*. I got the same outcome with Microsoft STL, so it actually seems that there is some missing optimization in libstdc++. However, the whole problem may not be related only to standard containers. For instance, concurrent_unordered_map from Intel/oneAPI TBB behaves the same as libstdc++ in this regard.


*注意,此优化不能应用于密钥既不可复制又不可移动的情况。在这个现场演示中,即使使用了emplace和libc++: https://godbolt.org/z/1b6b331qf,我们也有10个分配。(当然,对于不可复制和不可移动的键,我们不能使用insert或try_emplace,因此没有其他选项。)

就功能或输出而言,它们都是相同的。

对于这两个大内存,对象放置是内存优化,不使用复制构造函数

简单详细的解释 https://medium.com/@sandywits/all-about-emplace-in-c-71fd15e06e44

放置:利用右值引用来使用您已经创建的实际对象。这意味着没有复制或移动构造函数被调用,这对大型对象很好!O (log (N))的时间。

Insert:具有标准左值引用和右值引用的重载,以及指向要插入元素列表的迭代器,以及关于元素所属位置的“提示”。使用“hint”迭代器可以将插入所花费的时间降低到常数时间,否则就是O(log(N))时间。

操作符[]:检查对象是否存在,如果存在,则修改对该对象的引用,否则使用提供的键和值对这两个对象调用make_pair,然后执行与insert函数相同的工作。这是O(log(N))次。

make_pair:只做一对。

There was no "need" for adding emplace to the standard. In c++11 I believe the && type of reference was added. This removed the necessity for move semantics, and allowed optimization of some specific type of memory management. In particular, the rvalue reference. The overloaded insert(value_type &&) operator does not take advantage of the in_place semantics, and is therefore much less efficient. While it provides the capability of dealing with rvalue references, it ignores their key purpose, which is in place construction of objects.