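What is the most efficient way to erase duplicates and sort a vector?

I need to take a C++ vector with potentially a lot of elements, erase the duplicates, and sort it.

I currently have the code below, but it does not work.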

vec.erase(
      std::unique(vec.begin(), vec.end()),
      vec.end());
std::sort(vec.begin(), vec.end());

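How can I do this correctly?

Also, is it faster to erase the duplicates first (as in the code above) or to sort first? If I sort first, is the vector guaranteed to remain sorted after std::unique runs?

Or is there another (perhaps more efficient) way to do all of this?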

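Current answer

Assuming a is a vector, use

a.erase(std::unique(a.begin(), a.end()), a.end());

This runs in O(n) time, but it removes only adjacent duplicates, so the vector must be sorted first if every duplicate is to be removed.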
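Because std::unique compares only neighbouring elements, a complete version would sort before erasing. The following is a minimal sketch, assuming a vector of int; the helper name sort_and_dedupe is made up for illustration and is not part of the answer.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: sort the vector, then erase adjacent duplicates.
void sort_and_dedupe(std::vector<int>& v) {
    // O(n log n): sorting places equal values next to each other.
    std::sort(v.begin(), v.end());
    // O(n): std::unique shifts the distinct values to the front and returns
    // an iterator to the new logical end; erase drops the leftover tail.
    v.erase(std::unique(v.begin(), v.end()), v.end());
}
```

The overall cost is dominated by the sort, so the combined operation is O(n log n) rather than O(n).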

2019-05-06 12:14:24

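Other answers

std::unique only removes consecutive duplicate elements (which is what allows it to run in linear time), so you should sort first. The vector remains sorted after the call to std::unique.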
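To see the "consecutive only" behaviour concretely, here is a small sketch, assuming int elements. The function name drop_adjacent_duplicates is hypothetical; it applies the erase/unique idiom to a copy without sorting.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: applies std::unique + erase to a copy, with no sort.
// Only runs of equal neighbouring values are collapsed.
std::vector<int> drop_adjacent_duplicates(std::vector<int> v) {
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}
```

Called on the unsorted input {1, 2, 1, 1, 3, 2}, this yields {1, 2, 1, 3, 2}: the adjacent pair of 1s is collapsed, but the earlier 1 and the repeated 2 survive. Called on the sorted input {1, 1, 2, 2, 3}, it yields {1, 2, 3}, which is still sorted because std::unique preserves the relative order of the elements it keeps.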

2009-06-25 00:33:03

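#include <set>
#include <vector>

// A std::set stores each value once and keeps its elements in sorted order.
std::set<int> s(v.cbegin(), v.cend());
// Replace the vector's contents with the set's (copying into a cleared
// vector through v.cbegin() would not compile and has no room to write to).
v.assign(s.cbegin(), s.cend());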

2013-11-11 07:13:13

If you are looking for performance and are using std::vector, I recommend the approach given in this documentation link.

std::vector<int> myvector{10,20,20,20,30,30,20,20,10};             // 10 20 20 20 30 30 20 20 10
std::sort(myvector.begin(), myvector.end() );
auto it = std::unique (myvector.begin(), myvector.end());          // 10 20 30 ?  ?  ?  ?  ?  ?
                                                                   //          ^
myvector.resize( std::distance(myvector.begin(),it) ); // 10 20 30

2017-12-15 21:36:55

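If your class is easily convertible to an int and you have some memory to spare, the duplicates can be removed without sorting first, which is much faster: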

#include <vector>
#include <stdlib.h>   // rand
#include <time.h>     // clock
#include <algorithm>
#include <iostream>   // std::cout
int main (int argc, char* argv []) {
  //vector init
  std::vector<int> v (1000000, 0);
  std::for_each (v.begin (), v.end (), [] (int& s) {s = rand () %1000;});
  std::vector<int> v1 (v);
  int beg (0), end (0), duration (0);
  beg = clock ();
  {
    std::sort (v.begin (), v.end ());
    auto i (v.begin ());
    i = std::unique (v.begin (), v.end ());
    if (i != v.end ()) v.erase (i, v.end ());
  }
  end = clock ();
  duration = (int) (end - beg);
  std::cout << "\tduration sort + unique == " << duration << std::endl;

  int n (0);
  duration = 0;
  beg = clock ();
  std::for_each (v1.begin (), v1.end (), [&n] (const int& s) {if (s >= n) n = s+1;});
  std::vector<int> tab (n, 0);
  {
    auto i (v1.begin ());
    std::for_each (v1.begin (), v1.end (), [&i, &tab] (const int& s) {
      if (!tab [s]) {
        *i++ = s;
        ++tab [s];
      }
    });
    std::sort (v1.begin (), i);
    v1.erase (i, v1.end ());
  }
  end = clock ();
  duration = (int) (end - beg);
  std::cout << "\tduration unique + sort == " << duration << std::endl;
  if (v == v1) {
    std::cout << "and results are same" << std::endl;
  }
  else {
    std::cout << "but result differs" << std::endl;
  }  
}

Typical result:

duration sort + unique == 38985
duration unique + sort == 2500
and results are same
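The second half of the benchmark can be pulled out into a reusable function. The sketch below is one way to do that, under the same assumptions as the answer: every value is a non-negative int, and the largest value is small enough that a table of max + 1 flags fits in memory. The name dedupe_then_sort is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: remove duplicates with a "seen" table, then sort
// only the surviving distinct values.
// Assumes all values are >= 0 and the maximum value is reasonably small.
void dedupe_then_sort(std::vector<int>& v) {
    if (v.empty()) return;
    const int max_value = *std::max_element(v.begin(), v.end());
    std::vector<char> seen(static_cast<std::size_t>(max_value) + 1, 0);

    // Compact first occurrences to the front of the vector. The write
    // position never overtakes the read position, so this is safe in place.
    auto out = v.begin();
    for (int x : v) {
        const auto idx = static_cast<std::size_t>(x);
        if (!seen[idx]) {
            seen[idx] = 1;
            *out++ = x;
        }
    }
    v.erase(out, v.end());

    // Sorting now touches only the distinct values.
    std::sort(v.begin(), v.end());
}
```

The gain comes from sorting fewer elements: in the benchmark there are one million inputs but at most 1000 distinct values, so the sort is cheap. When most values are distinct, or the value range is large, the table costs more than it saves.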

2021-04-23 20:19:56

With the Ranges v3 library, you can simply use

action::unique(vec);

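Note that it actually removes the duplicate elements, rather than just moving them.

Unfortunately, unlike other parts of the ranges library, actions were not standardized in C++20, so you still have to use the original library even in C++20.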

2019-07-10 00:11:54
