c++排序和跟踪索引

使用c++(最好是标准库)，我想对一个样本序列进行升序排序，但我也想记住新样本的原始索引。

例如，我有一个集合，或向量，或样本a的矩阵:[5,2,1,4,3]。我想把它们排序为B:[1,2,3,4,5]，但我也想记住这些值的原始索引，所以我可以得到另一个集合，它将是: C:[2,1,4,3,0] -这对应于'B'中每个元素的索引，在原始'A'中。

例如，在Matlab中，你可以这样做:

 [a,b]=sort([5, 8, 7])
 a = 5 7 8
 b = 1 3 2

有谁能想到一个好办法吗?

当前回答

我遇到了这个问题，并发现直接对迭代器排序是一种对值排序并跟踪下标的方法;不需要定义一个额外的(value, index)对容器，这在值是大对象时很有用;迭代器提供了对值和索引的访问:

/*
 * a function object that allows to compare
 * the iterators by the value they point to
 */
template < class RAIter, class Compare >
class IterSortComp
{
    public:
        IterSortComp ( Compare comp ): m_comp ( comp ) { }
        inline bool operator( ) ( const RAIter & i, const RAIter & j ) const
        {
            return m_comp ( * i, * j );
        }
    private:
        const Compare m_comp;
};

template <class INIter, class RAIter, class Compare>
void itersort ( INIter first, INIter last, std::vector < RAIter > & idx, Compare comp )
{ 
    idx.resize ( std::distance ( first, last ) );
    for ( typename std::vector < RAIter >::iterator j = idx.begin( ); first != last; ++ j, ++ first )
        * j = first;

    std::sort ( idx.begin( ), idx.end( ), IterSortComp< RAIter, Compare > ( comp ) );
}

关于使用示例:

std::vector < int > A ( n );

// populate A with some random values
std::generate ( A.begin( ), A.end( ), rand );

std::vector < std::vector < int >::const_iterator > idx;
itersort ( A.begin( ), A.end( ), idx, std::less < int > ( ) );

现在，例如，排序向量中第5小的元素的值为**idx[5]，它在原始向量中的下标为distance(A.begin()， *idx[5])或简单地称为*idx[5] - A.begin()。

2013-09-08 12:26:50

其他回答

向量中的项是唯一的吗?如果是这样，复制向量，排序一个副本与STL排序，然后你可以找到每个项目在原始向量的索引。

如果向量应该处理重复的项，我认为你最好实现自己的排序例程。

2009-10-16 11:35:06

我写了索引排序的通用版本。

template <class RAIter, class Compare>
void argsort(RAIter iterBegin, RAIter iterEnd, Compare comp, 
    std::vector<size_t>& indexes) {

    std::vector< std::pair<size_t,RAIter> > pv ;
    pv.reserve(iterEnd - iterBegin) ;

    RAIter iter ;
    size_t k ;
    for (iter = iterBegin, k = 0 ; iter != iterEnd ; iter++, k++) {
        pv.push_back( std::pair<int,RAIter>(k,iter) ) ;
    }

    std::sort(pv.begin(), pv.end(), 
        [&comp](const std::pair<size_t,RAIter>& a, const std::pair<size_t,RAIter>& b) -> bool 
        { return comp(*a.second, *b.second) ; }) ;

    indexes.resize(pv.size()) ;
    std::transform(pv.begin(), pv.end(), indexes.begin(), 
        [](const std::pair<size_t,RAIter>& a) -> size_t { return a.first ; }) ;
}

使用方法与std::sort相同，除了一个索引容器接收排序的索引。测试:

int a[] = { 3, 1, 0, 4 } ;
std::vector<size_t> indexes ;
argsort(a, a + sizeof(a) / sizeof(a[0]), std::less<int>(), indexes) ;
for (size_t i : indexes) printf("%d\n", int(i)) ;

你应该得到2 10 0 3。对于不支持c++0x的编译器，将lamba表达式替换为类模板:

template <class RAIter, class Compare> 
class PairComp {
public:
  Compare comp ;
  PairComp(Compare comp_) : comp(comp_) {}
  bool operator() (const std::pair<size_t,RAIter>& a, 
    const std::pair<size_t,RAIter>& b) const { return comp(*a.second, *b.second) ; }        
} ;

然后重写std::sort as

std::sort(pv.begin(), pv.end(), PairComp(comp)()) ;

2012-07-30 04:21:11

如果可能的话，可以使用find函数构建位置数组，然后对数组排序。

或者你可以使用一个映射，其中键是元素，值是它在即将到来的数组(a, B和C)中的位置列表

这取决于以后对这些数组的使用。

2009-10-16 11:22:18

使用c++ 11 lambdas:

#include <iostream>
#include <vector>
#include <numeric>      // std::iota
#include <algorithm>    // std::sort, std::stable_sort

using namespace std;

template <typename T>
vector<size_t> sort_indexes(const vector<T> &v) {

  // initialize original index locations
  vector<size_t> idx(v.size());
  iota(idx.begin(), idx.end(), 0);

  // sort indexes based on comparing values in v
  // using std::stable_sort instead of std::sort
  // to avoid unnecessary index re-orderings
  // when v contains elements of equal values 
  stable_sort(idx.begin(), idx.end(),
       [&v](size_t i1, size_t i2) {return v[i1] < v[i2];});

  return idx;
}

现在您可以在迭代中使用返回的索引向量，例如

for (auto i: sort_indexes(v)) {
  cout << v[i] << endl;
}

您还可以选择提供原始索引向量、排序函数、比较器，或者使用额外的向量在sort_indexes函数中自动重新排序v。

2012-09-13 04:10:41

我的解法使用了余数法。我们可以把需要排序的值放在上面2个字节，而把元素的下标放在下面2个字节:

int myints[] = {32,71,12,45,26,80,53,33};

for (int i = 0; i < 8; i++)
   myints[i] = myints[i]*(1 << 16) + i;

然后像往常一样对数组myint进行排序:

std::vector<int> myvector(myints, myints+8);
sort(myvector.begin(), myvector.begin()+8, std::less<int>());

在此之后，您可以通过渣滓访问元素的指数。下面的代码输出按升序排序的值的索引:

for (std::vector<int>::iterator it = myvector.begin(); it != myvector.end(); ++it)
   std::cout << ' ' << (*it)%(1 << 16);

当然，这种技术只适用于原始数组myint中相对较小的值(即可以装入int的前2个字节的值)。但是它还有一个额外的好处，可以区分相同的myint值:它们的下标将按正确的顺序打印。

2016-02-05 15:43:31

c++排序和跟踪索引

推荐文章

最新文章

标签