我正在使用多步操作,并希望合并结果。例如:

std::vector<int> A;
std::vector<int> B;
std::vector<int> AB;

我想让AB有A的内容和B的内容按照这个顺序。做这样的事情最有效的方法是什么?


如果你的向量排序*,检查set_union从<algorithm>。

set_union(A.begin(), A.end(), B.begin(), B.end(), AB.begin());

链接中有一个更详细的例子。


AB.reserve( A.size() + B.size() ); // preallocate memory
AB.insert( AB.end(), A.begin(), A.end() );
AB.insert( AB.end(), B.begin(), B.end() );

这正是成员函数std::vector::insert的作用

std::vector<int> AB = A;
AB.insert(AB.end(), B.begin(), B.end());

这取决于你是否真的需要物理上连接两个向量或者你想为了迭代而给出连接的外观。boost::join函数

http://www.boost.org/doc/libs/1_43_0/libs/range/doc/html/range/reference/utilities/join.html

会给你这个。

std::vector<int> v0;
v0.push_back(1);
v0.push_back(2);
v0.push_back(3);

std::vector<int> v1;
v1.push_back(4);
v1.push_back(5);
v1.push_back(6);
...

BOOST_FOREACH(const int & i, boost::join(v0, v1)){
    cout << i << endl;
}

应该给你

1
2
3
4
5
6

注意boost::join不会将两个向量复制到一个新的容器中 但生成一对覆盖的跨度的迭代器(range) 这两个容器。会有一些性能开销,但也许 而不是先将所有数据复制到一个新容器中。


基于Kiril V. Lyadvinsky的回答,我做了一个新的版本。这个片段使用模板和重载。有了它,你可以写vector3 = vector1 + vector2和vector4 += vector3。希望能有所帮助。

template <typename T>
std::vector<T> operator+(const std::vector<T> &A, const std::vector<T> &B)
{
    std::vector<T> AB;
    AB.reserve(A.size() + B.size());                // preallocate memory
    AB.insert(AB.end(), A.begin(), A.end());        // add A;
    AB.insert(AB.end(), B.begin(), B.end());        // add B;
    return AB;
}

template <typename T>
std::vector<T> &operator+=(std::vector<T> &A, const std::vector<T> &B)
{
    A.reserve(A.size() + B.size());                // preallocate memory without erase original data
    A.insert(A.end(), B.begin(), B.end());         // add B;
    return A;                                        // here A could be named AB
}

所有的解决方案都是正确的,但我发现写一个函数来实现它更容易。是这样的:

template <class T1, class T2>
void ContainerInsert(T1 t1, T2 t2)
{
    t1->insert(t1->end(), t2->begin(), t2->end());
}

这样你就可以避免像这样的临时放置:

ContainerInsert(vec, GetSomeVector());

还有一个没有提到的简单变体:

copy(A.begin(),A.end(),std::back_inserter(AB));
copy(B.begin(),B.end(),std::back_inserter(AB));

并使用合并算法:

#include <algorithm> #include <vector> #include <iterator> #include <iostream> #include <sstream> #include <string> template<template<typename, typename...> class Container, class T> std::string toString(const Container<T>& v) { std::stringstream ss; std::copy(v.begin(), v.end(), std::ostream_iterator<T>(ss, "")); return ss.str(); }; int main() { std::vector<int> A(10); std::vector<int> B(5); //zero filled std::vector<int> AB(15); std::for_each(A.begin(), A.end(), [](int& f)->void { f = rand() % 100; }); std::cout << "before merge: " << toString(A) << "\n"; std::cout << "before merge: " << toString(B) << "\n"; merge(B.begin(),B.end(), begin(A), end(A), AB.begin(), [](int&,int&)->bool {}); std::cout << "after merge: " << toString(AB) << "\n"; return 1; }


在Bradgonesurfing的答案中,很多时候并不需要连接两个向量(O(n)),而是像连接(O(1))一样处理它们。如果这是您的情况,可以在不需要Boost库的情况下完成。

诀窍是创建一个向量代理:一个包装器类,它操纵对两个向量的引用,在外部被视为单个连续的向量。

使用

std::vector<int> A{ 1, 2, 3, 4, 5};
std::vector<int> B{ 10, 20, 30 };

VecProxy<int> AB(A, B);  // ----> O(1). No copies performed.

for (size_t i = 0; i < AB.size(); ++i)
    std::cout << AB[i] << " ";  // 1 2 3 4 5 10 20 30

实现

template <class T>
class VecProxy {
private:
    std::vector<T>& v1, v2;
public:
    VecProxy(std::vector<T>& ref1, std::vector<T>& ref2) : v1(ref1), v2(ref2) {}
    const T& operator[](const size_t& i) const;
    const size_t size() const;
};

template <class T>
const T& VecProxy<T>::operator[](const size_t& i) const{
    return (i < v1.size()) ? v1[i] : v2[i - v1.size()];
};

template <class T>
const size_t VecProxy<T>::size() const { return v1.size() + v2.size(); };

主要好处

它是O(1)(常数时间)来创建它,并且使用最小的额外内存分配。

一些需要考虑的问题

You should only go for it if you really know what you're doing when dealing with references. This solution is intended for the specific purpose of the question made, for which it works pretty well. To employ it in any other context may lead to unexpected behavior if you are not sure on how references work. In this example, AB does not provide a non-const access operator ([ ]). Feel free to include it, but keep in mind: since AB contains references, to assign it values will also affect the original elements within A and/or B. Whether or not this is a desirable feature, it's an application-specific question one should carefully consider. Any changes directly made to either A or B (like assigning values, sorting, etc.) will also "modify" AB. This is not necessarily bad (actually, it can be very handy: AB does never need to be explicitly updated to keep itself synchronized to both A and B), but it's certainly a behavior one must be aware of. Important exception: to resize A and/or B to sth bigger may lead these to be reallocated in memory (for the need of contiguous space), and this would in turn invalidate AB. Because every access to an element is preceded by a test (namely, "i < v1.size()"), VecProxy access time, although constant, is also a bit slower than that of vectors. This approach can be generalized to n vectors. I haven't tried, but it shouldn't be a big deal.


对于这个用例,如果您预先知道每个线程产生的结果的数量,那么您可以预先分配AB并将std::span传递给每个线程。这样就不需要进行连接。例子:

std::vector<int> AB(total_number_of_results, 0);
std::size_t chunk_length = …;
std::size_t chunk2_start = chunk_length;
std::size_t chunk3_start = 2 * chunk_length; // If needed
…
// Pass these to the worker threads.
std::span<int> A(AB.data(), chunk_length);
std::span<int> B(AB.data() + chunk2_start, chunk_length);
…

我的回答是基于罗纳德·索萨先生最初的解决方案。除了他最初的解决方案,我还写了一个支持迭代器的向量代理!

对于那些不知道原始解决方案上下文的人来说,简短的描述:joined_vector模板类(即向量代理)将两个向量的两个引用作为构造函数参数,然后将它们视为一个连续的向量。我的实现还支持前向迭代器。

用法:

int main()
{
    std::vector<int> a1;
    std::vector<int> a2;

    joined_vector<std::vector<int>> jv(a1,a2);

    for (int i = 0; i < 5; i++)
        a1.push_back(i);
    for (int i = 5; i <=10; i++)
        a2.push_back(i);

    for (auto e : jv)
        std::cout << e<<"\n";
    for (int i = 0; i < jv.size(); i++)
        std::cout << jv[i] << "\n";
    return 0;
}

实现:

template<typename _vec>
class joined_vector
{
    _vec& m_vec1;
    _vec& m_vec2;

public:

    struct Iterator
    {
        typedef typename _vec::iterator::value_type type_value;
        typedef typename _vec::iterator::value_type* pointer;
        typedef typename _vec::iterator::value_type& reference;
        typedef std::forward_iterator_tag iterator_category;
        typedef std::ptrdiff_t difference_type;
        _vec* m_vec1;
        _vec* m_vec2;
        Iterator(pointer ptr) :m_ptr(ptr)
        {

        }
        Iterator operator++()
        {
            if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
                m_ptr = &(*m_vec2)[0];
            else
                ++m_ptr;
            return m_ptr;
        }
        Iterator operator++(int)
        {
            pointer curr = m_ptr;
            if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
                m_ptr = &(*m_vec2)[0];
            else
                ++m_ptr;
            return curr;
        }
        reference operator *()
        {
            return *m_ptr;
        }
        pointer operator ->()
        {
            return m_ptr;
        }

        friend bool operator == (Iterator& itr1, Iterator& itr2)
        {
            return itr1.m_ptr == itr2.m_ptr;
        }
        friend bool operator != (Iterator& itr1, Iterator& itr2)
        {
            return itr1.m_ptr != itr2.m_ptr;
        }
    private:
        pointer m_ptr;
    };

    joined_vector(_vec& vec1, _vec& vec2) :m_vec1(vec1), m_vec2(vec2)
    {

    }
    Iterator begin()
    {
        //checkes if m_vec1 is empty and gets the first elemet's address,
        //if it's empty then it get's the first address of the second vector m_vec2
        //if both of them are empty then nullptr is returned as the first pointer
        Iterator itr_beg((m_vec1.size() != 0) ? &m_vec1[0] : ((m_vec2.size() != 0) ? &m_vec2[0] : nullptr));
        itr_beg.m_vec1 = &m_vec1;
        itr_beg.m_vec2 = &m_vec2;
        return itr_beg;
    }
    Iterator end()
    {
        //check if m_vec2 is empty and get the last address of that vector
        //if the second vector is empty then the m_vec1's vector/the first vector's last element's address is taken
        //if both of them are empty then a null pointer is returned as the end pointer
        typename _vec::value_type* p = ((m_vec2.size() != 0) ? &m_vec2[m_vec2.size() - 1] : ((m_vec1.size()) != 0 ? &m_vec1[m_vec1.size() - 1] : nullptr));
        Iterator itr_beg(p != nullptr ? p + 1 : nullptr);
        itr_beg.m_vec1 = &m_vec1;
        itr_beg.m_vec2 = &m_vec2;
        return itr_beg;
    }
    typename _vec::value_type& operator [](int i)
    {
        if (i < m_vec1.size())
            return m_vec1[i];
        else
            return m_vec2[i - m_vec1.size()];
    }
    size_t size()
    {
        return m_vec1.size() + m_vec2.size();
    }

};


template<typename T>
vector<T> CombineVectors(vector<T> &&a, vector<T> &&b) {
  vector<T> ab = move(a);
  ab.insert(
      ab.end(), make_move_iterator(b.begin()), make_move_iterator(b.end()));
  return ab;
}


template<typename T>
vector<T> CombineVectors(const vector<T> &a, const vector<T> &b) {
  vector<T> ab = a;
  ab.insert(ab.end(), b.begin(), b.end());
  return ab;
}