算法构建块
我们从组装标准库的算法构建块开始:
#include <algorithm> // min_element, iter_swap,
// upper_bound, rotate,
// partition,
// inplace_merge,
// make_heap, sort_heap, push_heap, pop_heap,
// is_heap, is_sorted
#include <cassert> // assert
#include <functional> // less
#include <iterator> // distance, begin, end, next
the iterator tools such as non-member std::begin() / std::end() as well as with std::next() are only available as of C++11 and beyond. For C++98, one needs to write these himself. There are substitutes from Boost.Range in boost::begin() / boost::end(), and from Boost.Utility in boost::next().
the std::is_sorted algorithm is only available for C++11 and beyond. For C++98, this can be implemented in terms of std::adjacent_find and a hand-written function object. Boost.Algorithm also provides a boost::algorithm::is_sorted as a substitute.
the std::is_heap algorithm is only available for C++11 and beyond.
语法好东西
c++ 14提供了形式为std::less<>的透明比较器,它们对参数起多态作用。这避免了必须提供迭代器的类型。这可以与c++ 11的默认函数模板参数结合使用,为以<作为比较的排序算法和具有用户定义比较函数对象的排序算法创建单个重载。
template<class It, class Compare = std::less<>>
void xxx_sort(It first, It last, Compare cmp = Compare{});
在c++ 11中,可以定义一个可重用的模板别名来提取迭代器的值类型,这给排序算法的签名增加了一些小混乱:
template<class It>
using value_type_t = typename std::iterator_traits<It>::value_type;
template<class It, class Compare = std::less<value_type_t<It>>>
void xxx_sort(It first, It last, Compare cmp = Compare{});
在c++ 98中,需要编写两个重载,并使用详细的typename xxx<yyy>::type语法
template<class It, class Compare>
void xxx_sort(It first, It last, Compare cmp); // general implementation
template<class It>
void xxx_sort(It first, It last)
{
xxx_sort(first, last, std::less<typename std::iterator_traits<It>::value_type>());
}
Another syntactical nicety is that C++14 facilitates wrapping user-defined comparators through polymorphic lambdas (with auto parameters that are deduced like function template arguments).
C++11 only has monomorphic lambdas, that require the use of the above template alias value_type_t.
In C++98, one either needs to write a standalone function object or resort to the verbose std::bind1st / std::bind2nd / std::not1 type of syntax.
Boost.Bind improves this with boost::bind and _1 / _2 placeholder syntax.
C++11 and beyond also have std::find_if_not, whereas C++98 needs std::find_if with a std::not1 around a function object.
c++风格
目前还没有普遍接受的c++ 14风格。不管是好是坏,我一直在密切关注Scott Meyers的《Effective Modern c++》草案和Herb Sutter的《GotW》修订版。我推荐以下风格:
Herb Sutter's "Almost Always Auto" and Scott Meyers's "Prefer auto to specific type declarations" recommendation, for which the brevity is unsurpassed, although its clarity is sometimes disputed.
Scott Meyers's "Distinguish () and {} when creating objects" and consistently choose braced-initialization {} instead of the good old parenthesized initialization () (in order to side-step all most-vexing-parse issues in generic code).
Scott Meyers's "Prefer alias declarations to typedefs". For templates this is a must anyway, and using it everywhere instead of typedef saves time and adds consistency.
I use a for (auto it = first; it != last; ++it) pattern in some places, in order to allow for loop invariant checking for already sorted sub-ranges. In production code, the use of while (first != last) and a ++first somewhere inside the loop might be slightly better.
选择排序
选择排序不以任何方式适应数据,因此它的运行时间总是O(N²)。但是,选择排序具有使交换次数最小化的特性。在交换项代价较高的应用中,选择排序可能是最佳算法。
要使用标准库实现它,重复使用std::min_element查找剩余的最小元素,并使用iter_swap将其交换到适当的位置:
template<class FwdIt, class Compare = std::less<>>
void selection_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
for (auto it = first; it != last; ++it) {
auto const selection = std::min_element(it, last, cmp);
std::iter_swap(selection, it);
assert(std::is_sorted(first, std::next(it), cmp));
}
}
注意,selection_sort将已经处理过的范围[first, it)排序为它的循环不变量。与std::sort的随机访问迭代器相比,最小的需求是前向迭代器。
细节省略:
选择排序可以通过早期测试进行优化,如果(std::distance(first, last) <= 1)返回;(或者对于正向/双向迭代器:if (first == last || std::next(first) == last)返回;)。
对于双向迭代器,上面的测试可以与间隔[first, std::prev(last))上的循环结合,因为最后一个元素保证是最小的剩余元素,并且不需要交换。
插入排序
虽然它是最坏情况时间为O(N²)的基本排序算法之一,但无论是在数据接近排序时(因为它是自适应的),还是在问题规模较小时(因为它开销低),插入排序都是首选算法。由于这些原因,也因为它是稳定的,插入排序通常被用作更高开销的分治排序算法(如归并排序或快速排序)的递归基本情况(当问题规模较小时)。
要用标准库实现insertion_sort,重复使用std::upper_bound来查找当前元素需要移动的位置,并使用std::rotate将输入范围内的剩余元素向上移动:
template<class FwdIt, class Compare = std::less<>>
void insertion_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
for (auto it = first; it != last; ++it) {
auto const insertion = std::upper_bound(first, it, *it, cmp);
std::rotate(insertion, it, std::next(it));
assert(std::is_sorted(first, std::next(it), cmp));
}
}
注意insertion_sort将已经处理过的范围[first, it)排序为它的循环不变量。插入排序也适用于前向迭代器。
细节省略:
insertion sort can be optimized with an early test if (std::distance(first, last) <= 1) return; (or for forward / bidirectional iterators: if (first == last || std::next(first) == last) return;) and a loop over the interval [std::next(first), last), because the first element is guaranteed to be in place and doesn't require a rotate.
for bidirectional iterators, the binary search to find the insertion point can be replaced with a reverse linear search using the Standard Library's std::find_if_not algorithm.
以下片段的四个现场示例(c++ 14, c++ 11, c++ 98和Boost, c++ 98):
using RevIt = std::reverse_iterator<BiDirIt>;
auto const insertion = std::find_if_not(RevIt(it), RevIt(first),
[=](auto const& elem){ return cmp(*it, elem); }
).base();
对于随机输入,这给出了O(N²)个比较,但对于几乎排序的输入,这改进为O(N)个比较。二分查找总是使用O(N log N)个比较。
对于较小的输入范围,线性搜索的更好的内存位置(缓存,预取)也可能主导二进制搜索(当然,应该测试这一点)。
快速排序
当仔细实现时,快速排序是健壮的,具有O(N log N)的预期复杂性,但有O(N²)的最坏情况复杂性,可以由对手选择的输入数据触发。当不需要稳定排序时,快速排序是一种优秀的通用排序。
Even for the simplest versions, quick sort is quite a bit more complicated to implement using the Standard Library than the other classic sorting algorithms. The approach below uses a few iterator utilities to locate the middle element of the input range [first, last) as the pivot, then use two calls to std::partition (which are O(N)) to three-way partition the input range into segments of elements that are smaller than, equal to, and larger than the selected pivot, respectively. Finally the two outer segments with elements smaller than and larger than the pivot are recursively sorted:
template<class FwdIt, class Compare = std::less<>>
void quick_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
auto const N = std::distance(first, last);
if (N <= 1) return;
auto const pivot = *std::next(first, N / 2);
auto const middle1 = std::partition(first, last, [=](auto const& elem){
return cmp(elem, pivot);
});
auto const middle2 = std::partition(middle1, last, [=](auto const& elem){
return !cmp(pivot, elem);
});
quick_sort(first, middle1, cmp); // assert(std::is_sorted(first, middle1, cmp));
quick_sort(middle2, last, cmp); // assert(std::is_sorted(middle2, last, cmp));
}
然而,快速排序是相当棘手的正确和有效的,因为上面的每个步骤都必须仔细检查和优化生产级代码。特别地,对于O(N log N)复杂度,枢轴必须导致输入数据的平衡分区,这对于O(1)枢轴通常不能保证,但如果将枢轴设置为输入范围的O(N)中位数,则可以保证。
细节省略:
the above implementation is particularly vulnerable to special inputs, e.g. it has O(N^2) complexity for the "organ pipe" input 1, 2, 3, ..., N/2, ... 3, 2, 1 (because the middle is always larger than all other elements).
median-of-3 pivot selection from randomly chosen elements from the input range guards against almost sorted inputs for which the complexity would otherwise deteriorate to O(N^2).
3-way partitioning (separating elements smaller than, equal to and larger than the pivot) as shown by the two calls to std::partition is not the most efficient O(N) algorithm to achieve this result.
for random access iterators, a guaranteed O(N log N) complexity can be achieved through median pivot selection using std::nth_element(first, middle, last), followed by recursive calls to quick_sort(first, middle, cmp) and quick_sort(middle, last, cmp).
this guarantee comes at a cost, however, because the constant factor of the O(N) complexity of std::nth_element can be more expensive than that of the O(1) complexity of a median-of-3 pivot followed by an O(N) call to std::partition (which is a cache-friendly single forward pass over the data).
去排序
如果不考虑使用O(N)额外的空间,那么归并排序是一个很好的选择:它是唯一稳定的O(N log N)排序算法。
使用标准算法实现它很简单:使用一些迭代器实用程序来定位输入范围的中间[第一个,最后一个),并用std::inplace_merge组合两个递归排序的段:
template<class BiDirIt, class Compare = std::less<>>
void merge_sort(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
auto const N = std::distance(first, last);
if (N <= 1) return;
auto const middle = std::next(first, N / 2);
merge_sort(first, middle, cmp); // assert(std::is_sorted(first, middle, cmp));
merge_sort(middle, last, cmp); // assert(std::is_sorted(middle, last, cmp));
std::inplace_merge(first, middle, last, cmp); // assert(std::is_sorted(first, last, cmp));
}
归并排序需要双向迭代器,瓶颈是std::inplace_merge。注意,在对链表排序时,归并排序只需要O(log N)额外的空间(用于递归)。后一种算法是通过标准库中的std::list<T>::sort实现的。
堆排序
堆排序实现简单,执行O(N log N)就地排序,但不稳定。
第一个循环,O(N)"heapify"阶段,将数组按堆顺序排列。第二个循环,O(N log N)“排序”阶段,重复提取最大值并恢复堆顺序。标准库使这一点非常简单:
template<class RandomIt, class Compare = std::less<>>
void heap_sort(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
lib::make_heap(first, last, cmp); // assert(std::is_heap(first, last, cmp));
lib::sort_heap(first, last, cmp); // assert(std::is_sorted(first, last, cmp));
}
如果你认为使用std::make_heap和std::sort_heap是“欺骗”,你可以再深入一层,分别用std::push_heap和std::pop_heap来编写这些函数:
namespace lib {
// NOTE: is O(N log N), not O(N) as std::make_heap
template<class RandomIt, class Compare = std::less<>>
void make_heap(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
for (auto it = first; it != last;) {
std::push_heap(first, ++it, cmp);
assert(std::is_heap(first, it, cmp));
}
}
template<class RandomIt, class Compare = std::less<>>
void sort_heap(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
for (auto it = last; it != first;) {
std::pop_heap(first, it--, cmp);
assert(std::is_heap(first, it, cmp));
}
}
} // namespace lib
标准库将push_heap和pop_heap的复杂度指定为O(log N)。但是请注意,在[first, last)范围内的外层循环导致make_heap的复杂度为O(N log N),而std::make_heap的复杂度仅为O(N)。对于heap_sort的总体复杂度O(N log N)来说,这无关紧要。
详细信息省略:O(N) make_heap的实现
测试
这里有四个现场示例(c++ 14, c++ 11, c++ 98和Boost, c++ 98)在各种输入上测试所有五种算法(并不意味着详尽或严格)。只需注意LOC的巨大差异:c++ 11/ c++ 14需要大约130个LOC, c++ 98和Boost需要190个LOC(+50%),而c++ 98需要270多个LOC(+100%)。