我在一次面试中被问到这个问题。它们都是O(nlogn),但大多数人使用快速排序而不是归并排序。为什么呢?


当前回答

这是采访中经常被问到的一个问题,尽管归并排序在最坏情况下性能更好,但快速排序被认为比归并排序更好,特别是对于大输入。以下是快速排序更好的原因:

1-辅助空间:快速排序是一种就地排序算法。就地排序意味着执行排序不需要额外的存储空间。另一方面,归并排序需要一个临时数组来归并已排序的数组,因此它并不到位。

2-最坏情况:快速排序O(n^2)的最坏情况可以通过使用随机化快速排序来避免。通过选择正确的枢轴,可以很容易地避免这种情况。通过选择合适的枢轴元来获得平均情况下的行为,从而提高了算法的性能,达到了与归并排序一样的效率。

3-引用的局部性:快速排序特别展示了良好的缓存局部性,这使得它在许多情况下比归并排序更快,比如在虚拟内存环境中。

4-尾递归:快速排序是尾递归,而归并排序不是。尾递归函数是一种函数,其中递归调用是函数执行的最后一件事。尾递归函数被认为比非尾递归函数更好,因为尾递归可以被编译器优化。

其他回答

快速排序有O(n2)最差情况运行时和O(nlogn)平均情况运行时。然而,在许多情况下,它优于归并排序,因为许多因素影响算法的运行时,并且,当把它们放在一起时,快速排序胜出。

In particular, the often-quoted runtime of sorting algorithms refers to the number of comparisons or the number of swaps necessary to perform to sort the data. This is indeed a good measure of performance, especially since it’s independent of the underlying hardware design. However, other things – such as locality of reference (i.e. do we read lots of elements which are probably in cache?) – also play an important role on current hardware. Quicksort in particular requires little additional space and exhibits good cache locality, and this makes it faster than merge sort in many cases.

此外,通过使用适当的枢轴选择,几乎完全避免快速排序的最坏情况运行时间O(n2)是非常容易的——比如随机选择它(这是一个很好的策略)。

在实践中,许多现代的快速排序实现(特别是libstdc++的std::sort)实际上是introsort,其理论上的最差情况是O(nlogn),与归并排序相同。它通过限制递归深度,并在超过logn时切换到不同的算法(heapsort)来实现这一点。

当我试验这两种排序算法时,通过计算递归调用的次数, 快速排序始终比归并排序具有更少的递归调用。 这是因为快速排序有枢轴,而在下一个递归调用中不包括枢轴。这样快速排序可以比归并排序更快地达到递归基本情况。

维基百科的解释是:

通常,快速排序在实践中比其他Θ(nlogn)算法要快得多,因为它的内部循环可以在大多数架构上有效地实现,并且在大多数现实数据中,可以做出设计选择,使需要二次时间的概率最小化。

快速排序

Mergesort

我认为归并排序(即Ω(n))所需要的存储量也存在快速排序实现所不具备的问题。在最坏的情况下,它们的算法时间是相同的,但归并排序需要更多的存储空间。

One of the reason is more philosophical. Quicksort is Top->Down philosophy. With n elements to sort, there are n! possibilities. With 2 partitions of m & n-m which are mutually exclusive, the number of possibilities go down in several orders of magnitude. m! * (n-m)! is smaller by several orders than n! alone. imagine 5! vs 3! *2!. 5! has 10 times more possibilities than 2 partitions of 2 & 3 each . and extrapolate to 1 million factorial vs 900K!*100K! vs. So instead of worrying about establishing any order within a range or a partition,just establish order at a broader level in partitions and reduce the possibilities within a partition. Any order established earlier within a range will be disturbed later if the partitions themselves are not mutually exclusive.

任何自下而上的排序方法,如归并排序或堆排序,就像工人或雇员的方法一样,人们很早就开始在微观层面进行比较。但是,一旦在它们之间发现了一个元素,这个顺序就必然会丢失。这些方法非常稳定和可预测,但要做一定量的额外工作。

Quick Sort is like Managerial approach where one is not initially concerned about any order , only about meeting a broad criterion with No regard for order. Then the partitions are narrowed until you get a sorted set. The real challenge in Quicksort is in finding a partition or criterion in the dark when you know nothing about the elements to sort. That is why we either need to spend some effort to find a median value or pick 1 at random or some arbitrary "Managerial" approach . To find a perfect median can take significant amount of effort and leads to a stupid bottom up approach again. So Quicksort says just a pick a random pivot and hope that it will be somewhere in the middle or do some work to find median of 3 , 5 or something more to find a better median but do not plan to be perfect & don't waste any time in initially ordering. That seems to do well if you are lucky or sometimes degrades to n^2 when you don't get a median but just take a chance. Any way data is random. right. So I agree more with the top ->down logical approach of quicksort & it turns out that the chance it takes about pivot selection & comparisons that it saves earlier seems to work better more times than any meticulous & thorough stable bottom ->up approach like merge sort. But

与归并排序不同,快速排序不使用辅助空间。而归并排序使用辅助空间O(n)。 归并排序的最坏情况时间复杂度是O(nlogn)而快速排序的最坏情况复杂度是O(n²)这发生在数组已经排序的时候。