我相信有一种方法可以找到长度为n的O(n)无序数组中第k大的元素。也可能是期望O(n)之类的。我们该怎么做呢?
当前回答
还有Wirth的选择算法,它的实现比QuickSelect简单。Wirth的选择算法比QuickSelect慢,但经过一些改进,它变得更快。
更详细地说。使用Vladimir Zabrodsky的MODIFIND优化和3中位数的枢轴选择,并注意算法划分部分的最后步骤,我提出了以下算法(想象一下,命名为“LefSelect”):
#define F_SWAP(a,b) { float temp=(a);(a)=(b);(b)=temp; }
# Note: The code needs more than 2 elements to work
float lefselect(float a[], const int n, const int k) {
int l=0, m = n-1, i=l, j=m;
float x;
while (l<m) {
if( a[k] < a[i] ) F_SWAP(a[i],a[k]);
if( a[j] < a[i] ) F_SWAP(a[i],a[j]);
if( a[j] < a[k] ) F_SWAP(a[k],a[j]);
x=a[k];
while (j>k & i<k) {
do i++; while (a[i]<x);
do j--; while (a[j]>x);
F_SWAP(a[i],a[j]);
}
i++; j--;
if (j<k) {
while (a[i]<x) i++;
l=i; j=m;
}
if (k<i) {
while (x<a[j]) j--;
m=j; i=l;
}
}
return a[k];
}
在我这里做的基准测试中,LefSelect比QuickSelect快20-30%。
其他回答
对于k非常小的值(即k << n),我们可以在~O(n)时间内完成。否则,如果k与n比较,我们得到O(nlogn)
虽然不是很确定O(n)复杂度,但肯定在O(n)和nLog(n)之间。也肯定更接近于O(n)而不是nLog(n)函数是用Java编写的
public int quickSelect(ArrayList<Integer>list, int nthSmallest){
//Choose random number in range of 0 to array length
Random random = new Random();
//This will give random number which is not greater than length - 1
int pivotIndex = random.nextInt(list.size() - 1);
int pivot = list.get(pivotIndex);
ArrayList<Integer> smallerNumberList = new ArrayList<Integer>();
ArrayList<Integer> greaterNumberList = new ArrayList<Integer>();
//Split list into two.
//Value smaller than pivot should go to smallerNumberList
//Value greater than pivot should go to greaterNumberList
//Do nothing for value which is equal to pivot
for(int i=0; i<list.size(); i++){
if(list.get(i)<pivot){
smallerNumberList.add(list.get(i));
}
else if(list.get(i)>pivot){
greaterNumberList.add(list.get(i));
}
else{
//Do nothing
}
}
//If smallerNumberList size is greater than nthSmallest value, nthSmallest number must be in this list
if(nthSmallest < smallerNumberList.size()){
return quickSelect(smallerNumberList, nthSmallest);
}
//If nthSmallest is greater than [ list.size() - greaterNumberList.size() ], nthSmallest number must be in this list
//The step is bit tricky. If confusing, please see the above loop once again for clarification.
else if(nthSmallest > (list.size() - greaterNumberList.size())){
//nthSmallest will have to be changed here. [ list.size() - greaterNumberList.size() ] elements are already in
//smallerNumberList
nthSmallest = nthSmallest - (list.size() - greaterNumberList.size());
return quickSelect(greaterNumberList,nthSmallest);
}
else{
return pivot;
}
}
function nthMax(arr, nth = 1, maxNumber = Infinity) {
let large = -Infinity;
for(e of arr) {
if(e > large && e < maxNumber ) {
large = e;
} else if (maxNumber == large) {
nth++;
}
}
return nth==0 ? maxNumber: nthMax(arr, nth-1, large);
}
let array = [11,12,12,34,23,34];
let secondlargest = nthMax(array, 1);
console.log("Number:", secondlargest);
Haskell的解决方案:
kthElem index list = sort list !! index
withShape ~[] [] = []
withShape ~(x:xs) (y:ys) = x : withShape xs ys
sort [] = []
sort (x:xs) = (sort ls `withShape` ls) ++ [x] ++ (sort rs `withShape` rs)
where
ls = filter (< x)
rs = filter (>= x)
这通过使用withShape方法来实现中值解的中值,从而发现分区的大小,而无需实际计算分区大小。
这叫做求k阶统计量。有一个非常简单的随机算法(叫做quickselect),平均时间为O(n),最坏情况时间为O(n²),还有一个相当复杂的非随机算法(叫做introselect),最坏情况时间为O(n)。维基百科上有一些信息,但不是很好。
你需要的一切都在这些幻灯片里。只需提取O(n)最坏情况算法(introselect)的基本算法:
Select(A,n,i):
Divide input into ⌈n/5⌉ groups of size 5.
/* Partition on median-of-medians */
medians = array of each group’s median.
pivot = Select(medians, ⌈n/5⌉, ⌈n/10⌉)
Left Array L and Right Array G = partition(A, pivot)
/* Find ith element in L, pivot, or G */
k = |L| + 1
If i = k, return pivot
If i < k, return Select(L, k-1, i)
If i > k, return Select(G, n-k, i-k)
在Cormen等人的《算法介绍》一书中也有非常详细的描述。