我相信有一种方法可以找到长度为n的O(n)无序数组中第k大的元素。也可能是期望O(n)之类的。我们该怎么做呢?


当前回答

还有Wirth的选择算法,它的实现比QuickSelect简单。Wirth的选择算法比QuickSelect慢,但经过一些改进,它变得更快。

更详细地说。使用Vladimir Zabrodsky的MODIFIND优化和3中位数的枢轴选择,并注意算法划分部分的最后步骤,我提出了以下算法(想象一下,命名为“LefSelect”):

#define F_SWAP(a,b) { float temp=(a);(a)=(b);(b)=temp; }

# Note: The code needs more than 2 elements to work
float lefselect(float a[], const int n, const int k) {
    int l=0, m = n-1, i=l, j=m;
    float x;

    while (l<m) {
        if( a[k] < a[i] ) F_SWAP(a[i],a[k]);
        if( a[j] < a[i] ) F_SWAP(a[i],a[j]);
        if( a[j] < a[k] ) F_SWAP(a[k],a[j]);

        x=a[k];
        while (j>k & i<k) {
            do i++; while (a[i]<x);
            do j--; while (a[j]>x);

            F_SWAP(a[i],a[j]);
        }
        i++; j--;

        if (j<k) {
            while (a[i]<x) i++;
            l=i; j=m;
        }
        if (k<i) {
            while (x<a[j]) j--;
            m=j; i=l;
        }
    }
    return a[k];
}

在我这里做的基准测试中,LefSelect比QuickSelect快20-30%。

其他回答

对于k非常小的值(即k << n),我们可以在~O(n)时间内完成。否则,如果k与n比较,我们得到O(nlogn)

虽然不是很确定O(n)复杂度,但肯定在O(n)和nLog(n)之间。也肯定更接近于O(n)而不是nLog(n)函数是用Java编写的

public int quickSelect(ArrayList<Integer>list, int nthSmallest){
    //Choose random number in range of 0 to array length
    Random random =  new Random();
    //This will give random number which is not greater than length - 1
    int pivotIndex = random.nextInt(list.size() - 1); 

    int pivot = list.get(pivotIndex);

    ArrayList<Integer> smallerNumberList = new ArrayList<Integer>();
    ArrayList<Integer> greaterNumberList = new ArrayList<Integer>();

    //Split list into two. 
    //Value smaller than pivot should go to smallerNumberList
    //Value greater than pivot should go to greaterNumberList
    //Do nothing for value which is equal to pivot
    for(int i=0; i<list.size(); i++){
        if(list.get(i)<pivot){
            smallerNumberList.add(list.get(i));
        }
        else if(list.get(i)>pivot){
            greaterNumberList.add(list.get(i));
        }
        else{
            //Do nothing
        }
    }

    //If smallerNumberList size is greater than nthSmallest value, nthSmallest number must be in this list 
    if(nthSmallest < smallerNumberList.size()){
        return quickSelect(smallerNumberList, nthSmallest);
    }
    //If nthSmallest is greater than [ list.size() - greaterNumberList.size() ], nthSmallest number must be in this list
    //The step is bit tricky. If confusing, please see the above loop once again for clarification.
    else if(nthSmallest > (list.size() - greaterNumberList.size())){
        //nthSmallest will have to be changed here. [ list.size() - greaterNumberList.size() ] elements are already in 
        //smallerNumberList
        nthSmallest = nthSmallest - (list.size() - greaterNumberList.size());
        return quickSelect(greaterNumberList,nthSmallest);
    }
    else{
        return pivot;
    }
}
    function nthMax(arr, nth = 1, maxNumber = Infinity) {
      let large = -Infinity;
      for(e of arr) {
        if(e > large && e < maxNumber ) {
          large = e;
        } else if (maxNumber == large) {
          nth++;
        }
      }
      return nth==0 ? maxNumber: nthMax(arr, nth-1, large);
    }

    let array = [11,12,12,34,23,34];

    let secondlargest = nthMax(array, 1);

    console.log("Number:", secondlargest);

Haskell的解决方案:

kthElem index list = sort list !! index

withShape ~[]     []     = []
withShape ~(x:xs) (y:ys) = x : withShape xs ys

sort []     = []
sort (x:xs) = (sort ls `withShape` ls) ++ [x] ++ (sort rs `withShape` rs)
  where
   ls = filter (<  x)
   rs = filter (>= x)

这通过使用withShape方法来实现中值解的中值,从而发现分区的大小,而无需实际计算分区大小。

这叫做求k阶统计量。有一个非常简单的随机算法(叫做quickselect),平均时间为O(n),最坏情况时间为O(n²),还有一个相当复杂的非随机算法(叫做introselect),最坏情况时间为O(n)。维基百科上有一些信息,但不是很好。

你需要的一切都在这些幻灯片里。只需提取O(n)最坏情况算法(introselect)的基本算法:

Select(A,n,i):
    Divide input into ⌈n/5⌉ groups of size 5.

    /* Partition on median-of-medians */
    medians = array of each group’s median.
    pivot = Select(medians, ⌈n/5⌉, ⌈n/10⌉)
    Left Array L and Right Array G = partition(A, pivot)

    /* Find ith element in L, pivot, or G */
    k = |L| + 1
    If i = k, return pivot
    If i < k, return Select(L, k-1, i)
    If i > k, return Select(G, n-k, i-k)

在Cormen等人的《算法介绍》一书中也有非常详细的描述。