如何在O(n)中找到长度为n的无序数组中的第k大元素?

我相信有一种方法可以找到长度为n的O(n)无序数组中第k大的元素。也可能是期望O(n)之类的。我们该怎么做呢?

当前回答

下面是一个随机化快速选择的c++实现。这个想法是随机选择一个主元。为了实现随机分区，我们使用一个随机函数rand()来生成l和r之间的索引，将随机生成索引处的元素与最后一个元素交换，最后调用以最后一个元素为枢轴的标准分区过程。

#include<iostream>
#include<climits>
#include<cstdlib>
using namespace std;

int randomPartition(int arr[], int l, int r);

// This function returns k'th smallest element in arr[l..r] using
// QuickSort based method.  ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
    // If k is smaller than number of elements in array
    if (k > 0 && k <= r - l + 1)
    {
        // Partition the array around a random element and
        // get position of pivot element in sorted array
        int pos = randomPartition(arr, l, r);

        // If position is same as k
        if (pos-l == k-1)
            return arr[pos];
        if (pos-l > k-1)  // If position is more, recur for left subarray
            return kthSmallest(arr, l, pos-1, k);

        // Else recur for right subarray
        return kthSmallest(arr, pos+1, r, k-pos+l-1);
    }

    // If k is more than number of elements in array
    return INT_MAX;
}

void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Standard partition process of QuickSort().  It considers the last
// element as pivot and moves all smaller element to left of it and
// greater elements to right. This function is used by randomPartition()
int partition(int arr[], int l, int r)
{
    int x = arr[r], i = l;
    for (int j = l; j <= r - 1; j++)
    {
        if (arr[j] <= x) //arr[i] is bigger than arr[j] so swap them
        {
            swap(&arr[i], &arr[j]);
            i++;
        }
    }
    swap(&arr[i], &arr[r]); // swap the pivot
    return i;
}

// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
    int n = r-l+1;
    int pivot = rand() % n;
    swap(&arr[l + pivot], &arr[r]);
    return partition(arr, l, r);
}

// Driver program to test above methods
int main()
{
    int arr[] = {12, 3, 5, 7, 4, 19, 26};
    int n = sizeof(arr)/sizeof(arr[0]), k = 3;
    cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
    return 0;
}

上述解的最坏情况时间复杂度仍为O(n2)。在最坏的情况下，随机函数可能总是选择一个角元素。上述随机化QuickSelect的期望时间复杂度为Θ(n)

2015-04-19 11:47:20

其他回答

虽然不是很确定O(n)复杂度，但肯定在O(n)和nLog(n)之间。也肯定更接近于O(n)而不是nLog(n)函数是用Java编写的

public int quickSelect(ArrayList<Integer>list, int nthSmallest){
    //Choose random number in range of 0 to array length
    Random random =  new Random();
    //This will give random number which is not greater than length - 1
    int pivotIndex = random.nextInt(list.size() - 1); 

    int pivot = list.get(pivotIndex);

    ArrayList<Integer> smallerNumberList = new ArrayList<Integer>();
    ArrayList<Integer> greaterNumberList = new ArrayList<Integer>();

    //Split list into two. 
    //Value smaller than pivot should go to smallerNumberList
    //Value greater than pivot should go to greaterNumberList
    //Do nothing for value which is equal to pivot
    for(int i=0; i<list.size(); i++){
        if(list.get(i)<pivot){
            smallerNumberList.add(list.get(i));
        }
        else if(list.get(i)>pivot){
            greaterNumberList.add(list.get(i));
        }
        else{
            //Do nothing
        }
    }

    //If smallerNumberList size is greater than nthSmallest value, nthSmallest number must be in this list 
    if(nthSmallest < smallerNumberList.size()){
        return quickSelect(smallerNumberList, nthSmallest);
    }
    //If nthSmallest is greater than [ list.size() - greaterNumberList.size() ], nthSmallest number must be in this list
    //The step is bit tricky. If confusing, please see the above loop once again for clarification.
    else if(nthSmallest > (list.size() - greaterNumberList.size())){
        //nthSmallest will have to be changed here. [ list.size() - greaterNumberList.size() ] elements are already in 
        //smallerNumberList
        nthSmallest = nthSmallest - (list.size() - greaterNumberList.size());
        return quickSelect(greaterNumberList,nthSmallest);
    }
    else{
        return pivot;
    }
}

2011-08-18 07:25:06

在那个('第k大元素数组')上快速谷歌返回这个:http://discuss.joelonsoftware.com/default.asp?interview.11.509587.17

"Make one pass through tracking the three largest values so far."

(它是专门为3d最大)

这个答案是:

Build a heap/priority queue.  O(n)
Pop top element.  O(log n)
Pop top element.  O(log n)
Pop top element.  O(log n)

Total = O(n) + 3 O(log n) = O(n)

2008-10-30 21:12:11

你可以用O(n + kn) = O(n)(对于常数k)表示时间，用O(k)表示空间，通过跟踪你见过的最大的k个元素。

对于数组中的每个元素，您可以扫描k个最大的元素列表，并将最小的元素替换为更大的新元素。

Warren的优先级堆解决方案更简洁。

2008-10-30 21:17:49

首先，我们可以从未排序的数组中构建一个BST，它需要O(n)时间，从BST中我们可以找到O(log(n))中第k个最小的元素，它的总计数为O(n)。

2013-09-17 11:30:12

如果你想要一个真正的O(n)算法，而不是O(kn)或类似的算法，那么你应该使用快速选择(它基本上是快速排序，你会丢弃你不感兴趣的分区)。我的教授写了一篇很棒的文章，包括运行时分析:(参考)

QuickSelect算法可以快速找到包含n个元素的无序数组中的第k个最小元素。这是一个随机算法，所以我们计算最坏情况下的预期运行时间。

这是算法。

QuickSelect(A, k)
  let r be chosen uniformly at random in the range 1 to length(A)
  let pivot = A[r]
  let A1, A2 be new arrays
  # split into a pile A1 of small elements and A2 of big elements
  for i = 1 to n
    if A[i] < pivot then
      append A[i] to A1
    else if A[i] > pivot then
      append A[i] to A2
    else
      # do nothing
  end for
  if k <= length(A1):
    # it's in the pile of small elements
    return QuickSelect(A1, k)
  else if k > length(A) - length(A2)
    # it's in the pile of big elements
    return QuickSelect(A2, k - (length(A) - length(A2))
  else
    # it's equal to the pivot
    return pivot

这个算法的运行时间是多少?如果对手为我们抛硬币，我们可能会发现主元总是最大的元素，k总是1，给出的运行时间为

T(n) = Theta(n) + T(n-1) = Theta(n²)

但如果选择确实是随机的，则预期运行时间由

T(n) <= Theta(n) + (1/n) ∑_{i=1 to n}T(max(i, n-i-1))

我们做了一个不完全合理的假设递归总是落在A1或A2中较大的那个。

让我们猜测对于某个a T(n) <= an，然后我们得到

T(n) 
 <= cn + (1/n) ∑_{i=1 to n}T(max(i-1, n-i))
 = cn + (1/n) ∑_{i=1 to floor(n/2)} T(n-i) + (1/n) ∑_{i=floor(n/2)+1 to n} T(i)
 <= cn + 2 (1/n) ∑_{i=floor(n/2) to n} T(i)
 <= cn + 2 (1/n) ∑_{i=floor(n/2) to n} ai

现在我们要用加号右边这个可怕的和来吸收左边的cn。如果我们将其限定为2(1/n)∑i=n/2到n an，我们大致得到2(1/n)(n/2)an = an。但是这个太大了，没有多余的空间来挤进一个cn。让我们用等差级数公式展开和:

∑_{i=floor(n/2) to n} i  
 = ∑_{i=1 to n} i - ∑_{i=1 to floor(n/2)} i  
 = n(n+1)/2 - floor(n/2)(floor(n/2)+1)/2  
 <= n²/2 - (n/4)²/2  
 = (15/32)n²

我们利用n“足够大”的优势，用更干净(更小)的n/4替换丑陋的地板(n/2)因子。现在我们可以继续

cn + 2 (1/n) ∑_{i=floor(n/2) to n} ai,
 <= cn + (2a/n) (15/32) n²
 = n (c + (15/16)a)
 <= an

提供了> 16c。

得到T(n) = O(n)显然是(n)所以我们得到T(n) = (n)

2008-10-31 22:23:11

如何在O(n)中找到长度为n的无序数组中的第k大元素?

推荐文章

最新文章

标签