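How do I remove duplicates from a C# array?

I have been working with a string[] array in C# that is returned from a function call. I could cast it to a generic collection, but I was wondering whether there is a better way to do it, possibly by using a temporary array.

What is the best way to remove duplicates from a C# array?

Current answer

What is the best way? Hard to say. The HashSet approach looks fast, but (depending on the data) a sorting algorithm (counting sort?) can be much faster.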

using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
    static void Main()
    {
        Random r = new Random(0); int[] a, b = new int[1000000];
        for (int i = b.Length - 1; i >= 0; i--) b[i] = r.Next(b.Length);
        a = new int[b.Length]; Array.Copy(b, a, b.Length);
        a = dedup0(a); Console.WriteLine(a.Length);
        a = new int[b.Length]; Array.Copy(b, a, b.Length);
        var w = System.Diagnostics.Stopwatch.StartNew();
        a = dedup0(a); Console.WriteLine(w.Elapsed); Console.Read();
    }

    static int[] dedup0(int[] a)  // 48 ms  
    {
        return new HashSet<int>(a).ToArray();
    }

    static int[] dedup1(int[] a)  // 68 ms
    {
        Array.Sort(a); int i = 0, j = 1, k = a.Length; if (k < 2) return a;
        while (j < k) if (a[i] == a[j]) j++; else a[++i] = a[j++];
        Array.Resize(ref a, i + 1); return a;
    }

    static int[] dedup2(int[] a)  //  8 ms; assumes 0 <= a[i] < a.Length for every element
    {
        var b = new byte[a.Length]; int c = 0;
        for (int i = 0; i < a.Length; i++) 
            if (b[a[i]] == 0) { b[a[i]] = 1; c++; }
        a = new int[c];
        for (int j = 0, i = 0; i < b.Length; i++) if (b[i] > 0) a[j++] = i;
        return a;
    }
}
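
The flag-array trick in dedup2 indexes b by the values themselves, so it only works when every value lies in [0, a.Length), which the test data above guarantees. A minimal sketch of a guarded variant, assuming that falling back to a HashSet is acceptable for other inputs (the class name GuardedDedup and the method name Dedup are made up for this illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class GuardedDedup
{
    // Uses the flag-array idea from dedup2 when it is safe, otherwise a HashSet.
    public static int[] Dedup(int[] a)
    {
        foreach (int v in a)
            if (v < 0 || v >= a.Length)
                return new HashSet<int>(a).ToArray();  // value out of range: fall back

        var seen = new bool[a.Length];   // seen[v] is true once value v has occurred
        int count = 0;
        foreach (int v in a)
            if (!seen[v]) { seen[v] = true; count++; }

        var result = new int[count];
        for (int i = 0, j = 0; i < seen.Length; i++)
            if (seen[i]) result[j++] = i;   // values come out in ascending order
        return result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Dedup(new[] { 1, 3, 1, 1, 0 })));
    }
}
```

Note that the two paths do not promise the same ordering: the flag-array path returns the values sorted, while the fallback returns them in whatever order the HashSet enumerates.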

Almost branch-free. How does it work? Run it in debug mode and step into it (F11) with a small array: {1,3,1,1,0}

    static int[] dedupf(int[] a)  //  4 ms; assumes 0 <= a[i] < a.Length for every element
    {
        if (a.Length < 2) return a;
        var b = new byte[a.Length]; int c = 0, bi, ai, i, j;
        for (i = 0; i < a.Length; i++)
        { ai = a[i]; bi = 1 ^ b[ai]; b[ai] |= (byte)bi; c += bi; }
        a = new int[c]; i = 0; while (b[i] == 0) i++; a[0] = i++;
        for (j = 0; i < b.Length; i++) a[j += bi = b[i]] += bi * i; return a;
    }
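
For readers without a debugger at hand, the first loop of dedupf can also be traced by printing its state. The sketch below copies that loop and writes one line per element; the class name TraceDemo, the print flag and the output format are illustrative only, and it relies on the same assumption as dedupf (every value is in [0, a.Length)).

```csharp
using System;

class TraceDemo
{
    // Runs the marking loop from dedupf and returns the number of distinct values.
    public static int CountDistinct(int[] a, bool print)
    {
        var b = new byte[a.Length];   // b[v] becomes 1 once value v has been seen
        int c = 0;
        for (int i = 0; i < a.Length; i++)
        {
            int ai = a[i];
            int bi = 1 ^ b[ai];       // 1 if ai is new, 0 if it was seen before
            b[ai] |= (byte)bi;        // mark ai as seen without an if
            c += bi;                  // only first occurrences increase the count
            if (print)
            {
                string flags = string.Join(",", b);
                Console.WriteLine("a[" + i + "]=" + ai + " bi=" + bi + " c=" + c + " b=[" + flags + "]");
            }
        }
        return c;
    }

    static void Main()
    {
        CountDistinct(new[] { 1, 3, 1, 1, 0 }, true);
    }
}
```

For {1,3,1,1,0} the count ends at 3, which is the size dedupf then allocates for its result.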

The following solution with two nested loops can take a while, especially for larger arrays.

    static int[] dedup(int[] a)
    {
        int i, j, k = a.Length - 1;
        for (i = 0; i < k; i++)
            for (j = i + 1; j <= k; j++) if (a[i] == a[j]) a[j--] = a[k--];
        Array.Resize(ref a, k + 1); return a;
    }

2020-05-24 08:57:45

Other answers

Here is an O(n*n) approach that uses O(1) space.

void removeDuplicates(char* strIn)
{
    int numDups = 0, prevIndex = 0;
    if(NULL != strIn && *strIn != '\0')
    {
        int len = strlen(strIn);
        for(int i = 0; i < len; i++)
        {
            bool foundDup = false;
            for(int j = 0; j < i; j++)
            {
                if(strIn[j] == strIn[i])
                {
                    foundDup = true;
                    numDups++;
                    break;
                }
            }

            if(foundDup == false)
            {
                strIn[prevIndex] = strIn[i];
                prevIndex++;
            }
        }

        strIn[len-numDups] = '\0';
    }
}

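The hash/LINQ approaches above are what you would normally use in real life. In interviews, however, they usually want to impose some constraints, for example constant space, which rules out a hash, or no built-in APIs, which rules out LINQ.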

2009-12-06 17:50:22

Here is the HashSet<string> approach:

public static string[] RemoveDuplicates(string[] s)
{
    HashSet<string> set = new HashSet<string>(s);
    string[] result = new string[set.Count];
    set.CopyTo(result);
    return result;
}

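Unfortunately, this solution also requires .NET Framework 3.5 or later, because HashSet was not added until that version. You could also use array.Distinct(), which is a LINQ feature.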
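
For comparison, a minimal sketch of the Distinct() route mentioned above. The class name DistinctDemo is made up for this example, and it assumes the default comparer (ordinal, case-sensitive for strings) is what you want:

```csharp
using System;
using System.Linq;

class DistinctDemo
{
    // Returns a new array that holds each string once; the input array is not modified.
    public static string[] RemoveDuplicates(string[] s)
    {
        return s.Distinct().ToArray();
    }

    static void Main()
    {
        string[] input = { "a", "b", "a", "c", "b" };
        Console.WriteLine(string.Join(",", RemoveDuplicates(input)));
    }
}
```

If a different notion of equality is needed, Distinct also has an overload that takes an IEqualityComparer<T>, for example StringComparer.OrdinalIgnoreCase.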

2008-08-13 13:50:14

Find the answer below.

using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var nums = new int[] { 1, 4, 3, 3, 3, 5, 5, 7, 7, 7, 7, 9, 9, 9 };
        var result = removeDuplicates(nums);
        foreach (var item in result)
        {
            Console.WriteLine(item);
        }
    }
    static int[] removeDuplicates(int[] nums)
    {
        nums = nums.ToList().OrderBy(c => c).ToArray();
        int j = 1;
        int i = 0;
        int stop = nums.Length > 0 ? 1 : 0; // a non-empty array always keeps its first element
        while (j < nums.Length)
        {
            if (nums[i] != nums[j])
            {
                nums[i + 1] = nums[j];
                stop = i + 2;
                i++;
            }
            j++;
        }
        nums = nums.Take(stop).ToArray();
        return nums;
    }
}

This is a small contribution based on a test I just solved. It may be useful, and other top contributors here may be able to improve on it. Here is what I did:

I used OrderBy, which lets me sort the items from smallest to largest using LINQ. I then convert the result back to an array and assign it back to nums. Next I initialize j, the right-hand position in the array, to 1 and i, the left-hand position, to 0, and I also initialize stop, the position at which I will later cut the array. A while loop walks the array from left to right. Whenever nums[i] and nums[j] differ, the new value is copied to position i + 1, stop is set to i + 2, and i moves one step to the right. Outside the if statement, j moves one step to the right on every iteration, until the whole array has been visited. Finally I take the elements from the first one up to the stop position, which is one past the last unique element. That removes all the duplicate items from the int array, and the result is reassigned to nums.

2020-10-25 00:10:43

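string strINvalues = "1,1,2,2,3,3,4,4";
strINvalues = string.Join(",", strINvalues.Split(',').Distinct().ToArray());
Debug.WriteLine(strINvalues);

Not sure whether this is witchcraft or just beautiful code.

1 strINvalues.Split(',').Distinct().ToArray()

2 string.Join(",", XXX);

1 Split the string into an array and use Distinct [LINQ] to remove the duplicates. 2 Join it back together without the duplicates.

Sorry, I never read the text on StackOverflow, just the code. It makes more sense than the text ;)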

2019-10-03 16:23:56

Note: NOT tested!

string[] test(string[] myStringArray)
{
    List<String> myStringList = new List<string>();
    foreach (string s in myStringArray)
    {
        if (!myStringList.Contains(s))
        {
            myStringList.Add(s);
        }
    }
    return myStringList.ToArray();
}

This might do what you need...

EDIT: Argh!! Beaten to it by less than a minute!

2008-08-13 13:09:23
