谁有一个快速的方法去重复在c#的泛型列表?


当前回答

如果需要比较复杂的对象,则需要在Distinct()方法中传递一个compararer对象。

private void GetDistinctItemList(List<MyListItem> _listWithDuplicates)
{
    //It might be a good idea to create MyListItemComparer 
    //elsewhere and cache it for performance.
    List<MyListItem> _listWithoutDuplicates = _listWithDuplicates.Distinct(new MyListItemComparer()).ToList();
        
    //Choose the line below instead, if you have a situation where there is a chance to change the list while Distinct() is running.
    //ToArray() is used to solve "Collection was modified; enumeration operation may not execute" error.
    //List<MyListItem> _listWithoutDuplicates = _listWithDuplicates.ToArray().Distinct(new MyListItemComparer()).ToList();

    return _listWithoutDuplicates;
}

假设你有另外两个类,比如:

public class MyListItemComparer : IEqualityComparer<MyListItem>
{
    public bool Equals(MyListItem x, MyListItem y)
    {
        return x != null 
               && y != null 
               && x.A == y.A 
               && x.B.Equals(y.B); 
               && x.C.ToString().Equals(y.C.ToString());
    }

    public int GetHashCode(MyListItem codeh)
    {
        return codeh.GetHashCode();
    }
}

And:

public class MyListItem
{
    public int A { get; }
    public string B { get; }
    public MyEnum C { get; }

    public MyListItem(int a, string b, MyEnum c)
    {
        A = a;
        B = b;
        C = c;
    }
}

其他回答

所有的答案要么复制列表,要么创建一个新列表,要么使用慢函数,要么就是慢得令人痛苦。

据我所知,这是我所知道的最快和最便宜的方法(同时,还得到了一个非常有经验的实时物理优化程序员的支持)。

// Duplicates will be noticed after a sort O(nLogn)
list.Sort();

// Store the current and last items. Current item declaration is not really needed, and probably optimized by the compiler, but in case it's not...
int lastItem = -1;
int currItem = -1;

int size = list.Count;

// Store the index pointing to the last item we want to keep in the list
int last = size - 1;

// Travel the items from last to first O(n)
for (int i = last; i >= 0; --i)
{
    currItem = list[i];

    // If this item was the same as the previous one, we don't want it
    if (currItem == lastItem)
    {
        // Overwrite last in current place. It is a swap but we don't need the last
       list[i] = list[last];

        // Reduce the last index, we don't want that one anymore
        last--;
    }

    // A new item, we store it and continue
    else
        lastItem = currItem;
}

// We now have an unsorted list with the duplicates at the end.

// Remove the last items just once
list.RemoveRange(last + 1, size - last - 1);

// Sort again O(n logn)
list.Sort();

最终成本为:

nlogn + n + nlogn = n + 2nlogn = O(nlogn)非常漂亮。

关于RemoveRange注意事项: 由于我们不能设置列表的计数并避免使用Remove函数,我不知道这个操作的确切速度,但我猜这是最快的方法。

把它排序,然后检查两个和两个相邻的,因为重复的会聚集在一起。

就像这样:

list.Sort();
Int32 index = list.Count - 1;
while (index > 0)
{
    if (list[index] == list[index - 1])
    {
        if (index < list.Count - 1)
            (list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
        list.RemoveAt(list.Count - 1);
        index--;
    }
    else
        index--;
}

注:

从后到前进行比较,避免每次移除后都要列出度假胜地列表 这个例子现在使用c#值元组来进行交换,如果你不能使用它,可以用适当的代码来代替 最终结果不再排序

我认为最简单的方法是:

创建一个新列表并添加唯一的项目。

例子:

        class MyList{
    int id;
    string date;
    string email;
    }
    
    List<MyList> ml = new Mylist();

ml.Add(new MyList(){
id = 1;
date = "2020/09/06";
email = "zarezadeh@gmailcom"
});

ml.Add(new MyList(){
id = 2;
date = "2020/09/01";
email = "zarezadeh@gmailcom"
});

 List<MyList> New_ml = new Mylist();

foreach (var item in ml)
                {
                    if (New_ml.Where(w => w.email == item.email).SingleOrDefault() == null)
                    {
                        New_ml.Add(new MyList()
                        {
                          id = item.id,
     date = item.date,
               email = item.email
                        });
                    }
                }

也许您应该考虑使用HashSet。

从MSDN链接:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate numbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */

如果你有两个类Product和Customer,我们想从它们的列表中删除重复的项

public class Product
{
    public int Id { get; set; }
    public string ProductName { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string CustomerName { get; set; }

}

您必须在下面的表单中定义一个泛型类

public class ItemEqualityComparer<T> : IEqualityComparer<T> where T : class
{
    private readonly PropertyInfo _propertyInfo;

    public ItemEqualityComparer(string keyItem)
    {
        _propertyInfo = typeof(T).GetProperty(keyItem, BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
    }

    public bool Equals(T x, T y)
    {
        var xValue = _propertyInfo?.GetValue(x, null);
        var yValue = _propertyInfo?.GetValue(y, null);
        return xValue != null && yValue != null && xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        var propertyValue = _propertyInfo.GetValue(obj, null);
        return propertyValue == null ? 0 : propertyValue.GetHashCode();
    }
}

然后,你可以删除列表中重复的项目。

var products = new List<Product>
            {
                new Product{ProductName = "product 1" ,Id = 1,},
                new Product{ProductName = "product 2" ,Id = 2,},
                new Product{ProductName = "product 2" ,Id = 4,},
                new Product{ProductName = "product 2" ,Id = 4,},
            };
var productList = products.Distinct(new ItemEqualityComparer<Product>(nameof(Product.Id))).ToList();

var customers = new List<Customer>
            {
                new Customer{CustomerName = "Customer 1" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
            };
var customerList = customers.Distinct(new ItemEqualityComparer<Customer>(nameof(Customer.Id))).ToList();

这段代码通过Id删除重复项,如果你想通过其他属性删除重复项,你可以更改名称(YourClass.DuplicateProperty)和名称(Customer.CustomerName),然后通过CustomerName属性删除重复项。