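Is it better to call ToList() or ToArray() in LINQ queries?

I often run into the case where I want to evaluate a query right where I declare it. This is usually because I need to iterate over it multiple times and it is expensive to compute. For example: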

string raw = "...";
var lines = (from l in raw.Split('\n')
             let ll = l.Trim()
             where !string.IsNullOrEmpty(ll)
             select ll).ToList();

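This works fine. But if I am not going to modify the result, I might as well call ToArray() instead of ToList().

I wonder, however, whether ToArray() is implemented by first calling ToList(), which would make it less memory efficient than just calling ToList().

Am I crazy? Should I just call ToArray(), safe and secure in the knowledge that the memory won't be allocated twice?

Current answer

(seven years later...)

Several other (good) answers have concentrated on the microscopic performance differences that will occur.

This post is just a supplement to mention the semantic difference that exists between the IEnumerator<T> produced by an array (T[]) and the one returned by a List<T>.

It is best illustrated by example: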

IList<int> source = Enumerable.Range(1, 10).ToArray();  // try changing to .ToList()

foreach (var x in source)
{
  if (x == 5)
    source[8] *= 100;
  Console.WriteLine(x);
}

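The above code will run with no exception and produces the output:

1
2
3
4
5
6
7
8
900
10

This shows that the IEnumerator<int> returned by an int[] does not keep track of whether the array has been modified since the enumerator was created.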

Note that I declared the local variable source as an IList<int>. That way I make sure the C# compiler does not optimize the foreach statement into something equivalent to a for (var idx = 0; idx < source.Length; idx++) { /* ... */ } loop. This is something the C# compiler might do if I use var source = ...; instead. In my current version of the .NET Framework the actual enumerator used here is a non-public reference type, System.SZArrayHelper+SZGenericArrayEnumerator`1[System.Int32], but of course this is an implementation detail.

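Now, if I change .ToArray() into .ToList(), I get only:

1
2
3
4
5

followed by a System.InvalidOperationException that says:

Collection was modified; enumeration operation may not execute.

The underlying enumerator in this case is the public mutable value type System.Collections.Generic.List`1+Enumerator[System.Int32] (boxed inside an IEnumerator<int> in this case, because I use IList<int>).

In conclusion, the enumerator produced by a List<T> keeps track of whether the list changes during enumeration, while the enumerator produced by T[] does not. So consider this difference when choosing between .ToList() and .ToArray().

People often add one extra .ToArray() or .ToList() to work around a collection that keeps track of whether it was modified during the lifetime of an enumerator.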
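That workaround can be sketched as follows. This is a minimal illustration rather than code from the answer: the list contents and the name numbers are made up, and it assumes a compiler that accepts top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

// Enumerating 'numbers' directly while calling Remove would throw
// InvalidOperationException, because List<T>'s enumerator detects the change.
// Enumerating a snapshot (a separate array) leaves the original free to change.
foreach (var n in numbers.ToArray())
{
    if (n % 2 == 0)
        numbers.Remove(n); // modifies the list, not the snapshot being enumerated
}

Console.WriteLine(string.Join(", ", numbers)); // prints "1, 3, 5"
```

The snapshot costs one extra allocation and copy, so it is a convenience for small collections rather than a general-purpose pattern; for plain filtering, List<T>.RemoveAll avoids the copy.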

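(If anybody wants to know how List<> keeps track of whether the collection was modified, there is a private field _version in this class which is changed every time the List<> is updated. It would actually be possible to change this behavior of List<> by simply removing the line that increments _version in the set accessor of the indexer public T this[int index], just as was done recently inside Dictionary<,>, as described in another answer.)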

2016-12-20 16:03:00

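Other answers

ToList() is usually preferred if you use it on an IEnumerable<T> (from an ORM, for instance). If the length of the sequence is not known at the beginning, ToArray() creates a dynamic-length collection (like List) and then converts it to an array, which takes extra time.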

2010-02-01 14:55:21

For anyone interested in using this result in another LINQ to SQL query, such as

from q in context.MyTable
where myListOrArray.Contains(q.someID)
select q;

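the SQL that is generated is the same whether you used a List or an Array for myListOrArray. Now, I know some may ask why you would enumerate before this statement at all, but there is a difference between the SQL generated from an IQueryable and from a List or Array.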

2013-01-14 20:06:27

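Unless you simply need an array to meet other constraints, you should use ToList. In the majority of scenarios ToArray will allocate more memory than ToList.

Both use arrays for storage, but ToList has a more flexible constraint. It needs the array to be at least as large as the number of elements in the collection. If the array is larger, that is not a problem. ToArray, however, needs the array to be sized exactly to the number of elements.

To meet this constraint ToArray often does one more allocation than ToList. Once it has an array that is big enough, it allocates an array of exactly the correct size and copies the elements into that array. The only time it can avoid this is when the growth algorithm for the array happens to coincide with the number of elements that need to be stored (definitely the minority of cases).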
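The slack described above can be observed through List<T>.Capacity. The sketch below is only an illustration: the sequence of 100 integers is arbitrary, it assumes top-level statements (C# 9 or later), and the capacity it reports depends on the runtime's growth strategy, so no specific value is claimed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Where(...) hides the element count, so neither method can size its buffer up front.
IEnumerable<int> query = Enumerable.Range(1, 100).Where(i => true);

List<int> list = query.ToList();
int[] array = query.ToArray();

// The list's backing array may be larger than the number of elements it holds;
// the exact capacity depends on the runtime's growth strategy.
Console.WriteLine($"List:  Count = {list.Count}, Capacity = {list.Capacity}");

// An array has no slack: its length is exactly the element count.
Console.WriteLine($"Array: Length = {array.Length}");

// TrimExcess() shrinks the backing array when enough of it is unused,
// at the cost of the same kind of reallocate-and-copy step described for ToArray.
list.TrimExcess();
Console.WriteLine($"After TrimExcess: Capacity = {list.Capacity}");
```

Comparing Capacity with Count before and after TrimExcess shows how much memory, if any, the list was holding beyond what the array needs.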

EDIT

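A couple of people have asked me about the consequence of having the extra unused memory in the List<T> value.

This is a valid concern. If the created collection is long lived, is never modified after being created, and has a high chance of landing in the Gen2 heap, then you may be better off taking the extra allocation of ToArray up front.

In general, though, I find this to be the rarer case. It is much more common to see a lot of ToArray calls whose results are immediately passed to other short-lived uses of memory, in which case ToList is demonstrably better.

The key here is to profile, profile, and then profile some more.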

2013-05-01 17:42:23

I found the other benchmarks people have done here lacking, so here is my attempt. Let me know if you find something wrong with my methodology.

/* This is a benchmarking template I use in LINQPad when I want to do a
 * quick performance test. Just give it a couple of actions to test and
 * it will give you a pretty good idea of how long they take compared
 * to one another. It's not perfect: You can expect a 3% error margin
 * under ideal circumstances. But if you're not going to improve
 * performance by more than 3%, you probably don't care anyway.*/
void Main()
{
    // Enter setup code here
    var values = Enumerable.Range(1, 100000)
        .Select(i => i.ToString())
        .ToArray()
        .Select(i => i);
    values.GetType().Dump();
    var actions = new[]
    {
        new TimedAction("ToList", () =>
        {
            values.ToList();
        }),
        new TimedAction("ToArray", () =>
        {
            values.ToArray();
        }),
        new TimedAction("Control", () =>
        {
            foreach (var element in values)
            {
                // do nothing
            }
        }),
        // Add tests as desired
    };
    const int TimesToRun = 1000; // Tweak this as necessary
    TimeActions(TimesToRun, actions);
}


#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    int length = actions.Length;
    var results = new ActionResult[actions.Length];
    // Perform the actions in their initial order.
    for (int i = 0; i < length; i++)
    {
        var action = actions[i];
        var result = results[i] = new ActionResult { Message = action.Message };
        // Do a dry run to get things ramped up/cached
        result.DryRun1 = s.Time(action.Action, 10);
        result.FullRun1 = s.Time(action.Action, iterations);
    }
    // Perform the actions in reverse order.
    for (int i = length - 1; i >= 0; i--)
    {
        var action = actions[i];
        var result = results[i];
        // Do a dry run to get things ramped up/cached
        result.DryRun2 = s.Time(action.Action, 10);
        result.FullRun2 = s.Time(action.Action, iterations);
    }
    results.Dump();
}

public class ActionResult
{
    public string Message { get; set; }
    public double DryRun1 { get; set; }
    public double DryRun2 { get; set; }
    public double FullRun1 { get; set; }
    public double FullRun2 { get; set; }
}

public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message { get; private set; }
    public Action Action { get; private set; }
}

public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion

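You can download the LINQPad script here.

Results:

Tweaking the code above, you will find that:

The difference is less significant when dealing with smaller arrays. The difference is less significant when dealing with ints rather than strings. Using large structs instead of strings generally takes a lot more time, but does not really change the ratio much.

This agrees with the conclusions of the top-voted answers:

Unless your code frequently produces many large lists of data, you are unlikely to notice a performance difference. (There was only a 200 ms difference when creating 1000 lists of 100K strings apiece.) ToList() consistently runs faster, and is the better choice if you are not planning to hold on to the results for a long time.

Update

@JonHanna pointed out that, depending on the implementation of Select, it is possible for a ToList() or ToArray() implementation to predict the size of the resulting collection ahead of time. Replacing .Select(i => i) in the code above with Where(i => true) currently yields very similar results, and is more likely to do so regardless of the .NET implementation.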

2017-09-08 19:16:54

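You should base your decision between ToList and ToArray on what the ideal design choice is. If you want a collection that is only iterated and accessed by index, choose ToArray. If you want the additional ability to add to and remove from the collection later without much hassle, use ToList (not that you cannot add to an array at all, but it is usually not the right tool for that).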
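As a rough illustration of that design difference, the following sketch uses made-up data (the words array and variable names are assumptions for the example) and top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var words = new[] { "pear", "apple", "fig" };

// Fixed-size result: fine for iterating and indexing.
string[] sortedArray = words.OrderBy(w => w).ToArray();
Console.WriteLine(sortedArray[0]); // prints "apple"

// Growable result: Add and Remove are part of the type's everyday API.
List<string> sortedList = words.OrderBy(w => w).ToList();
sortedList.Add("plum");
sortedList.Remove("fig");
Console.WriteLine(string.Join(", ", sortedList)); // prints "apple, pear, plum"

// An array does implement IList<T>, but it is fixed-size, so Add is rejected.
bool addRejected = false;
try
{
    ((IList<string>)sortedArray).Add("plum");
}
catch (NotSupportedException)
{
    addRejected = true;
}
Console.WriteLine(addRejected); // prints "True"
```

Growing an array means allocating a new one and copying (for example with Array.Resize), which is why a list is the more natural fit when the collection will change size.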

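If performance matters, you should also consider which collection would be faster to operate on. Realistically, you will not call ToList or ToArray a million times, but you might work on the resulting collection a million times. In that respect [] is better, since List<> is [] with some overhead. See this thread for an efficiency comparison: Which one is more efficient: List<int> or int[]

In my own tests a while ago, I found ToArray to be faster, though I am not sure how skewed those tests were. The performance difference is so insignificant that it only becomes noticeable if you run these queries in a loop millions of times.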

2012-12-07 10:42:03
