Is it better to call ToList() or ToArray() in LINQ queries?

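I often run into the case where I want to evaluate a query right where I declare it. This is usually because I need to iterate over it multiple times and it is expensive to compute. For example: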

string raw = "...";
var lines = (from l in raw.Split('\n')
             let ll = l.Trim()
             where !string.IsNullOrEmpty(ll)
             select ll).ToList();

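This works fine. But if I am not going to modify the result, I might as well call ToArray() instead of ToList().

I wonder, however, whether ToArray() is implemented by first calling ToList(), which would make it less memory efficient than just calling ToList().

Am I crazy? Should I just call ToArray(), safe in the knowledge that the memory won't be allocated twice?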

Current answer

I find the other benchmarks people have posted here lacking, so here's my crack at it. Let me know if you find something wrong with my methodology.

/* This is a benchmarking template I use in LINQPad when I want to do a
 * quick performance test. Just give it a couple of actions to test and
 * it will give you a pretty good idea of how long they take compared
 * to one another. It's not perfect: You can expect a 3% error margin
 * under ideal circumstances. But if you're not going to improve
 * performance by more than 3%, you probably don't care anyway.*/
void Main()
{
    // Enter setup code here
    var values = Enumerable.Range(1, 100000)
        .Select(i => i.ToString())
        .ToArray()
        .Select(i => i);
    values.GetType().Dump();
    var actions = new[]
    {
        new TimedAction("ToList", () =>
        {
            values.ToList();
        }),
        new TimedAction("ToArray", () =>
        {
            values.ToArray();
        }),
        new TimedAction("Control", () =>
        {
            foreach (var element in values)
            {
                // do nothing
            }
        }),
        // Add tests as desired
    };
    const int TimesToRun = 1000; // Tweak this as necessary
    TimeActions(TimesToRun, actions);
}


#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    int length = actions.Length;
    var results = new ActionResult[actions.Length];
    // Perform the actions in their initial order.
    for (int i = 0; i < length; i++)
    {
        var action = actions[i];
        var result = results[i] = new ActionResult { Message = action.Message };
        // Do a dry run to get things ramped up/cached
        result.DryRun1 = s.Time(action.Action, 10);
        result.FullRun1 = s.Time(action.Action, iterations);
    }
    // Perform the actions in reverse order.
    for (int i = length - 1; i >= 0; i--)
    {
        var action = actions[i];
        var result = results[i];
        // Do a dry run to get things ramped up/cached
        result.DryRun2 = s.Time(action.Action, 10);
        result.FullRun2 = s.Time(action.Action, iterations);
    }
    results.Dump();
}

public class ActionResult
{
    public string Message { get; set; }
    public double DryRun1 { get; set; }
    public double DryRun2 { get; set; }
    public double FullRun1 { get; set; }
    public double FullRun2 { get; set; }
}

public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message { get; private set; }
    public Action Action { get; private set; }
}

public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion

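You can download the LINQPad script here.

Results:

Tweaking the code above, you will discover that:

- The difference is less significant when dealing with smaller arrays.
- The difference is less significant when dealing with ints rather than strings.
- Using large structs instead of strings generally takes a lot more time, but doesn't really change the ratio much.

This agrees with the conclusions of the top-voted answers:

- You're unlikely to notice a performance difference unless your code frequently produces many large lists of data. (There was only a 200 ms difference when creating 1000 lists of 100K strings each.)
- ToList() consistently runs faster, and is the better choice if you don't plan on holding onto the results for a long time.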

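Update

@JonHanna pointed out that, depending on the implementation of Select, a ToList() or ToArray() implementation may be able to predict the size of the resulting collection ahead of time. Replacing .Select(i => i) in the code above with .Where(i => true) yields very similar results at the moment, and is more likely to keep doing so regardless of the .NET implementation.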
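For reference, the modified setup might look like the sketch below. `SizeHidingDemo` and `BuildValues` are names made up for this illustration, not part of the original script. The point is only that a filtered sequence cannot advertise its length, so neither ToList() nor ToArray() can pre-size its buffer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SizeHidingDemo
{
    // Hypothetical helper: builds the benchmark source with Where instead
    // of Select, so the consumer cannot know the element count up front.
    public static IEnumerable<string> BuildValues(int count)
    {
        return Enumerable.Range(1, count)
            .Select(i => i.ToString())
            .ToArray()
            .Where(i => true);
    }

    public static void Main()
    {
        var values = BuildValues(100000);
        // Both calls have to discover the length while enumerating.
        Console.WriteLine(values.ToList().Count);
        Console.WriteLine(values.ToArray().Length);
    }
}
```

Dropping `BuildValues(100000)` into the `values` assignment of the script above should be enough to rerun the comparison.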

2017-09-08 19:16:54

Other answers

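This is an old question, but for the benefit of users who stumble on it, there is also the alternative of "memoizing" the enumerable. This caches the results and prevents multiple enumerations of a LINQ statement, which is what ToArray() and ToList() are often used for, even when the collection features of the list or array are never used.

Memoize is available in the Rx/System.Interactive library, and is explained here: More LINQ with System.Interactive

(From Bart De Smet's blog, which is a highly recommended read if you work with LINQ to Objects a lot.)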
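To illustrate the idea without pulling in the library, here is a minimal hand-rolled sketch. `LazyCache<T>` is a hypothetical class written for this example; it is not the System.Interactive implementation, and it is not thread-safe. It pulls each element from the source at most once and replays the cached elements on later enumerations.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only. Single-threaded use assumed.
public sealed class LazyCache<T> : IEnumerable<T>
{
    private readonly List<T> _cache = new List<T>();
    private IEnumerator<T> _source;   // set to null once the source is exhausted

    public LazyCache(IEnumerable<T> source)
    {
        _source = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; ; i++)
        {
            if (i == _cache.Count)
            {
                // Nobody has pulled this element yet: ask the source for it.
                if (_source == null) yield break;
                if (!_source.MoveNext())
                {
                    _source.Dispose();
                    _source = null;
                    yield break;
                }
                _cache.Add(_source.Current);
            }
            yield return _cache[i];
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class MemoizeDemo
{
    public static int Evaluations;

    // Stand-in for an expensive query; counts how often it really runs.
    public static IEnumerable<int> Expensive()
    {
        for (int i = 1; i <= 3; i++)
        {
            Evaluations++;
            yield return i * 10;
        }
    }

    public static void Main()
    {
        var cached = new LazyCache<int>(Expensive());
        int first = cached.Sum();
        int second = cached.Sum();   // served entirely from the cache
        Console.WriteLine($"{first} {second} evaluations={Evaluations}");
    }
}
```

Unlike ToList(), nothing is computed until the first enumeration, and a consumer that stops early only pays for the elements it actually read.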

2011-11-14 10:40:03

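You should choose between ToList and ToArray based on what the design ideally calls for. If you want a collection that can only be iterated and accessed by index, choose ToArray. If you want to be able to add to and remove from the collection later without much hassle, use ToList (not that you can't add to an array, but it usually isn't the right tool for that).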
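A small sketch of that difference follows. The class and method names are invented for this example. A List<T> can be edited after it is materialized, while an array has a fixed length and rejects Add when used through IList<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionChoiceDemo
{
    // A List<T> can grow and shrink after it has been materialized.
    public static List<string> EditAsList(IEnumerable<string> query)
    {
        var list = query.ToList();
        list.Add("extra");
        list.RemoveAt(0);
        return list;
    }

    // An array has a fixed length: Add through IList<T> throws
    // NotSupportedException, so this returns false.
    public static bool TryAddToArray(IEnumerable<string> query)
    {
        IList<string> array = query.ToArray();
        try
        {
            array.Add("extra");
            return true;
        }
        catch (NotSupportedException)
        {
            return false;
        }
    }

    public static void Main()
    {
        var query = new[] { " a ", "", "b" }
            .Select(s => s.Trim())
            .Where(s => s.Length > 0);
        Console.WriteLine(string.Join(",", EditAsList(query)));  // b,extra
        Console.WriteLine(TryAddToArray(query));                 // False
    }
}
```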

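If performance matters, you should also consider which collection is faster to operate on. Realistically, you won't call ToList or ToArray a million times, but you might work on the resulting collection a million times. In that respect [] is better, since List<> is a [] with some overhead. See this thread for some efficiency comparisons: Which one is more efficient: List<int> or int[]

In my own tests a while ago, I found ToArray to be faster, though I'm not sure how skewed those tests were. The performance difference is so insignificant that it only becomes noticeable if you run these queries in a loop millions of times.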

2012-12-07 10:42:03

I agree with @mquander that the performance difference should be insignificant. However, I wanted to benchmark it to be sure, so I did, and it is insignificant.

Testing with List<T> source:
ToArray time: 1934 ms (0.01934 ms/call), memory used: 4021 bytes/array
ToList  time: 1902 ms (0.01902 ms/call), memory used: 4045 bytes/List

Testing with array source:
ToArray time: 1957 ms (0.01957 ms/call), memory used: 4021 bytes/array
ToList  time: 2022 ms (0.02022 ms/call), memory used: 4045 bytes/List

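Each source array/List had 1000 elements. So you can see that both the time and memory differences are negligible.

My conclusion: you might as well use ToList(), since a List<T> provides more functionality than an array, unless a few bytes of memory really matter to you.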

2011-01-11 23:10:46

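I know this is an old post, but after having the same question and doing some research, I found something interesting that might be worth sharing.

First, I agree with @mquander and his answer. He is correct in saying that performance-wise, the two are identical.

However, I have been using Reflector to look at the extension methods in System.Linq.Enumerable, and I noticed a very common optimization. Whenever possible, the IEnumerable<T> source is cast to IList<T> or ICollection<T> to optimize the method. For example, look at ElementAt(int).

Interestingly, Microsoft chose to optimize only for IList<T>, not for IList. It looks like Microsoft prefers working with the IList<T> interface.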
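The pattern looks roughly like the sketch below. `ElementAtSketch` is a hypothetical reimplementation written for this example, not the decompiled framework code, and the real method may differ between .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtDemo
{
    // Hypothetical sketch of the "cast to IList<T> first" optimization.
    public static T ElementAtSketch<T>(IEnumerable<T> source, int index)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Fast path: an IList<T> supports direct indexing.
        if (source is IList<T> list) return list[index];

        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));

        // Slow path: walk the sequence until the requested position.
        foreach (var item in source)
        {
            if (index == 0) return item;
            index--;
        }
        throw new ArgumentOutOfRangeException(nameof(index));
    }

    public static void Main()
    {
        var array = new[] { "a", "b", "c", "d" };                   // IList<T>: indexed
        var lazy = Enumerable.Range(0, 10).Where(n => n % 2 == 0);  // not a list: walked
        Console.WriteLine(ElementAtSketch(array, 2));  // c
        Console.WriteLine(ElementAtSketch(lazy, 2));   // 4
    }
}
```

Both arrays and List<T> implement IList<T>, so the result of either ToArray() or ToList() takes the fast path here.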

2010-07-12 19:55:40

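ToList() is usually preferred if you use it on an IEnumerable<T> (from an ORM, for instance). If the length of the sequence is not known at the start, ToArray() builds a dynamic-length collection much like a List and then converts it to an array, which takes extra time.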
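The extra step can be sketched as follows. `ToArraySketch` is a hypothetical simplification written for this example; the actual framework code is more elaborate and varies by .NET version. When the length is unknown, a buffer is grown while enumerating and then copied once more into an exactly sized array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToArrayDemo
{
    // Hypothetical sketch, not the framework implementation.
    public static T[] ToArraySketch<T>(IEnumerable<T> source)
    {
        // Known length: allocate once and copy.
        if (source is ICollection<T> collection)
        {
            var exact = new T[collection.Count];
            collection.CopyTo(exact, 0);
            return exact;
        }

        // Unknown length: grow a buffer by doubling, as a List<T> would.
        var buffer = new T[4];
        int count = 0;
        foreach (var item in source)
        {
            if (count == buffer.Length)
                Array.Resize(ref buffer, buffer.Length * 2);
            buffer[count++] = item;
        }

        // Final trim: the copy that a list-based result would not need.
        Array.Resize(ref buffer, count);
        return buffer;
    }

    public static void Main()
    {
        var lazy = Enumerable.Range(1, 10).Where(n => n % 3 != 0);
        Console.WriteLine(string.Join(",", ToArraySketch(lazy)));
    }
}
```

A List<T> can simply keep its oversized internal buffer, which is one plausible reason ToList() comes out slightly ahead when the source length is unknown.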

2010-02-01 14:55:21
