假设你需要有一个整数列表/数组,你需要频繁迭代,我的意思是非常频繁。原因可能各不相同,但可以说这是在一个大容量处理的最内部循环的核心。
通常情况下,人们会选择使用列表(List),因为它们在大小上具有灵活性。最重要的是,msdn文档声称列表在内部使用数组,并且应该执行得一样快(快速查看Reflector证实了这一点)。尽管如此,还是有一些开销。
有人测量过吗?在一个列表中迭代6M次是否与数组相同?
假设你需要有一个整数列表/数组,你需要频繁迭代,我的意思是非常频繁。原因可能各不相同,但可以说这是在一个大容量处理的最内部循环的核心。
通常情况下,人们会选择使用列表(List),因为它们在大小上具有灵活性。最重要的是,msdn文档声称列表在内部使用数组,并且应该执行得一样快(快速查看Reflector证实了这一点)。尽管如此,还是有一些开销。
有人测量过吗?在一个列表中迭代6M次是否与数组相同?
当前回答
我担心在其他答案中发布的基准测试仍然会为编译器留下优化,消除或合并循环的空间,所以我写了一个:
使用不可预测的输入(随机) 运行计算结果并将结果打印到控制台 每次重复修改输入数据
结果是,直接数组的性能比访问封装在IList中的数组要好250%:
10亿次数组访问:4000毫秒 10亿次列表访问:10000毫秒 1亿个数组访问:350毫秒 1亿次列表访问:1000毫秒
代码如下:
static void Main(string[] args) {
const int TestPointCount = 1000000;
const int RepetitionCount = 1000;
Stopwatch arrayTimer = new Stopwatch();
Stopwatch listTimer = new Stopwatch();
Point2[] points = new Point2[TestPointCount];
var random = new Random();
for (int index = 0; index < TestPointCount; ++index) {
points[index].X = random.NextDouble();
points[index].Y = random.NextDouble();
}
for (int repetition = 0; repetition <= RepetitionCount; ++repetition) {
if (repetition > 0) { // first repetition is for cache warmup
arrayTimer.Start();
}
doWorkOnArray(points);
if (repetition > 0) { // first repetition is for cache warmup
arrayTimer.Stop();
}
if (repetition > 0) { // first repetition is for cache warmup
listTimer.Start();
}
doWorkOnList(points);
if (repetition > 0) { // first repetition is for cache warmup
listTimer.Stop();
}
}
Console.WriteLine("Ignore this: " + points[0].X + points[0].Y);
Console.WriteLine(
string.Format(
"{0} accesses on array took {1} ms",
RepetitionCount * TestPointCount, arrayTimer.ElapsedMilliseconds
)
);
Console.WriteLine(
string.Format(
"{0} accesses on list took {1} ms",
RepetitionCount * TestPointCount, listTimer.ElapsedMilliseconds
)
);
}
private static void doWorkOnArray(Point2[] points) {
var random = new Random();
int pointCount = points.Length;
Point2 accumulated = Point2.Zero;
for (int index = 0; index < pointCount; ++index) {
accumulated.X += points[index].X;
accumulated.Y += points[index].Y;
}
accumulated /= pointCount;
// make use of the result somewhere so the optimizer can't eliminate the loop
// also modify the input collection so the optimizer can merge the repetition loop
points[random.Next(0, pointCount)] = accumulated;
}
private static void doWorkOnList(IList<Point2> points) {
var random = new Random();
int pointCount = points.Count;
Point2 accumulated = Point2.Zero;
for (int index = 0; index < pointCount; ++index) {
accumulated.X += points[index].X;
accumulated.Y += points[index].Y;
}
accumulated /= pointCount;
// make use of the result somewhere so the optimizer can't eliminate the loop
// also modify the input collection so the optimizer can merge the repetition loop
points[random.Next(0, pointCount)] = accumulated;
}
其他回答
static long[] longs = new long[500000];
static long[] longs2 = {};
static List<long> listLongs = new List<long> { };
static void Main(string[] args)
{
Console.CursorVisible = false;
Stopwatch time = new Stopwatch();
time.Start();
for (int f = 50000000; f < 50255000; f++)
{
listLongs.Add(f);
}
//List Time: 1ms Count : 255000
Console.WriteLine("List Time: " + time.ElapsedMilliseconds + " | Count: " + listLongs.Count());
time.Restart();
time.Start();
for (long i = 1; i < 500000; i++)
{
longs[i] = i * 200;
}
//Array Time: 2ms Length: 500000 (Unrealistic Data)
Console.WriteLine("Array Time: " + time.ElapsedMilliseconds + " | Length: " + longs.Length);
time.Restart();
time.Start();
for (int i = 50000000; i < 50055000; i++)
{
longs2 = longs2.Append(i).ToArray();
}
//Array Time: 17950ms Length: 55000
Console.WriteLine("Array Append Time: " + time.ElapsedMilliseconds + " | Length: " + longs2.Length);
Console.ReadLine();
}
Type | Time | Len |
---|---|---|
Array | 2ms | 500000 |
List | 1ms | 255000 |
Array Append | 17950ms | 55000 |
如果您计划不断地向数组中添加少量数据,那么list更快
这实际上取决于你将如何使用数组。
很容易测量…
在少量的紧循环处理代码中,我知道长度是固定的,我使用数组来进行额外的微小优化;如果你使用索引器/ for表单,数组可以稍微快一点——但是IIRC认为这取决于数组中数据的类型。但除非你需要进行微观优化,否则尽量保持简单,使用List<T>等。
当然,这只适用于读取所有数据的情况;对于基于键的查找,字典会更快。
下面是我使用“int”的结果(第二个数字是一个校验和,以验证它们都做了相同的工作):
(修改bug)
List/for: 1971ms (589725196)
Array/for: 1864ms (589725196)
List/foreach: 3054ms (589725196)
Array/foreach: 1860ms (589725196)
基于试验台:
using System;
using System.Collections.Generic;
using System.Diagnostics;
static class Program
{
static void Main()
{
List<int> list = new List<int>(6000000);
Random rand = new Random(12345);
for (int i = 0; i < 6000000; i++)
{
list.Add(rand.Next(5000));
}
int[] arr = list.ToArray();
int chk = 0;
Stopwatch watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = list.Count;
for (int i = 0; i < len; i++)
{
chk += list[i];
}
}
watch.Stop();
Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
for (int i = 0; i < arr.Length; i++)
{
chk += arr[i];
}
}
watch.Stop();
Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in list)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in arr)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
Console.ReadLine();
}
}
我担心在其他答案中发布的基准测试仍然会为编译器留下优化,消除或合并循环的空间,所以我写了一个:
使用不可预测的输入(随机) 运行计算结果并将结果打印到控制台 每次重复修改输入数据
结果是,直接数组的性能比访问封装在IList中的数组要好250%:
10亿次数组访问:4000毫秒 10亿次列表访问:10000毫秒 1亿个数组访问:350毫秒 1亿次列表访问:1000毫秒
代码如下:
static void Main(string[] args) {
const int TestPointCount = 1000000;
const int RepetitionCount = 1000;
Stopwatch arrayTimer = new Stopwatch();
Stopwatch listTimer = new Stopwatch();
Point2[] points = new Point2[TestPointCount];
var random = new Random();
for (int index = 0; index < TestPointCount; ++index) {
points[index].X = random.NextDouble();
points[index].Y = random.NextDouble();
}
for (int repetition = 0; repetition <= RepetitionCount; ++repetition) {
if (repetition > 0) { // first repetition is for cache warmup
arrayTimer.Start();
}
doWorkOnArray(points);
if (repetition > 0) { // first repetition is for cache warmup
arrayTimer.Stop();
}
if (repetition > 0) { // first repetition is for cache warmup
listTimer.Start();
}
doWorkOnList(points);
if (repetition > 0) { // first repetition is for cache warmup
listTimer.Stop();
}
}
Console.WriteLine("Ignore this: " + points[0].X + points[0].Y);
Console.WriteLine(
string.Format(
"{0} accesses on array took {1} ms",
RepetitionCount * TestPointCount, arrayTimer.ElapsedMilliseconds
)
);
Console.WriteLine(
string.Format(
"{0} accesses on list took {1} ms",
RepetitionCount * TestPointCount, listTimer.ElapsedMilliseconds
)
);
}
private static void doWorkOnArray(Point2[] points) {
var random = new Random();
int pointCount = points.Length;
Point2 accumulated = Point2.Zero;
for (int index = 0; index < pointCount; ++index) {
accumulated.X += points[index].X;
accumulated.Y += points[index].Y;
}
accumulated /= pointCount;
// make use of the result somewhere so the optimizer can't eliminate the loop
// also modify the input collection so the optimizer can merge the repetition loop
points[random.Next(0, pointCount)] = accumulated;
}
private static void doWorkOnList(IList<Point2> points) {
var random = new Random();
int pointCount = points.Count;
Point2 accumulated = Point2.Zero;
for (int index = 0; index < pointCount; ++index) {
accumulated.X += points[index].X;
accumulated.Y += points[index].Y;
}
accumulated /= pointCount;
// make use of the result somewhere so the optimizer can't eliminate the loop
// also modify the input collection so the optimizer can merge the repetition loop
points[random.Next(0, pointCount)] = accumulated;
}
不要试图通过增加元素数量来增加容量。
性能
List For Add: 1ms
Array For Add: 2397ms
Stopwatch watch;
#region --> List For Add <--
List<int> intList = new List<int>();
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 60000; rpt++)
{
intList.Add(rand.Next());
}
watch.Stop();
Console.WriteLine("List For Add: {0}ms", watch.ElapsedMilliseconds);
#endregion
#region --> Array For Add <--
int[] intArray = new int[0];
watch = Stopwatch.StartNew();
int sira = 0;
for (int rpt = 0; rpt < 60000; rpt++)
{
sira += 1;
Array.Resize(ref intArray, intArray.Length + 1);
intArray[rpt] = rand.Next();
}
watch.Stop();
Console.WriteLine("Array For Add: {0}ms", watch.ElapsedMilliseconds);
#endregion
简短的回答:
在。net List中<T>和Array<T>具有相同的速度/性能,因为在。net List中是Array的包装器。
再说一遍:List在里面是数组!在。net List中<T>是其他语言中的<T>数组列表。
详细说明在哪些情况下需要使用什么:
Array need to use: So often as possible. It's fast and takes smallest RAM range for same amount information. If you know exact count of cells needed If data saved in array < 85000 b (85000/32 = 2656 elements for integer data) If needed high Random Access speed List need to use: If needed to add cells to the end of list (often) If needed to add cells in the beginning/middle of the list (NOT OFTEN) If data saved in array < 85000 b (85000/32 = 2656 elements for integer data) If needed high Random Access speed LinkedList need to use: If needed to add cells in the beginning/middle/end of the list (often) If needed only sequential access (forward/backward) If you need to save LARGE items, but items count is low. Better do not use for large amount of items, as it's use additional memory for links. If you not sure that you need LinkedList -- YOU DON'T NEED IT. Just do not use it.
更多的细节:
更多细节:
https://stackoverflow.com/a/29263914/4423545