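I know that instantiated arrays of value types in C# are automatically populated with the default value of the type (e.g. false for bool, 0 for int, etc.).
Is there a way to auto-populate an array with a seed value that is not the default, either on creation or with a built-in method afterwards (like Java's Arrays.fill())? Say I wanted a boolean array that was true by default instead of false. Is there a built-in way to do this, or do you just have to iterate through the array with a for loop?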
    // Example pseudo-code (Array.Populate is the hypothetical method I am looking for):
    bool[] abValues = new bool[1000000];
    Array.Populate(abValues, true);

    // Currently how I'm handling this:
    bool[] abValues = new bool[1000000];
    for (int i = 0; i < abValues.Length; i++)
    {
        abValues[i] = true;
    }
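Having to iterate through the array and "reset" each value to true seems inefficient. Is there any way around this? Maybe by flipping all the values?
After typing this question out and thinking about it, I'm guessing that the default values are simply a result of how C# handles the memory allocation of these objects behind the scenes, so I imagine it's probably not possible. But I'd still like to know for sure!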
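Many of the answers presented here boil down to a loop that initializes the array one element at a time, which does not take advantage of CPU instructions designed to operate on a block of memory at once.
.NET Standard 2.1 (in preview as of this writing) provides Array.Fill(), which lends itself to a high-performance implementation in the runtime library (though as of now, .NET Core does not seem to take advantage of that possibility).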
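For reference, a minimal sketch of what calling the built-in looks like, assuming a target framework that ships Array.Fill (.NET Core 2.0 or later, or .NET Standard 2.1). The class name FillDemo and the variable names are made up for illustration; the Enumerable.Repeat line is an alternative that also works on older frameworks, at the cost of going through LINQ.

```csharp
using System;
using System.Linq;

public static class FillDemo
{
    public static void Main()
    {
        // Built-in fill: sets every element of an existing array to the value.
        var flags = new bool[1000000];
        Array.Fill(flags, true);

        // Alternative for older frameworks: build the array through LINQ.
        bool[] repeated = Enumerable.Repeat(true, 1000000).ToArray();

        Console.WriteLine(flags.All(b => b));    // True
        Console.WriteLine(repeated.All(b => b)); // True
    }
}
```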
For those on earlier platforms, the following extension method outperforms a trivial loop by a substantial margin when the array size is significant. I created it when my solution for an online code challenge was around 20% over the allocated time budget. It reduced the runtime by around 70%. In this case, the array fill was performed inside another loop. BLOCK_SIZE was set by gut feeling rather than experiment. Some optimizations are possible (e.g. copying all bytes already set to the desired value rather than a fixed-size block).
    using System;

    public static class ArrayExtensions
    {
        // Number of elements initialized by hand before block copying takes over.
        internal const int BLOCK_SIZE = 256;

        public static void Fill<T>(this T[] array, T value)
        {
            if (array.Length < 2 * BLOCK_SIZE)
            {
                // Small arrays: a plain loop is cheaper than setting up block copies.
                for (int i = 0; i < array.Length; i++) array[i] = value;
            }
            else
            {
                int fullBlocks = array.Length / BLOCK_SIZE;

                // Initialize first block
                for (int j = 0; j < BLOCK_SIZE; j++) array[j] = value;

                // Copy successive full blocks
                for (int blk = 1; blk < fullBlocks; blk++)
                {
                    Array.Copy(array, 0, array, blk * BLOCK_SIZE, BLOCK_SIZE);
                }

                // Fill the tail that does not make up a full block
                for (int rem = fullBlocks * BLOCK_SIZE; rem < array.Length; rem++)
                {
                    array[rem] = value;
                }
            }
        }
    }
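The optimization hinted at above (copying everything that is already set, rather than a fixed-size block) could look roughly like the sketch below. This is an illustration and not a measured improvement: the class name ArrayFillSketch and the method name FillDoubling are hypothetical, and no benchmark claim is made for it.

```csharp
using System;

public static class ArrayFillSketch
{
    // Hypothetical variant: seed one element, then repeatedly copy the
    // already-filled prefix onto the region right after it, so the filled
    // region doubles on each pass until the array is full.
    public static void FillDoubling<T>(T[] array, T value)
    {
        if (array.Length == 0) return;

        array[0] = value;
        int filled = 1;

        while (filled < array.Length)
        {
            // Never copy more than what is left to fill.
            int count = Math.Min(filled, array.Length - filled);
            Array.Copy(array, 0, array, filled, count);
            filled += count;
        }
    }
}
```

With this shape the number of Array.Copy calls grows logarithmically with the array length instead of linearly, although each call moves more data; whether that pays off would need measuring on the target runtime.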
I'm a bit surprised nobody has posted the very simple, yet ultra-fast, SIMD version:
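    // Requires: using System.Numerics;
    // Note: Vector<T> only supports primitive numeric element types.
    public static void PopulateSimd<T>(T[] array, T value) where T : struct
    {
        var vector = new Vector<T>(value);
        var i = 0;
        var s = Vector<T>.Count;
        // Vector<T>.Count is a power of two, so this rounds Length down to a multiple of s.
        var l = array.Length & ~(s - 1);
        // Write whole vectors first, then finish the remainder element by element.
        for (; i < l; i += s) vector.CopyTo(array, i);
        for (; i < array.Length; i++) array[i] = value;
    }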
Benchmarks (numbers are for Framework 4.8, but Core 3.1 is statistically the same):
| Method | N | Mean | Error | StdDev | Ratio | RatioSD |
|----------- |-------- |---------------:|---------------:|--------------:|------:|--------:|
| DarthGizka | 10 | 25.975 ns | 1.2430 ns | 0.1924 ns | 1.00 | 0.00 |
| Simd | 10 | 3.438 ns | 0.4427 ns | 0.0685 ns | 0.13 | 0.00 |
| | | | | | | |
| DarthGizka | 100 | 81.155 ns | 3.8287 ns | 0.2099 ns | 1.00 | 0.00 |
| Simd | 100 | 12.178 ns | 0.4547 ns | 0.0704 ns | 0.15 | 0.00 |
| | | | | | | |
| DarthGizka | 1000 | 201.138 ns | 8.9769 ns | 1.3892 ns | 1.00 | 0.00 |
| Simd | 1000 | 100.397 ns | 4.0965 ns | 0.6339 ns | 0.50 | 0.00 |
| | | | | | | |
| DarthGizka | 10000 | 1,292.660 ns | 38.4965 ns | 5.9574 ns | 1.00 | 0.00 |
| Simd | 10000 | 1,272.819 ns | 68.5148 ns | 10.6027 ns | 0.98 | 0.01 |
| | | | | | | |
| DarthGizka | 100000 | 16,156.106 ns | 366.1133 ns | 56.6564 ns | 1.00 | 0.00 |
| Simd | 100000 | 17,627.879 ns | 1,589.7423 ns | 246.0144 ns | 1.09 | 0.02 |
| | | | | | | |
| DarthGizka | 1000000 | 176,625.870 ns | 32,235.9957 ns | 1,766.9637 ns | 1.00 | 0.00 |
| Simd | 1000000 | 186,812.920 ns | 18,069.1517 ns | 2,796.2212 ns | 1.07 | 0.01 |
As can be seen, it is much faster below 10,000 elements, and only slightly slower beyond that.
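I realize I'm late to the party, but here's an idea. Write a wrapper that has conversion operators to and from the wrapped value, so that it can be used as a stand-in for the wrapped type. This was actually inspired by the silly answer from @l33t.
First (coming from C++) I realized that in C# a default ctor is not called when the elements of an array are constructed. Instead, even in the presence of a user-defined default constructor, all array elements are zero-initialized. That did surprise me.
So a wrapper class that simply provides a default ctor with the desired value would work for arrays in C++, but not in C#. A workaround is to let the wrapper type map 0 to the desired seed value upon conversion. That way, zero-initialized values appear to be initialized with the seed for all practical purposes: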
    using System;
    using System.Linq;

    public struct MyBool
    {
        // Stored inverted, so the zero-initialized state (false) reads back as true.
        private bool _invertedValue;

        public MyBool(bool b)
        {
            _invertedValue = !b;
        }

        public static implicit operator MyBool(bool b)
        {
            return new MyBool(b);
        }

        public static implicit operator bool(MyBool mb)
        {
            return !mb._invertedValue;
        }
    }

    public static class Program
    {
        static void Main(string[] args)
        {
            MyBool mb = false; // should expose false.
            Console.Out.WriteLine("false init gives false: "
                                  + !mb);

            MyBool[] fakeBoolArray = new MyBool[100];
            Console.Out.WriteLine("Default array elems are true: "
                                  + fakeBoolArray.All(b => b));

            fakeBoolArray[21] = false;
            Console.Out.WriteLine("Assigning false worked: "
                                  + !fakeBoolArray[21]);

            fakeBoolArray[21] = true;
            // Should define ToString() on a MyBool,
            // hence the !! to force bool
            Console.Out.WriteLine("Assigning true again worked: "
                                  + !!fakeBoolArray[21]);
        }
    }
This pattern is applicable to all value types. One could, for example, map 0 to 4 for ints if initialization with 4 was desired.
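A minimal sketch of that int case, under the assumption that storing the difference from the seed is an acceptable mapping (any reversible mapping that sends 0 to the seed would do). The names IntSeed4 and SeedDemo are invented for this example.

```csharp
using System;

public struct IntSeed4
{
    private const int Seed = 4;

    // Stores (value - Seed), so the zero-initialized state reads back as Seed.
    private int _delta;

    public IntSeed4(int value)
    {
        // unchecked: wrap-around is fine because the reverse mapping undoes it.
        _delta = unchecked(value - Seed);
    }

    public static implicit operator IntSeed4(int value) => new IntSeed4(value);

    public static implicit operator int(IntSeed4 s) => unchecked(s._delta + Seed);
}

public static class SeedDemo
{
    public static void Main()
    {
        var values = new IntSeed4[3]; // zero-initialized by the runtime

        int untouched = values[0];
        Console.WriteLine(untouched); // 4

        values[1] = 10;
        int assigned = values[1];
        Console.WriteLine(assigned);  // 10
    }
}
```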
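I'd love to make a template of it, as would be possible in C++, providing the seed value as a template parameter, but I understand that's not possible in C#. Or am I missing something? (Of course, in C++ the mapping is not necessary at all, because one can provide a default ctor that will be called for array elements.)
FWIW, here's a C++ equivalent: https://ideone.com/wG8yEh.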