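What is the best way to read a large file into a byte array in C#?

I have a web server that reads large binary files (several megabytes) into byte arrays. The server may be reading several files at the same time (different page requests), so I am looking for the most optimized way to do this without taxing the CPU too much. Is the code below good enough?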

public byte[] FileToByteArray(string fileName)
{
    byte[] buff = null;
    FileStream fs = new FileStream(fileName, 
                                   FileMode.Open, 
                                   FileAccess.Read);
    BinaryReader br = new BinaryReader(fs);
    long numBytes = new FileInfo(fileName).Length;
    buff = br.ReadBytes((int) numBytes);
    return buff;
}

Current answer

I'd recommend trying the Response.TransmitFile() method, followed by Response.Flush() and Response.End(), for serving your large files.

2010-01-19 23:37:55

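Other answers

I might argue that the answer here is generally "don't". Unless you absolutely need all the data at once, consider using a Stream-based API (or some variant of reader / iterator). That is especially important when you have multiple parallel operations (as the question suggests), to minimise system load and maximise throughput.

For example, if you are streaming data to a caller: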

Stream dest = ...
using(Stream source = File.OpenRead(path)) {
    byte[] buffer = new byte[2048];
    int bytesRead;
    while((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0) {
        dest.Write(buffer, 0, bytesRead);
    }
}
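
As a side note, the manual read/write loop above can also be expressed with the built-in Stream.CopyTo method (available since .NET Framework 4). The following is a minimal sketch, not part of the original answer; the class name StreamCopyExample and the method name CopyFileTo are made up for illustration.

```csharp
using System.IO;

// Hypothetical helper for illustration; the names are not from the thread.
public static class StreamCopyExample
{
    // Copies the file at 'path' into 'dest' without loading the whole
    // file into memory. Stream.CopyTo runs a buffered read/write loop
    // internally; 81920 is the buffer size in bytes.
    public static void CopyFileTo(string path, Stream dest)
    {
        using (Stream source = File.OpenRead(path))
        {
            source.CopyTo(dest, 81920);
        }
    }
}
```

In a web server, dest would typically be the response's output stream, so each request needs only one small buffer regardless of file size.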

2010-01-08 21:44:33

If "a large file" means one beyond the 4 GB limit, then the following code logic is appropriate. The key point is the long data type used with the Seek method, since a long can address offsets beyond the 2^32 boundary. In this example, the code first processes the large file in chunks of roughly 1 GB; after the whole 1 GB chunks are processed, the leftover (<1 GB) bytes are processed. I use this code to calculate the CRC of files larger than 4 GB (using https://crc32c.machinezoo.com/ for the crc32c calculation in this example).

private uint Crc32CAlgorithmBigCrc(string fileName)
{
    uint hash = 0;
    long fileLength = new FileInfo(fileName).Length;
    int blockSize = 1024000000;
    // Keep the block count and offsets as long so the arithmetic cannot overflow int.
    long blocks = fileLength / blockSize;
    int restBytes = (int)(fileLength - (blocks * blockSize));
    long offsetFile = 0;
    bool firstBlock = true;
    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    using (BinaryReader br = new BinaryReader(fs))
    {
        while (blocks > 0)
        {
            blocks -= 1;
            fs.Seek(offsetFile, SeekOrigin.Begin);
            byte[] buffer = br.ReadBytes(blockSize);
            if (firstBlock)
            {
                firstBlock = false;
                hash = Crc32CAlgorithm.Compute(buffer);
            }
            else
            {
                // Chain from the running hash, not from the first block's hash.
                hash = Crc32CAlgorithm.Append(hash, buffer);
            }
            offsetFile += blockSize;
        }
        if (restBytes > 0)
        {
            // Process the leftover bytes after the last full block.
            fs.Seek(offsetFile, SeekOrigin.Begin);
            byte[] buffer = br.ReadBytes(restBytes);
            hash = Crc32CAlgorithm.Append(hash, buffer);
        }
    }
    return hash;
}
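
The chunked approach above could also be written with a single reusable buffer and no explicit seeking, since a FileStream advances its position on every read. The sketch below is an illustration only, not the answer author's code; ChunkedReader, ForEachChunk, and the onChunk callback are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; the names are not from the thread.
public static class ChunkedReader
{
    // Reads 'fileName' sequentially in chunks of at most 'chunkSize' bytes,
    // reusing one buffer, and passes each chunk to 'onChunk' as
    // (buffer, number of valid bytes). The running total is a long, so
    // files larger than 4 GB are counted correctly.
    public static long ForEachChunk(string fileName, int chunkSize, Action<byte[], int> onChunk)
    {
        long total = 0;
        byte[] buffer = new byte[chunkSize];
        using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        {
            int bytesRead;
            // Read may return fewer bytes than requested; 0 means end of file.
            while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                onChunk(buffer, bytesRead);
                total += bytesRead;
            }
        }
        return total;
    }
}
```

A checksum could then be accumulated inside the callback, assuming the hashing library in use offers an append call that accepts an offset and a length, because only the first bytesRead bytes of the buffer are valid on each call.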

2019-04-26 04:16:45

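If you are dealing with files larger than 2 GB, you will find that the above methods fail.

It is much easier to hand the stream off to MD5 and let it chunk the file for you: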

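private byte[] computeFileHash(string filename)
{
    // Dispose the hash object when done, and open the file with read-only
    // access so that read-only files can be hashed too.
    using (MD5 md5 = MD5.Create())
    using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        return md5.ComputeHash(fs);
    }
}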

2014-10-20 09:56:23

I would say BinaryReader is fine, but it can be refactored to this, instead of all those lines of code for getting the length of the buffer:

public byte[] FileToByteArray(string fileName)
{
    byte[] fileData = null;

    using (FileStream fs = File.OpenRead(fileName)) 
    { 
        using (BinaryReader binaryReader = new BinaryReader(fs))
        {
            fileData = binaryReader.ReadBytes((int)fs.Length); 
        }
    }
    return fileData;
}

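This should be better than using .ReadAllBytes(): in the comments on the top response, which uses .ReadAllBytes(), one commenter reported problems with files larger than 600 MB, whereas BinaryReader is meant for this sort of thing. Also, putting it in a using statement ensures the FileStream and BinaryReader are closed and disposed.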
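
Whichever read call is used, loading a whole file into one array only makes sense below some size limit, and a cast such as (int)fs.Length silently truncates lengths above int.MaxValue. A minimal sketch of checking the size first is shown here; the class name, the method name ReadAllBytesChecked, and the maxBytes parameter are hypothetical and not from the answer.

```csharp
using System.IO;

// Hypothetical helper for illustration; the names are not from the thread.
public static class CheckedFileReader
{
    // Reads the whole file into a byte array, but refuses files larger
    // than 'maxBytes' instead of failing partway through the read or
    // truncating the length in an int cast.
    public static byte[] ReadAllBytesChecked(string fileName, long maxBytes)
    {
        long length = new FileInfo(fileName).Length;
        if (length > maxBytes)
        {
            throw new IOException(
                "File is " + length + " bytes, which exceeds the limit of " + maxBytes + " bytes.");
        }
        return File.ReadAllBytes(fileName);
    }
}
```

Callers that hit the limit can fall back to a stream-based approach rather than holding the whole file in memory.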

2016-10-12 00:18:24
