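I want to read a text file line by line. I'd like to know whether I'm doing it as efficiently as possible within the scope of .NET C#.

This is what I'm trying so far: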

var filestream = new System.IO.FileStream(textFilePath,
                                          System.IO.FileMode.Open,
                                          System.IO.FileAccess.Read,
                                          System.IO.FileShare.ReadWrite);
var file = new System.IO.StreamReader(filestream, System.Text.Encoding.UTF8, true, 128);

string lineOfText;
while ((lineOfText = file.ReadLine()) != null)
{
    //Do something with the lineOfText
}

Current answer

Use the following code:

foreach (string line in File.ReadAllLines(fileName))

This makes a huge difference in read performance.

It comes at the cost of memory consumption, but it's totally worth it!

Other answers

There is a good discussion of this in the Stack Overflow question Is 'yield return' slower than "old school" return?

It says:

ReadAllLines loads all of the lines into memory and returns a string[]. All well and good if the file is small. If the file is larger than will fit in memory, you'll run out of memory. ReadLines, on the other hand, uses yield return to return one line at a time. With it, you can read any size file. It doesn't load the whole file into memory. Say you wanted to find the first line that contains the word "foo", and then exit. Using ReadAllLines, you'd have to read the entire file into memory, even if "foo" occurs on the first line. With ReadLines, you only read one line. Which one would be faster?
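
The "find the first line containing foo" case from the quote can be sketched roughly as follows. The class and method names here are made up for illustration; only File.ReadLines, File.ReadAllLines and LINQ's FirstOrDefault are real framework calls. Both methods return the same line, and the difference lies in how much of the file gets read before the answer is available.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper class, not part of the original answers.
public static class FirstMatchDemo
{
    // Lazy: File.ReadLines streams the file, so enumeration stops
    // as soon as FirstOrDefault finds a matching line.
    public static string FindFirstLazy(string path, string needle)
    {
        return File.ReadLines(path).FirstOrDefault(line => line.Contains(needle));
    }

    // Eager: File.ReadAllLines builds the complete string[] first,
    // and only then does the search begin.
    public static string FindFirstEager(string path, string needle)
    {
        return File.ReadAllLines(path).FirstOrDefault(line => line.Contains(needle));
    }
}
```

Both methods return null when no line matches. Which one is faster for a given file is something to measure rather than assume.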

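While File.ReadAllLines() is one of the simplest ways to read a file, it is also one of the slowest.

If you just want to read the lines of a file without doing much with them, then according to these benchmarks, the fastest way to read a file is the age-old method: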

using (StreamReader sr = File.OpenText(fileName))
{
        string s = String.Empty;
        while ((s = sr.ReadLine()) != null)
        {
               //do minimal amount of work here
        }
}

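However, if you have to do a lot of work on each line, then this article concludes that the best approach is the following (pre-allocating a string[] is faster if you know how many lines you are going to read):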

//MAX must be at least the number of lines in the file
string[] AllLines = new string[MAX]; //only allocate memory here
int lineCount = 0; //number of lines actually read

using (StreamReader sr = File.OpenText(fileName))
{
        while (!sr.EndOfStream)
        {
               AllLines[lineCount] = sr.ReadLine();
               lineCount += 1;
        }
} //Finished. Close the file

//Now parallel process each line that was read (unused slots are skipped)
Parallel.For(0, lineCount, x =>
{
    DoYourStuff(AllLines[x]); //do your work here
});

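If you're using .NET 4, simply use File.ReadLines, which does all of this for you. I suspect it is much the same as your code, except that it may also use FileOptions.SequentialScan and a larger buffer (128 seems very small).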
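
For anyone who wants to keep the explicit FileStream from the question, a minimal sketch with the two tweaks mentioned above might look like this. The class name, method name and the 64 KB default buffer are assumptions for illustration, not measured recommendations, and whether File.ReadLines itself uses SequentialScan is only a guess in the answer above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper class, not part of the original answers.
public static class SequentialLineReader
{
    // Streams lines lazily, like File.ReadLines, but with an explicit
    // sequential-scan hint and a caller-chosen buffer size.
    public static IEnumerable<string> ReadLinesSequential(string path, int bufferSize = 65536)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.ReadWrite, bufferSize,
                                           FileOptions.SequentialScan))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true, bufferSize))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}
```

It is used the same way as File.ReadLines, for example foreach (string line in SequentialLineReader.ReadLinesSequential(textFilePath)). The file stays open until the enumeration finishes or the enumerator is disposed.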

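If you have enough memory, I've found some performance gains by reading the entire file into a memory stream and then opening a stream reader on that to read the lines. As long as you actually plan on reading the whole file anyway, this can yield some improvement.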
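
A rough sketch of that idea, assuming the file fits comfortably in memory. The class and method names are hypothetical, and StreamReader's default encoding handling (UTF-8 with byte-order-mark detection) is assumed to be acceptable for the file:

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class, not part of the original answers.
public static class InMemoryLineReader
{
    public static List<string> ReadLinesViaMemory(string path)
    {
        // One bulk read pulls the whole file into memory.
        byte[] bytes = File.ReadAllBytes(path);
        var lines = new List<string>();

        // Line splitting then runs against memory instead of the disk.
        using (var memory = new MemoryStream(bytes, writable: false))
        using (var reader = new StreamReader(memory))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

This holds both the raw bytes and the decoded strings in memory at once, so it trades memory for fewer disk reads. Any speed-up should be confirmed by measuring on the actual files.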
