在c#中是否有默认/官方/推荐的方法来解析CSV文件?我不想滚动自己的解析器。

另外,我也见过人们使用ODBC/OLE DB通过文本驱动程序读取CSV的实例,很多人因为它的“缺点”而不鼓励这样做。这些缺点是什么?

理想情况下,我正在寻找一种方法,通过它我可以通过列名读取CSV,使用第一个记录作为报头/字段名。给出的一些答案是正确的,但基本上是将文件反序列化为类。


当前回答

让一个库为你处理所有的细节!: -)

检查FileHelpers和保持干燥-不重复自己-不需要重新发明轮子的亿万次....

基本上,您只需要定义数据的形状——CSV中各个行中的字段——通过一个公共类(以及诸如默认值、NULL值替换等经过精心考虑的属性),将FileHelpers引擎指向一个文件,然后就可以从该文件中获得所有条目。一个简单的操作-卓越的性能!

其他回答

CSV解析器现在是. net框架的一部分。

添加对Microsoft.VisualBasic.dll的引用(在c#中工作良好,不要介意名称)

using (TextFieldParser parser = new TextFieldParser(@"c:\temp\test.csv"))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    while (!parser.EndOfData)
    {
        //Process row
        string[] fields = parser.ReadFields();
        foreach (string field in fields)
        {
            //TODO: Process field
        }
    }
}

文档在这里- TextFieldParser类

附注:如果你需要一个CSV导出器,试试CsvExport (discl:我是贡献者之一)

我已经为。net编写了TinyCsvParser,它是目前最快的。net解析器之一,并且高度可配置,可以解析几乎任何CSV格式。

它在MIT许可下发布:

https://github.com/bytefish/TinyCsvParser

你可以使用NuGet来安装它。在包管理器控制台中运行以下命令。

PM> Install-Package TinyCsvParser

使用

假设我们在CSV文件Persons . CSV中有一个person列表,包含他们的名字、姓氏和生日。

FirstName;LastName;BirthDate
Philipp;Wagner;1986/05/12
Max;Musterman;2014/01/02

系统中相应的域模型可能是这样的。

private class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime BirthDate { get; set; }
}

当使用TinyCsvParser时,您必须定义CSV数据中的列和域模型中的属性之间的映射。

private class CsvPersonMapping : CsvMapping<Person>
{

    public CsvPersonMapping()
        : base()
    {
        MapProperty(0, x => x.FirstName);
        MapProperty(1, x => x.LastName);
        MapProperty(2, x => x.BirthDate);
    }
}

然后,我们可以使用映射使用CsvParser解析CSV数据。

namespace TinyCsvParser.Test
{
    [TestFixture]
    public class TinyCsvParserTest
    {
        [Test]
        public void TinyCsvTest()
        {
            CsvParserOptions csvParserOptions = new CsvParserOptions(true, new[] { ';' });
            CsvPersonMapping csvMapper = new CsvPersonMapping();
            CsvParser<Person> csvParser = new CsvParser<Person>(csvParserOptions, csvMapper);

            var result = csvParser
                .ReadFromFile(@"persons.csv", Encoding.ASCII)
                .ToList();

            Assert.AreEqual(2, result.Count);

            Assert.IsTrue(result.All(x => x.IsValid));

            Assert.AreEqual("Philipp", result[0].Result.FirstName);
            Assert.AreEqual("Wagner", result[0].Result.LastName);

            Assert.AreEqual(1986, result[0].Result.BirthDate.Year);
            Assert.AreEqual(5, result[0].Result.BirthDate.Month);
            Assert.AreEqual(12, result[0].Result.BirthDate.Day);

            Assert.AreEqual("Max", result[1].Result.FirstName);
            Assert.AreEqual("Mustermann", result[1].Result.LastName);

            Assert.AreEqual(2014, result[1].Result.BirthDate.Year);
            Assert.AreEqual(1, result[1].Result.BirthDate.Month);
            Assert.AreEqual(1, result[1].Result.BirthDate.Day);
        }
    }
}

用户指南

完整的用户指南可在以下网址查阅:

http://bytefish.github.io/TinyCsvParser/

如果任何人想要一个代码片段,他们可以直接输入自己的代码,而不必绑定库或下载包。以下是我写的一个版本:

    public static string FormatCSV(List<string> parts)
    {
        string result = "";

        foreach (string s in parts)
        {
            if (result.Length > 0)
            {
                result += ",";

                if (s.Length == 0)
                    continue;
            }

            if (s.Length > 0)
            {
                result += "\"" + s.Replace("\"", "\"\"") + "\"";
            }
            else
            {
                // cannot output double quotes since its considered an escape for a quote
                result += ",";
            }
        }

        return result;
    }

    enum CSVMode
    {
        CLOSED = 0,
        OPENED_RAW = 1,
        OPENED_QUOTE = 2
    }

    public static List<string> ParseCSV(string input)
    {
        List<string> results;

        CSVMode mode;

        char[] letters;

        string content;


        mode = CSVMode.CLOSED;

        content = "";
        results = new List<string>();
        letters = input.ToCharArray();

        for (int i = 0; i < letters.Length; i++)
        {
            char letter = letters[i];
            char nextLetter = '\0';

            if (i < letters.Length - 1)
                nextLetter = letters[i + 1];

            // If its a quote character
            if (letter == '"')
            {
                // If that next letter is a quote
                if (nextLetter == '"' && mode == CSVMode.OPENED_QUOTE)
                {
                    // Then this quote is escaped and should be added to the content

                    content += letter;

                    // Skip the escape character
                    i++;
                    continue;
                }
                else
                {
                    // otherwise its not an escaped quote and is an opening or closing one
                    // Character is skipped

                    // If it was open, then close it
                    if (mode == CSVMode.OPENED_QUOTE)
                    {
                        results.Add(content);

                        // reset the content
                        content = "";

                        mode = CSVMode.CLOSED;

                        // If there is a next letter available
                        if (nextLetter != '\0')
                        {
                            // If it is a comma
                            if (nextLetter == ',')
                            {
                                i++;
                                continue;
                            }
                            else
                            {
                                throw new Exception("Expected comma. Found: " + nextLetter);
                            }
                        }
                    }
                    else if (mode == CSVMode.OPENED_RAW)
                    {
                        // If it was opened raw, then just add the quote 
                        content += letter;
                    }
                    else if (mode == CSVMode.CLOSED)
                    {
                        // Otherwise open it as a quote 

                        mode = CSVMode.OPENED_QUOTE;
                    }
                }
            }
            // If its a comma seperator
            else if (letter == ',')
            {
                // If in quote mode
                if (mode == CSVMode.OPENED_QUOTE)
                {
                    // Just read it
                    content += letter;
                }
                // If raw, then close the content
                else if (mode == CSVMode.OPENED_RAW)
                {
                    results.Add(content);

                    content = "";

                    mode = CSVMode.CLOSED;
                }
                // If it was closed, then open it raw
                else if (mode == CSVMode.CLOSED)
                {
                    mode = CSVMode.OPENED_RAW;

                    results.Add(content);

                    content = "";
                }
            }
            else
            {
                // If opened quote, just read it
                if (mode == CSVMode.OPENED_QUOTE)
                {
                    content += letter;
                }
                // If opened raw, then read it
                else if (mode == CSVMode.OPENED_RAW)
                {
                    content += letter;
                }
                // It closed, then open raw
                else if (mode == CSVMode.CLOSED)
                {
                    mode = CSVMode.OPENED_RAW;

                    content += letter;
                }
            }
        }

        // If it was still reading when the buffer finished
        if (mode != CSVMode.CLOSED)
        {
            results.Add(content);
        }

        return results;
    }

对于较小的CSV输入数据,LINQ是完全足够的。 以以下CSV文件内容为例:

schema_name、描述utype “IX_HE”、“高能量数据”,“x” “III_spectro”、“Spectrosopic数据”、“d” “VI_misc”、“杂”、“f” “vcds1”,“目录只在cd上提供”,“d” “J_other”,“其他期刊发表的文章”,“b”

当我们将整个内容读入一个名为data的字符串时,则

using System;
using System.IO;
using System.Linq;

var data = File.ReadAllText(Path2CSV);

// helper split characters
var newline = Environment.NewLine.ToCharArray();
var comma = ",".ToCharArray();
var quote = "\"".ToCharArray();

// split input string data to lines
var lines = data.Split(newline);

// first line is header, take the header fields
foreach (var col in lines.First().Split(comma)) {
    // do something with "col"
}

// we skip the first line, all the rest are real data lines/fields
foreach (var line in lines.Skip(1)) {
    // first we split the data line by comma character
    // next we remove double qoutes from each splitted element using Trim()
    // finally we make an array
    var fields = line.Split(comma)
        .Select(_ => { _ = _.Trim(quote); return _; })
        .ToArray();
    // do something with the "fields" array
}

CsvHelper(我维护的一个库)将把CSV文件读入自定义对象。

using (var reader = new StreamReader("path\\to\\file.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<Foo>();
}

有时候你并不拥有你想读的对象。在这种情况下,您可以使用流畅映射,因为您不能将属性放在类上。

public sealed class MyCustomObjectMap : CsvClassMap<MyCustomObject>
{
    public MyCustomObjectMap()
    {
        Map( m => m.Property1 ).Name( "Column Name" );
        Map( m => m.Property2 ).Index( 4 );
        Map( m => m.Property3 ).Ignore();
        Map( m => m.Property4 ).TypeConverter<MySpecialTypeConverter>();
    }
}