How do I read and parse an XML file in C#?


Current answer

There are different ways, depending on what you are trying to achieve. XmlDocument is lighter than XDocument, but if you only want a minimal check that a string contains XML, a regular expression is possibly the fastest and lightest choice you can make. For example, I have implemented smoke tests with SpecFlow for my API, and when I want to test whether one of the results is valid XML, I use a regular expression. If I need to extract values from that XML, I parse it with XDocument to get it done faster and with less code. If I have to work with a big XML file (I sometimes work with XML files of around 1M lines, or even more), I use XmlDocument, and I could even read the file line by line. Why? Try opening more than 800MB in private bytes in Visual Studio; even in production you should not have objects bigger than 2GB. You can with a tweak, but you should not. If you had to parse a document containing A LOT of lines, that document would probably be CSV.
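As a rough illustration of the XDocument route for small documents, here is a minimal sketch. The class name, the method name, and the element names (item, name) are made up for this example and do not come from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XDocumentSketch
{
    // Returns the text of every <name> element that sits directly under an <item> element.
    // The element names are hypothetical; adjust them to your own document.
    public static List<string> ReadItemNames(string xml)
    {
        // XDocument.Parse loads the whole document into memory and
        // throws XmlException if the text is not well-formed.
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("item")
                  .Elements("name")
                  .Select(e => e.Value)
                  .ToList();
    }
}
```

Because the whole tree is held in memory, this suits the small-document case the answer describes rather than very large files.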

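I wrote this comment because I see a lot of examples that use XDocument. XDocument is not a good fit for big documents, or for cases where you only want to verify that the content is valid XML. If you want to check whether the XML itself makes sense, you need a Schema.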
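For the schema check, one possible sketch uses XmlReaderSettings with an XSD. The class and method names are hypothetical, and the schema is assumed to be available as a string; in practice it might be loaded from a file instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaValidationSketch
{
    // Validates xml against the XSD text in xsd and returns the validation error messages.
    // An empty list means no schema errors were reported.
    // Text that is not well-formed XML still throws XmlException.
    public static List<string> Validate(string xml, string xsd)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        using (XmlReader xsdReader = XmlReader.Create(new StringReader(xsd)))
        {
            // Passing null uses the targetNamespace declared in the schema itself.
            settings.Schemas.Add(null, xsdReader);
        }
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens as the reader advances
        }
        return errors;
    }
}
```

The reader validates while streaming, so the document does not need to be loaded into a tree first.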

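I also downvoted the suggested answer, because I believe it should include the information above. Imagine I need to verify whether 200M of XML is valid, 10 times an hour. XDocument would waste a lot of resources.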
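For that kind of repeated check on a large input, a streaming pass avoids building a document tree at all. The sketch below uses XmlReader, which is a different technique from the regular expression and XmlDocument options mentioned earlier; the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WellFormedSketch
{
    // Streams through the input once and reports whether it is well-formed XML.
    // No document tree is built, so memory use stays small even for large inputs.
    public static bool IsWellFormed(TextReader input)
    {
        // Assumption: DOCTYPE declarations should be skipped rather than rejected or resolved.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
        try
        {
            using (XmlReader reader = XmlReader.Create(input, settings))
            {
                while (reader.Read()) { } // discard nodes; only parse errors matter
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

For a file on disk, something like WellFormedSketch.IsWellFormed(File.OpenText(path)) would stream it without reading the whole file into a string first.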

prasanna venkatesh also points out that you can try loading the string into a DataSet, which will indicate whether the XML is valid as well.
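A minimal sketch of that DataSet idea might look like the following. The class and method names are hypothetical, and the sketch assumes XmlException is the failure of interest; DataSet.ReadXml can also raise other exceptions for XML whose shape it cannot map to tables, and those are not caught here.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

public static class DataSetSketch
{
    // Tries to load the XML text into a DataSet.
    // Returns false (and a null DataSet) when the text is not well-formed XML.
    public static bool TryLoad(string xml, out DataSet dataSet)
    {
        dataSet = new DataSet();
        try
        {
            dataSet.ReadXml(new StringReader(xml));
            return true;
        }
        catch (XmlException)
        {
            dataSet = null;
            return false;
        }
    }
}
```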

Other answers

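// Requires: using System.Collections.Generic; using System.Web; using System.Xml;
// Assumption: columnNames is a field on the containing class; the original snippet did not show its declaration.
private readonly List<string> columnNames = new List<string>();

public void ReadXmlFile()
{
    // Find the physical location of App_Data on the server.
    string path = HttpContext.Current.Server.MapPath("~/App_Data");
    // Combine the App_Data location with the file name; the using block disposes the reader.
    using (XmlTextReader reader = new XmlTextReader(System.IO.Path.Combine(path, "XMLFile7.xml")))
    {
        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    break;
                case XmlNodeType.Text:
                    columnNames.Add(reader.Value); // collect the value of each text node
                    break;
                case XmlNodeType.EndElement:
                    break;
            }
        }
    }
}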

You can skip the first statement and simply pass the path name to the XmlTextReader constructor.
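A sketch of the single-path variant is shown below. It swaps in XmlReader.Create, which Microsoft's documentation recommends over constructing XmlTextReader directly; the class name, method name, and filePath parameter are hypothetical.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class TextNodeSketch
{
    // Collects the value of every text node in the XML file at filePath.
    public static List<string> ReadTextNodes(string filePath)
    {
        var values = new List<string>();
        // The path (or URI) is passed straight to the reader; no separate MapPath step.
        using (XmlReader reader = XmlReader.Create(filePath))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                {
                    values.Add(reader.Value);
                }
            }
        }
        return values;
    }
}
```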

You can read an XML string using a DataSet.

var xmlString = File.ReadAllText(FILE_PATH);
var stringReader = new StringReader(xmlString);
var dsSet = new DataSet();
dsSet.ReadXml(stringReader);
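Once the DataSet is filled, the data can be walked table by table. The sketch below assumes a simple XML shape whose root contains repeated row-like elements, so that ReadXml can infer tables and columns; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

public static class DataSetDumpSketch
{
    // Flattens every cell of every inferred table into "table.column=value" strings.
    public static List<string> Describe(string xml)
    {
        var ds = new DataSet();
        ds.ReadXml(new StringReader(xml)); // the table schema is inferred from the XML shape
        var lines = new List<string>();
        foreach (DataTable table in ds.Tables)
        {
            foreach (DataRow row in table.Rows)
            {
                foreach (DataColumn col in table.Columns)
                {
                    lines.Add($"{table.TableName}.{col.ColumnName}={row[col]}");
                }
            }
        }
        return lines;
    }
}
```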

Posting this for information.

Here is another approach using Cinchoo ETL, an open-source library that can parse XML files with very few lines of code.

using (var r = ChoXmlReader<Item>.LoadText(xml)
       .WithXPath("//item")
      )
{
    foreach (var rec in r)
        rec.Print();
}

public class Item
{
    public string Name { get; set; }
    public string ProtectionLevel { get; set; }
    public string Description { get; set; }
}

Sample fiddle: https://dotnetfiddle.net/otYq5j

Disclaimer: I'm the author of this library.

For example, take a look at the XmlTextReader class.
