3 methods of parsing XML in C#
C# and .NET offer many ways to accomplish the same task, and each new version adds more classes and methods. Reading and parsing XML is no exception. Below I’ll describe 3 methods for reading XML data in C#.
First of all, we have an XML file named "companies.xml". It’s a list of companies and their properties. As you can see, not every company has the same set of properties.
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<companies>
  <header>
    <version>1.0</version>
  </header>
  <company>
    <cmp_code_nsd>AVMI</cmp_code_nsd>
    <full_name>Avocet Mining PLC</full_name>
    <short_name>Avocet Mining PLC</short_name>
    <full_name_en>Avocet Mining PLC</full_name_en>
    <short_name_en>Avocet Mining PLC</short_name_en>
    <country>GBR</country>
    <address>3 Floor, 30 Haymarket, London SW1Y 4EX, United Kingdom</address>
    <post_address>3 Floor, 30 Haymarket, London SW1Y 4EX, United Kingdom</post_address>
    <phone>+44 20 7766 7676</phone>
    <fax>+44 20 7766 7699</fax>
    <www>is www.avocet.co.uk.</www>
  </company>
  <company>
    <cmp_code_nsd>PIF12904</cmp_code_nsd>
    <full_name>PowerShares DB Silver Fund</full_name>
    <short_name>PowerShares DB Silver Fund</short_name>
    <full_name_en>PowerShares DB Silver Fund</full_name_en>
    <short_name_en>PowerShares DB Silver Fund</short_name_en>
    <country>USA</country>
  </company>
  <company>
    <cmp_code_nsd>ZUHL</cmp_code_nsd>
    <full_name>Zublin Immobilien Holding AG</full_name>
    <short_name>Zublin Immobilien Holding</short_name>
    <full_name_en>Zublin Immobilien Holding AG</full_name_en>
    <short_name_en>Zublin Immobilien Holding</short_name_en>
    <country>CHE</country>
    <address>Claridenstrasse 20, CH-8002 Zurich, Switzerland</address>
    <post_address>Claridenstrasse 20, CH-8002 Zurich, Switzerland</post_address>
    <phone>+41 (0)44 206 29 39</phone>
    <e_mail>info@zueblin.ch</e_mail>
    <www>www.zueblin.ch</www>
  </company>
</companies>
Method 1 – Parse XML using XmlDocument
XmlDocument lives in the System.Xml namespace (assembly System.Xml.dll) and has been available since the first versions of the .NET Framework.
Below is code that displays the "short_name" field of each company. We load the XML document from the file with the "Load" method. Then we set the "xpath" variable, which depends on the structure of the XML document; in our case it’s "companies/company". Finally we select the list of nodes that match this XPath and write them to a LiteralControl.
// Requires: using System; using System.Xml; using System.Web.UI;
string path = @"c:\XMLFile1.xml"; // verbatim string, so the backslash is not treated as an escape
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(path);
string xpath = "companies/company";
var nodes = xmlDoc.SelectNodes(xpath);
// Add a literal control to display HTML code
LiteralControl lt1 = new LiteralControl();
form1.Controls.Add(lt1);
foreach (XmlNode companyNode in nodes)
{
    // Relative XPath: search inside the current company only.
    // A path starting with "//" would search the whole document instead.
    XmlNode subNode = companyNode.SelectSingleNode("short_name");
    lt1.Text += subNode.InnerText + "<br />";
}

With this approach we have to write code for every property we want to display and think about exceptions. If a property is missing from a company, the lookup returns null, and reading its text throws an exception (typically a NullReferenceException).

One solution is to use "try-catch" inside the loop:

foreach (XmlNode companyNode in nodes)
{
    try
    {
        XmlNode subNode = companyNode.SelectSingleNode("short_name");
        lt1.Text += subNode.InnerText + "<br />";
    }
    catch (NullReferenceException)
    {
        // This company has no short_name element: skip it
    }
}
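An alternative to catching the exception is to test the lookup result for null before using it. The console sketch below is only an illustration under assumptions: the class name ShortNameDemo, the method GetShortNames and the trimmed-down in-memory sample are hypothetical, and the XML is loaded from a string with LoadXml instead of from the file.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class ShortNameDemo
{
    // Hypothetical sample: the second company has no short_name element.
    public const string SampleXml =
        "<companies>" +
        "<company><short_name>Avocet Mining PLC</short_name></company>" +
        "<company><country>USA</country></company>" +
        "<company><short_name>Zublin Immobilien Holding</short_name></company>" +
        "</companies>";

    // Returns the short_name of every company that has one.
    public static List<string> GetShortNames(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var names = new List<string>();
        foreach (XmlNode company in doc.SelectNodes("companies/company"))
        {
            // SelectSingleNode returns null when nothing matches.
            XmlNode shortName = company.SelectSingleNode("short_name");
            if (shortName == null)
            {
                continue; // skip companies without a short_name
            }
            names.Add(shortName.InnerText);
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in GetShortNames(SampleXml))
        {
            Console.WriteLine(name);
        }
    }
}
```

The null check keeps the normal case and the missing-element case visible in the code, while an empty catch block could also hide unrelated errors.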
Method 2 – Parse XML using XmlReader
XmlReader is also found in the System.Xml assembly and has been available since the first versions of the .NET Framework.
With an XmlReader you can find an element in your XML document and do something with it. The "ReadToFollowing(String)" method reads the XML until an element with the specified qualified name is found; you can then, for example, put its content into a string variable with "ReadElementContentAsString". There are many ReadElementContentAsXXX methods; they are described in the XmlReader documentation on MSDN.
// Requires: using System.Xml; using System.Web.UI;
string path = @"c:\XMLFile1.xml"; // verbatim string, so the backslash is not treated as an escape
// Add a literal control to display HTML code
LiteralControl lt1 = new LiteralControl();
form1.Controls.Add(lt1);
using (XmlReader reader = XmlReader.Create(path))
{
    // Move forward to each short_name element in turn
    while (reader.ReadToFollowing("short_name"))
    {
        lt1.Text += reader.ReadElementContentAsString();
        lt1.Text += "<br />";
    }
}
This method works well when you only need a few field values from the XML document. Code like this is quick to write and to use. But I don’t recommend this method for XML documents with a complex structure.
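The same forward-only reading can be tried outside a web page. The sketch below is a console illustration under assumptions: the class ReaderDemo, its two methods and the shortened in-memory sample are hypothetical. Besides ReadElementContentAsString it uses one of the typed variants, ReadElementContentAsDecimal, on the "version" element of the header.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderDemo
{
    // Hypothetical, shortened version of the companies document.
    public const string SampleXml =
        "<companies>" +
        "<header><version>1.0</version></header>" +
        "<company><short_name>Avocet Mining PLC</short_name><country>GBR</country></company>" +
        "<company><short_name>PowerShares DB Silver Fund</short_name><country>USA</country></company>" +
        "</companies>";

    // Reads the header version as a typed value.
    public static decimal ReadVersion(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            if (!reader.ReadToFollowing("version"))
            {
                throw new InvalidOperationException("No version element found.");
            }
            return reader.ReadElementContentAsDecimal();
        }
    }

    // Collects the text of every short_name element, in document order.
    public static List<string> ReadShortNames(string xml)
    {
        var names = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.ReadToFollowing("short_name"))
            {
                names.Add(reader.ReadElementContentAsString());
            }
        }
        return names;
    }

    public static void Main()
    {
        Console.WriteLine(ReadVersion(SampleXml));
        foreach (string name in ReadShortNames(SampleXml))
        {
            Console.WriteLine(name);
        }
    }
}
```

Because the reader only moves forward, each method opens its own reader; collecting several unrelated fields in one pass takes more bookkeeping, which is one reason this approach suits simple documents best.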
Method 3 – Parse XML using XmlSerializer
XmlSerializer is the most powerful of the three tools for parsing XML in C#. This class lives in the System.Xml.Serialization namespace, in the assembly System.Xml.dll, and has been available since the first versions of the .NET Framework.
MSDN (http://msdn.microsoft.com/ru-ru/library/system.xml.serialization.xmlserializer(v=vs.110).aspx) gives good descriptions of what serialization and deserialization are:
XML serialization is the process of converting an object's public properties and fields to a serial format (in this case, XML) for storage or transport.
Deserialization re-creates the object in its original state from the XML output. You can think of serialization as a way of saving the state of an object into a stream or buffer.
To use deserialization, you first need to create classes that match the structure of your XML document.
// Requires: using System; using System.Collections.Generic; using System.Xml.Serialization;
[Serializable, XmlRoot("companies")]
public class Companies
{
    // Each <company> element becomes one item of the list
    [XmlElement("company")]
    public List<Company> companies;
}

public class Company
{
    // Element names follow companies.xml; elements missing from a company stay null
    [XmlElement("cmp_code_nsd")] public string cmp_code_nsd { get; set; }
    [XmlElement("full_name")] public string full_name { get; set; }
    [XmlElement("short_name")] public string short_name { get; set; }
    [XmlElement("full_name_en")] public string full_name_en { get; set; }
    [XmlElement("short_name_en")] public string short_name_en { get; set; }
    [XmlElement("country")] public string country { get; set; }
    [XmlElement("address")] public string address { get; set; }
    [XmlElement("post_address")] public string post_address { get; set; }
    [XmlElement("phone")] public string phone { get; set; }
    [XmlElement("fax")] public string fax { get; set; }
    [XmlElement("e_mail")] public string e_mail { get; set; }
    [XmlElement("www")] public string www { get; set; }
}
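I think this is the most time-consuming part. Once the classes exist, deserialization itself takes only a few lines:
// Requires: using System.IO; using System.Xml.Serialization; using System.Web.UI.WebControls;
string path = @"c:\XMLFile1.xml"; // verbatim string, so the backslash is not treated as an escape
XmlSerializer serializer = new XmlSerializer(typeof(Companies));
Companies mc;
// Close the file as soon as deserialization is done
using (StreamReader file = new StreamReader(path))
{
    mc = (Companies)serializer.Deserialize(file);
}
GridView gv = new GridView();
form1.Controls.Add(gv);
gv.DataSource = mc.companies;
gv.DataBind();
As a result, you get a table with the data from your XML document.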
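To see what the deserialized objects contain without a GridView, the idea can be tried in a small console program. This is only a sketch under assumptions: the XML comes from a shortened in-memory string, the classes map just three of the elements, and the names DeserializeDemo, Parse and Items are hypothetical. It relies on XmlSerializer skipping elements that have no matching member (such as "header" here) and leaving properties null when their element is absent.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("companies")]
public class Companies
{
    [XmlElement("company")]
    public List<Company> Items = new List<Company>();
}

public class Company
{
    [XmlElement("short_name")]
    public string ShortName { get; set; }

    [XmlElement("country")]
    public string Country { get; set; }

    [XmlElement("phone")]
    public string Phone { get; set; }
}

public static class DeserializeDemo
{
    // Hypothetical sample: the second company has no phone element.
    public const string SampleXml =
        "<companies>" +
        "<header><version>1.0</version></header>" +
        "<company><short_name>Avocet Mining PLC</short_name>" +
        "<country>GBR</country><phone>+44 20 7766 7676</phone></company>" +
        "<company><short_name>PowerShares DB Silver Fund</short_name>" +
        "<country>USA</country></company>" +
        "</companies>";

    public static Companies Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Companies));
        using (var reader = new StringReader(xml))
        {
            return (Companies)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        foreach (Company company in Parse(SampleXml).Items)
        {
            // A missing element leaves the property null; no exception is thrown.
            string phone = company.Phone ?? "(no phone)";
            Console.WriteLine(company.ShortName + ", " + company.Country + ", " + phone);
        }
    }
}
```

Unlike the XmlDocument version, a company with fewer properties needs no special handling during parsing; the null check moves to the place where the value is used.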
If the XML document has nested nodes, you just need to create more classes and arrays or lists.
For example, let’s add a "<codes>" block to one of the <company> items.
<codes>
  <code>
    <code_type_mn>MICEX_FOND</code_type_mn>
    <code>RU000A0JRJC6</code>
  </code>
</codes>
Add this code to the Company class:
// <codes> is the wrapper element, each <code> inside it is one list item
[XmlArray("codes")]
[XmlArrayItem("code", typeof(Code))]
public List<Code> codes { get; set; }
And create a class named "Code":
[Serializable]
public class Code
{
    [XmlElement("code_type_mn")]
    public string code_type_mn { get; set; }

    // The inner <code> element that holds the code value itself
    [XmlElement("code")]
    public string code { get; set; }
}
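Put together, the nested mapping can be checked with a small console sketch. It rests on assumptions: the sample is a shortened in-memory document, only "short_name" and "codes" are mapped, and NestedDemo, Parse and CountCodes are hypothetical names. CountCodes treats a null list as empty, so it does not depend on what the serializer does for a company without a "codes" block.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("companies")]
public class Companies
{
    [XmlElement("company")]
    public List<Company> companies = new List<Company>();
}

public class Company
{
    [XmlElement("short_name")]
    public string short_name { get; set; }

    [XmlArray("codes")]
    [XmlArrayItem("code", typeof(Code))]
    public List<Code> codes { get; set; }
}

public class Code
{
    [XmlElement("code_type_mn")]
    public string code_type_mn { get; set; }

    [XmlElement("code")]
    public string code { get; set; }
}

public static class NestedDemo
{
    // Hypothetical sample: only the first company has a codes block.
    public const string SampleXml =
        "<companies>" +
        "<company><short_name>Avocet Mining PLC</short_name>" +
        "<codes><code><code_type_mn>MICEX_FOND</code_type_mn>" +
        "<code>RU000A0JRJC6</code></code></codes></company>" +
        "<company><short_name>PowerShares DB Silver Fund</short_name></company>" +
        "</companies>";

    public static Companies Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Companies));
        using (var reader = new StringReader(xml))
        {
            return (Companies)serializer.Deserialize(reader);
        }
    }

    // Number of codes of a company; a missing list counts as zero.
    public static int CountCodes(Company company)
    {
        return company.codes == null ? 0 : company.codes.Count;
    }

    public static void Main()
    {
        foreach (Company company in Parse(SampleXml).companies)
        {
            Console.WriteLine(company.short_name + ": " + CountCodes(company) + " code(s)");
        }
    }
}
```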