LINQ to XML: new reasons to use more XML

I recently gave a short talk on the benefits of the new XML API, “LINQ to XML”. According to MSDN:

LINQ to XML provides an in-memory XML programming interface that leverages the .NET Language-Integrated Query (LINQ) Framework. LINQ to XML uses the latest .NET Framework language capabilities and is comparable to an updated, redesigned Document Object Model (DOM) XML programming interface.
The talk went very well, and luckily I managed to hold the attention of the audience, because the approach I adopted was a little different from the usual. Instead of plainly listing the new functions and properties, I tried to compare the way we deal with XML using the existing APIs and the new one. I also shared why VB developers are more excited about this API than C# developers, and what makes them feel more privileged.

The core functionality of the new API revolves around 3 key concepts.

  1. Functional Construction
  2. Context-Free XML creation
  3. Simplified Namespaces

Functional Construction:

Functional construction is the ability to create an entire XML tree, or part of one, in a single statement. If, like me, you don’t work with XML day in and day out, you probably have to stop for a second to recall how to create XML and which API to use. That is because the range of XML APIs available to us today is overwhelming. For example:

  • XmlTextReader: for low-level parsing of XML documents.
  • XmlTextWriter: a fast, non-cached, forward-only way of generating XML.
  • XmlReader: a read-only, forward-only API generally used to deal with large XML documents.
  • XmlDocument, XmlNode, XPathNavigator, and so on.

So, if I want to create the book XML below in my application, I can use XmlTextWriter.WriteStartElement(), or XmlDocument.CreateNode() if document manipulation is also required.



<books>
  <book>
    <title>Essential .NET</title>
    <author>Don Box</author>
    <author>Chris Sells</author>
    <publisher>Addison-Wesley</publisher>
  </book>
</books>
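
For comparison, here is a minimal sketch of what building that document with the older DOM API might look like. The class and method names (DomBooksExample, Build) are made up for illustration; only the XmlDocument calls come from the framework.

```csharp
using System;
using System.Xml;

static class DomBooksExample
{
    // Builds the <books> document node by node through the owning XmlDocument.
    public static XmlDocument Build()
    {
        XmlDocument doc = new XmlDocument();

        XmlElement books = doc.CreateElement("books");
        doc.AppendChild(books);

        XmlElement book = doc.CreateElement("book");
        books.AppendChild(book);

        XmlElement title = doc.CreateElement("title");
        title.InnerText = "Essential .NET";
        book.AppendChild(title);

        foreach (string name in new[] { "Don Box", "Chris Sells" })
        {
            XmlElement author = doc.CreateElement("author");
            author.InnerText = name;
            book.AppendChild(author);
        }

        XmlElement publisher = doc.CreateElement("publisher");
        publisher.InnerText = "Addison-Wesley";
        book.AppendChild(publisher);

        return doc;
    }

    static void Main()
    {
        // Prints the serialized document.
        Console.WriteLine(Build().OuterXml);
    }
}
```

Every element has to be created by the document and then attached separately, which is where the extra lines come from.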

There is nothing wrong with either approach, except that both take more lines of code, and more time, just to churn out a tiny piece of XML. LINQ to XML aims to solve this problem with XElement, one of whose constructors takes a params array, allowing us to write an entire XML tree in one statement.



// Requires: using System.Xml.Linq;
XElement elements = new XElement("books",
    new XElement("book",
        new XElement("title", "Essential .NET"),
        new XElement("author", "Don Box"),
        new XElement("author", "Chris Sells"),
        new XElement("publisher", "Addison-Wesley")
    )
);

Context-Free XML creation:

When creating XML using the DOM, everything has to be created in the context of a parent document. This document-centric approach results in code that is hard to read, write and debug. In LINQ to XML, elements and attributes have been given first-class status and can exist without a document. So, rather than going through factory methods to create elements and attributes, we can use the compositional constructors offered by the XElement and XAttribute classes.

If I want to add an ISBN number as an attribute to the book element in the above book XML, I can simply write:



XElement elements = new XElement("books",
    new XElement("book", new XAttribute("ISBN", "0201734117"),
        new XElement("title", "Essential .NET"),
        new XElement("author", "Don Box"),
        new XElement("author", "Chris Sells"),
        new XElement("publisher", "Addison-Wesley")
    )
);

Simplified Namespaces:

I believe this is the most confusing aspect of XML. With the existing APIs, we have to keep track of many things: XML names, namespaces, the prefixes associated with those namespaces, namespace managers, and so on. LINQ to XML lets us forget all that and focus on one thing, the “fully expanded name”, which is represented by the XName class.

Let’s see how this differs from the existing approach, taking the RSS feed of my blog as an example. In the RSS document, which can be accessed at http://feeds.feedburner.com/feed-irfan (right click → view source), I am interested in the “totalResults” element, which is prefixed with “openSearch”. This is how I do it using XmlNamespaceManager, which has been part of the .NET Framework for a long time.



XmlDocument rss = new XmlDocument();
rss.Load("http://feeds.feedburner.com/feed-irfan");

XmlNamespaceManager nsManager = new XmlNamespaceManager(rss.NameTable);
nsManager.AddNamespace("openSearch", "http://a9.com/-/spec/opensearchrss/1.0/");

XmlNodeList list = rss.SelectNodes("//openSearch:totalResults", nsManager);

foreach (XmlNode node in list)
{
    Console.WriteLine(node.InnerXml);
    Console.ReadLine();
}


You can see that I have to create an XmlNamespaceManager, add a namespace, remember the syntax of the query, and pass the manager as a parameter. That is too much work. LINQ to XML says: forget about XmlNamespaceManager, create a fully expanded name, and use it every time.



XElement rss = XElement.Load("http://feeds.feedburner.com/feed-irfan");

XNamespace ns = "http://a9.com/-/spec/opensearchrss/1.0/";

IEnumerable<XElement> items = rss.Descendants(ns + "totalResults");

foreach (XElement element in items)
{
    Console.WriteLine(element.Value);
    Console.ReadLine();
}
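
To make the “fully expanded name” idea concrete without hitting the network, here is a small sketch that runs the same query against an inline string. The sample feed, its value of 25, and the names ExpandedNameExample and ReadTotalResults are invented for illustration; the real feed will look different.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ExpandedNameExample
{
    // Hypothetical, trimmed-down feed used in place of the live URL.
    const string SampleFeed =
        "<rss xmlns:openSearch=\"http://a9.com/-/spec/opensearchrss/1.0/\">" +
        "<channel><openSearch:totalResults>25</openSearch:totalResults></channel>" +
        "</rss>";

    public static string ReadTotalResults()
    {
        XElement rss = XElement.Parse(SampleFeed);
        XNamespace ns = "http://a9.com/-/spec/opensearchrss/1.0/";

        // ns + "totalResults" produces an XName; the prefix never appears in the query.
        XName name = ns + "totalResults";
        return rss.Descendants(name).First().Value;
    }

    static void Main()
    {
        XNamespace ns = "http://a9.com/-/spec/opensearchrss/1.0/";

        // Prints the expanded name: the namespace URI in braces followed by the local name.
        Console.WriteLine(ns + "totalResults");
        Console.WriteLine(ReadTotalResults());
    }
}
```

The document can use any prefix it likes for that namespace; the query only depends on the namespace URI and the local name.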

We can also take a look at how exactly we can load, create and update XML using LINQ to XML API.

Loading XML


  • Loading from URL:
    XElement feed = XElement.Load("http://feeds.feedburner.com/feed-irfan");

  • Loading from file:
    XElement file = XElement.Load(@"book.xml");

  • Loading from String:
    XElement document = XElement.Parse("<books><book><title>Essential .NET</title><author>Don Box</author><author>Chris Sells</author><publisher>Addison-Wesley</publisher></book></books>");

  • Loading from a reader:
    using (XmlReader xReader = XmlReader.Create(@"book.xml"))
    {
        // Advance the reader to the first element node.
        while (xReader.Read())
        {
            if (xReader.NodeType == XmlNodeType.Element)
                break;
        }
        XElement messages = (XElement)XNode.ReadFrom(xReader);
        Console.WriteLine(messages);
        Console.ReadLine();
    }

  • XDocument:
    You may wonder: if we use XElement for every kind of load, what is the purpose of XDocument? XDocument is useful whenever we need additional details about the document, e.g. the document type definition (DTD), the XML declaration, etc. These are details that XElement does not provide.
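
As a rough illustration of the document-level details XDocument can carry, the sketch below builds a document with an XML declaration and a comment around the root element. The names XDocumentExample and Build, and the comment text, are made up for this example.

```csharp
using System;
using System.Xml.Linq;

static class XDocumentExample
{
    public static XDocument Build()
    {
        // The declaration and the comment live on the document, not on the root element.
        return new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XComment("Sample book list"),
            new XElement("books",
                new XElement("book",
                    new XElement("title", "Essential .NET"))));
    }

    static void Main()
    {
        XDocument doc = Build();
        Console.WriteLine(doc.Declaration);   // the XML declaration
        Console.WriteLine(doc.Root.Name);     // the name of the root element
    }
}
```

When only the element tree matters, doc.Root hands back a plain XElement, so the two types mix freely.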

Creating XML

Functional construction, the key concept I mentioned above, defines the way XML is created using LINQ to XML. We have also seen above how to work with fully qualified names. We can now take a look at how to associate a prefix with a namespace while creating an XML document.

Associating a prefix is just a matter of creating an XAttribute with the appropriate values in its constructor and supplying it to the XElement the prefix is going to be associated with.



XNamespace ns = "http://www.essential.net";

var xml2 = new XElement("books",
    new XElement(ns + "book", new XAttribute(XNamespace.Xmlns + "pre", ns),
        new XElement("title", "Essential .NET"),
        new XElement("author", "Don Box"),
        new XElement("publisher", "Addison-Wesley")
    )
);
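
Updating an existing tree follows the same element-centric style. The following is only a sketch of a few common operations; the class and method names (UpdateExample, BuildAndUpdate) and the new title value are invented for illustration.

```csharp
using System;
using System.Xml.Linq;

static class UpdateExample
{
    public static XElement BuildAndUpdate()
    {
        XElement books = new XElement("books",
            new XElement("book",
                new XElement("title", "Essential .NET"),
                new XElement("author", "Don Box"),
                new XElement("publisher", "Addison-Wesley")));

        XElement book = books.Element("book");

        // Insert a second <author> right after the first one.
        book.Element("author").AddAfterSelf(new XElement("author", "Chris Sells"));

        // Add (or overwrite) an attribute on <book>.
        book.SetAttributeValue("ISBN", "0201734117");

        // Change the content of an existing child element (illustrative value).
        book.SetElementValue("title", "Essential .NET, Volume 1");

        // Remove an element from the tree.
        book.Element("publisher").Remove();

        return books;
    }

    static void Main()
    {
        Console.WriteLine(BuildAndUpdate());
    }
}
```

None of these calls need a parent document; each one acts directly on the element in hand.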

XML Literals

As I mentioned at the beginning of this post, there is something in this API exclusively for VB.NET 9.0(+) developers. It is a new feature called “XML literals” that enables developers to embed XML directly within VB.NET code. We have seen how to create the book XML using functional construction above. Let’s now see how the same can be done using an XML literal:



Dim bookXML As XElement = <books>
                              <book>
                                  <title>Essential .NET</title>
                                  <author>Don Box</author>
                                  <author>Chris Sells</author>
                                  <publisher>Addison-Wesley</publisher>
                              </book>
                          </books>

bookXML.Save("book.xml", SaveOptions.None)

Rather than creating LINQ to XML object hierarchies that represent XML, VB developers can define the entire XML using XML syntax. And if they want to make it more dynamic, they can use ASP.NET-style code nuggets (<%= %>), called “expression holes”, to embed dynamic values in XML literals.



Private Sub GetBookXML(ByVal bookName As String, ByVal publisher As String, ByVal ParamArray authors As String())

    Dim customAttrib = "ISBN"
    Dim bookXML As XElement = <books>
                                  <book <%= customAttrib %>=<%= "0201734117" %>>
                                      <title><%= bookName %></title>
                                      <author><%= authors(0) %></author>
                                      <author><%= authors(1) %></author>
                                      <publisher><%= publisher %></publisher>
                                  </book>
                              </books>

    bookXML.Save("book.xml", SaveOptions.None)

End Sub

XML Axis Properties

Another feature available only in VB.NET 9.0 is “XML axis properties”, which allow the XML axis methods to be called using a more compact syntax. Let’s take a look at these properties:


  1. Child Axis Property
    This property returns all the child elements with a particular name. For example, say I am looking for the <author> element in my book XML. Using the child axis property I can directly say:
    Dim authorName as String = bookXML.<book>.<author>(0).Value 

    And if you are interested in all the authors:
    Dim authorElements As IEnumerable(Of XElement) = bookXML.<book>.<author>
    Dim authorNames As List(Of String) = (From author As XElement In authorElements _
                                          Select author.Value).ToList()

  2. Descendant Axis Property
    It returns all the descendant elements that have the qualified name specified within the angle brackets. To see how it works, we’ll use the XML we produced with the GetBookXML() method in the XML Literals section above as input.


    ' ISBN is an attribute of <book>, so filter the <book> descendants and print their titles.
    Dim elements As IEnumerable(Of XElement) = bookXML...<book>.Where(Function(b) CInt(b.@ISBN) > 1)
    For Each e As XElement In elements
        Console.WriteLine(e.<title>.Value)
    Next

  3. Attribute Axis Property
    This property returns the string value of the attribute that has the qualified name that is specified after the “@” character.
    We have already seen an example of this property in the previous “Descendant Axis Property” section, where we filtered on the value of the ISBN attribute in the Where clause.
    Another example could be a tiny piece of code that returns all the ISBN numbers in the entire bookXML document that we saved earlier.

    Dim ISBNList As New List(Of String)
    Dim elements As IEnumerable(Of XElement) = bookXML...<book>

    For Each e As XElement In elements
        ' The attribute axis property already returns the value as a String.
        ISBNList.Add(e.@ISBN)
    Next

XML axis properties help a great deal when searching XML documents. With this shorthand syntax for accessing the primary XML axes, Visual Basic developers can stay focused on the XML they are trying to consume. For developers who deal with XML every day, learning and understanding XPath is not a problem. However, for those like me who rarely use XML, axis properties, being a no-brainer, are more attractive.


HTH,
