In this article, Scott examines how to work with XML using LINQ to XML. He also demonstrates how to build a custom RSS feed reader using these technologies.
Introduction
Republished with Permission - Original Article
One of the big programming model improvements being made in .NET 3.5 is the work being done to make querying data a first class programming concept. We call this overall querying programming model "LINQ", which stands for .NET Language Integrated Query.
LINQ supports a rich extensibility model that facilitates the creation of efficient domain-specific providers for data sources. .NET 3.5 ships with built-in libraries that enable LINQ support against Objects, XML, and Databases.
What is LINQ to XML?
LINQ to XML is a built-in LINQ data provider that is implemented within the "System.Xml.Linq" namespace in .NET 3.5.
LINQ to XML provides a clean programming model that enables you to read, construct and write XML data. You can use LINQ to XML to perform LINQ queries over XML that you retrieve from the file-system, from a remote HTTP URL or web-service, or from any in-memory XML content.
LINQ to XML provides much richer (and easier) querying and data shaping support than the low-level XmlReader/XmlWriter API in .NET today. It also ends up being much more efficient (and uses much less memory) than the DOM API that XmlDocument provides.
Using LINQ to XML to query a local XML File
To get a sense of how LINQ to XML works, we can create a simple XML file on our local file-system that uses a custom schema we've defined to store RSS feeds, and then load and query it from VB.
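The original XML file and query listing are not reproduced here, so what follows is only a minimal sketch of what they might look like. The element names (Feeds, Feed, Name, Url) and the "status" attribute are taken from the clauses discussed in this section; the sample feed entries and the module name are assumptions made for illustration. The XML is parsed from an inline string to keep the sketch self-contained; with a real file you would call XDocument.Load(path) instead.

```vb
Option Infer On
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module FeedQuerySketch
    Sub Main()
        ' Hypothetical stand-in for the local XML file of feeds.
        ' The second entry is illustrative only and is marked as disabled.
        Dim feedsText As String = _
            "<Feeds>" & _
            "<Feed><Name>ScottGu</Name>" & _
            "<Url>http://weblogs.asp.net/scottgu/rss.aspx</Url></Feed>" & _
            "<Feed status=""disabled""><Name>Old feed</Name>" & _
            "<Url>http://example.com/old.xml</Url></Feed>" & _
            "</Feeds>"
        Dim feedXML As XDocument = XDocument.Parse(feedsText)

        ' Keep only feeds that are not disabled, and project each one
        ' into an anonymous type with Name and Url properties.
        Dim feeds = From feed In feedXML.Descendants("Feed") _
                    Where (feed.Attribute("status") Is Nothing) OrElse _
                          (feed.Attribute("status").Value <> "disabled") _
                    Select Name = feed.Element("Name").Value, _
                           Url = feed.Element("Url").Value

        For Each f In feeds
            Console.WriteLine(f.Name & " - " & f.Url)
        Next
    End Sub
End Module
```

The explicit line-continuation underscores are there so the sketch also compiles with the VB 9 compiler that ships with VS 2008; newer compilers let you omit them.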
Once I have an XDocument object for my XML file I can then write a LINQ query expression to retrieve the XML data I'm looking for. In the code above I'm querying over each of the <Feed> elements within the XML file. This is driven by this opening clause in the LINQ query expression:
From feed In feedXML.Descendants("Feed")
I'm then applying a filter that only returns back those "Feed" elements that either don't have a "status" attribute, or whose "status" attribute value is not set to "disabled":
Where (feed.Attribute("status") Is Nothing) OrElse (feed.Attribute("status").Value <> "disabled")
I am then using the select clause in our LINQ expression to indicate what data I want returned. If I simply wrote "select feed", LINQ to XML would return back a sequence of XElement objects that represents each of the XML element nodes that match my filter. In the code samples above, though, I am using the shaping/projection features of LINQ to instead define a new anonymous type on the fly, and I am defining two properties on it - Name and Url - that I want populated using the <Name> and <Url> sub-elements under each <Feed> element:
Select Name = feed.Element("Name").Value, Url = feed.Element("Url").Value
I can then work against this returned sequence of data just like I would any collection or array in .NET. VS 2008 provides full IntelliSense and compile-time checking support over this anonymous type sequence.
Hmm - What is this "anonymous type" thing?
In my code above I've taken advantage of a new language feature in VB and C# called "anonymous types". Anonymous types enable developers to concisely define inline CLR types within code, without having to explicitly write a formal class declaration for the type. You can learn more about them in my previous New "Orcas" Language Feature: Anonymous Types blog post.
While anonymous types can be super useful when you want to locally iterate and work with data, we'll often want/need to define a standard class when passing the results of our LINQ query between multiple classes, across class library assemblies, and over web-services.
To enable this, I could define a non-anonymous class called "FeedDefinition" to represent our Feed data like so:
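The original class listing is not reproduced here. As a rough sketch, assuming FeedDefinition simply carries the same Name and Url values as the anonymous type, the class and a query that projects into it might look like the following. The LoadFeeds helper, the module name, and the sample XML are hypothetical and exist only to make the sketch runnable.

```vb
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

' Assumed shape: one property per value the query projects.
Public Class FeedDefinition
    Private _name As String
    Private _url As String

    Public Property Name() As String
        Get
            Return _name
        End Get
        Set(ByVal value As String)
            _name = value
        End Set
    End Property

    Public Property Url() As String
        Get
            Return _url
        End Get
        Set(ByVal value As String)
            _url = value
        End Set
    End Property
End Class

Module FeedDefinitionSketch
    ' Hypothetical helper: returns typed objects instead of anonymous ones,
    ' so the result can be passed between classes and assemblies.
    Function LoadFeeds(ByVal feedXML As XDocument) As List(Of FeedDefinition)
        Dim query = From feed In feedXML.Descendants("Feed") _
                    Select New FeedDefinition With { _
                        .Name = feed.Element("Name").Value, _
                        .Url = feed.Element("Url").Value}
        Return query.ToList()
    End Function

    Sub Main()
        ' Illustrative input only.
        Dim doc As XDocument = XDocument.Parse( _
            "<Feeds><Feed><Name>Example</Name>" & _
            "<Url>http://example.com/rss.xml</Url></Feed></Feeds>")

        For Each f As FeedDefinition In LoadFeeds(doc)
            Console.WriteLine(f.Name & " - " & f.Url)
        Next
    End Sub
End Module
```

The only change to the query itself is the Select clause: "Select New FeedDefinition With {...}" replaces the anonymous-type projection, and ToList() materializes the results into a List(Of FeedDefinition).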
Using LINQ to XML to Retrieve a Remote RSS XML Feed
The XDocument.Load(path) static method can open both XML files on the file-system and remote XML feeds returned from an HTTP URL. This enables you to use it to access remote RSS feeds, REST APIs, as well as any other XML feed published on the web.
For an example of this in action, let's take a look at the XML of my blog's RSS feed (http://weblogs.asp.net/scottgu/rss.aspx).
I could then take advantage of the composition features of LINQ to perform a further sub-query on the result, so that I filter over only those RSS posts that were published within the last 7 days using the code below:
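The listing for this step is not reproduced here, so the following is a hedged sketch of the idea rather than the original code. It assumes standard RSS 2.0 item elements (title, pubDate), builds a small in-memory feed in place of the remote one so that it runs without network access, and uses property names (Title, Published) chosen for illustration. With a live feed you would replace the in-memory document with XDocument.Load(url).

```vb
Option Infer On
Imports System
Imports System.Globalization
Imports System.Linq
Imports System.Xml.Linq

Module RecentPostsSketch
    Sub Main()
        ' In-memory stand-in for a remote RSS feed. The dates are generated
        ' relative to "now" in RFC 1123 format, as RSS pubDate values are.
        Dim rss As New XDocument( _
            New XElement("rss", _
                New XElement("channel", _
                    New XElement("item", _
                        New XElement("title", "Recent post"), _
                        New XElement("pubDate", DateTime.UtcNow.AddDays(-1).ToString("r"))), _
                    New XElement("item", _
                        New XElement("title", "Older post"), _
                        New XElement("pubDate", DateTime.UtcNow.AddDays(-30).ToString("r"))))))

        ' First query: shape each <item> into a Title/Published pair.
        Dim posts = From item In rss.Descendants("item") _
                    Select Title = item.Element("title").Value, _
                           Published = DateTime.Parse(item.Element("pubDate").Value, _
                                                      CultureInfo.InvariantCulture)

        ' Second query, composed over the first: keep only the last 7 days.
        Dim cutoff As DateTime = DateTime.Now.AddDays(-7)
        Dim recentPosts = From post In posts _
                          Where post.Published > cutoff _
                          Select post

        For Each post In recentPosts
            Console.WriteLine(post.Published.ToShortDateString() & ": " & post.Title)
        Next
    End Sub
End Module
```

Because LINQ queries are evaluated lazily, composing the second query over the first does not enumerate the items twice; nothing runs until the For Each loop iterates recentPosts.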
Using LINQ Sub-Queries within a LINQ to XML Query Expression
If you look at the raw XML of my RSS feed, you'll notice that the "tag" comments for each post are stored as repeated <category> elements directly below each <item> element.
This "shaping" power of LINQ, and its ability to take flat data structures and make them hierarchical (and take hierarchical data structures and make them flat) is super powerful. You can use this feature with any type of data source - regardless of whether it is XML, SQL, or plain old objects/arrays/collections.
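As a small illustration of that kind of shaping, the sketch below nests a sub-query inside the Select clause so that each post carries its own list of tags. It assumes the tags are stored as repeated category elements under each item, as in standard RSS 2.0; the sample feed content and the Title and Tags property names are assumptions made for illustration.

```vb
Option Infer On
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module TagsSubQuerySketch
    Sub Main()
        ' Illustrative feed fragment with repeated <category> elements.
        Dim rss As XDocument = XDocument.Parse( _
            "<rss><channel>" & _
            "<item><title>LINQ to XML</title>" & _
            "<category>LINQ</category><category>.NET</category></item>" & _
            "<item><title>Untagged post</title></item>" & _
            "</channel></rss>")

        ' The inner query runs once per <item> and collects its tags,
        ' turning the flat run of <category> elements into a nested list.
        Dim posts = From item In rss.Descendants("item") _
                    Select Title = item.Element("title").Value, _
                           Tags = (From category In item.Elements("category") _
                                   Select category.Value).ToList()

        For Each post In posts
            Console.WriteLine(post.Title & ": " & String.Join(", ", post.Tags.ToArray()))
        Next
    End Sub
End Module
```

Using Elements("category") rather than Descendants keeps the sub-query scoped to the direct children of the current item, and an item with no tags simply yields an empty list.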
Putting it all Together with a Simple RSS Feed Reader
The code snippets I've walked through above demonstrate how you can easily write LINQ to XML code to retrieve a list of RSS feeds from a local XML file, and how to remotely query an RSS feed to retrieve an individual feed's details and individual item post contents. I could then take the resulting feed contents and data-bind them to an ASP.NET GridView or ListView control to provide a nice view of the blog feed.
Summary
LINQ to XML provides a really powerful way to efficiently query, filter, and shape/transform XML data. You can use it both against local XML content, as well as remote XML feeds. You can use it to easily transform XML data into .NET objects and collections that you can further manipulate and transfer across your application.