Getting XML Out of the Database - SQL Server 2008

Many developers and database administrators (DBAs) want to keep their relational data as relational but transfer that relational data to other systems as XML due to XML’s flexibility and universal support. FOR XML (added in SQL Server 2000) allows you to get your relational data back in an XML format without needing to store that relational data as XML.

FOR XML

You probably use FOR XML today, as it’s the easiest way to take data in a relational format from SQL Server and put it into an XML format. A simplified form of the FOR XML query extension syntax is the
following:

SELECT column list
FROM table list
WHERE filter criteria
FOR XML RAW | AUTO | EXPLICIT [, XMLDATA] [, ELEMENTS]
[, BINARY BASE64]

Using FOR XML Modes

As shown in the preceding syntax, you can use FOR XML in RAW, AUTO, or EXPLICIT mode. Most people use the AUTO or RAW mode. The main reason people normally don’t use EXPLICIT mode is that the other two modes meet their needs. The other reason, as you’ll see, is EXPLICIT mode is an explicit pain to work with. If you can get away with using the other two modes, we recommend you do that, since you’ll find yourself pulling your hair out if you do any complex XML structures with EXPLICIT mode.

RAW Mode

When working in RAW mode, the FOR XML query returns columns as attributes and rows as row elements.

An example of FOR XML RAW is shown here:

USE pubs
GO
SELECT * FROM Authors FOR XML RAW

Here are the results (truncated for brevity):

<row au_id="172-32-1176" au_lname="White" au_fname="Johnson"
phone="408 496-7223" address="10932 Bigge Rd." city="Menlo Park"
state="CA" zip="94025" contract="1"/>
<row au_id="213-46-8915" au_lname="Green" au_fname="Marjorie"
phone="415 986-7020" address="309 63rd St. #411" city="Oakland"
state="CA" zip="94618" contract="1"/>
<row au_id="238-95-7766" au_lname="Carson" au_fname="Cheryl"
phone="415 548-7723" address="589 Darwin Ln." city="Berkeley"
state="CA" zip="94705" contract="1"/>
<row au_id="267-41-2394" au_lname="O&apos;Leary" au_fname="Michael"
phone="408 286-2428" address="22 Cleveland Av. #14" city="San Jose"
state="CA" zip="95128" contract="1"/>

As you can see, there is a row element for each row, and each non-null column has an attribute on the row element. If you are retrieving binary data, you need to specify BINARY BASE64. Also, if you want to retrieve an XML Data schema with the returned XML, you can specify XMLDATA.

AUTO Mode

When working in AUTO mode, the FOR XML query is the same as RAW mode in that it returns each row as an element with column values as attributes, except the name of the element representing the row is the table name. Here is an example using the authors table:

USE pubs
GO
SELECT * FROM authors FOR XML AUTO
You’ll see the following results (truncated for brevity):
<Authors au_id="172-32-1176" au_lname="White" au_fname="Johnson"
phone="408 496-7223" address="10932 Bigge Rd." city="Menlo Park"
state="CA" zip="94025" contract="1"/>
<Authors au_id="213-46-8915" au_lname="Green" au_fname="Marjorie"
phone="415 986-7020" address="309 63rd St. #411" city="Oakland"
state="CA" zip="94618" contract="1"/>
<Authors au_id="238-95-7766" au_lname="Carson" au_fname="Cheryl"
phone="415 548-7723" address="589 Darwin Ln." city="Berkeley"
state="CA" zip="94705" contract="1"/>
<Authors au_id="267-41-2394" au_lname="O&apos; Leary" au_fname="Michael"
phone="408 286-2428" address="22 Cleveland Av. #14" city="San Jose"
state="CA" zip="95128" contract="1"/>
<Authors au_id="274-80-9391" au_lname="Straight" au_fname="Dean"
phone="415 834-2919" address="5420 College Av." city="Oakland"
state="CA" zip="94609" contract="1"/>
<Authors au_id="341-22-1782" au_lname="Smith" au_fname="Meander"
phone="913 843-0462" address="10 Mississippi Dr." city="Lawrence"
state="KS" zip="66044" contract="0"/>

The table name is the element for each node with the column values as attributes on that element. The nesting of elements depends on the order in your SELECT clause, so choose your order carefully. Furthermore, you cannot use a GROUP BY, but you can use an ORDER BY in your SELECT statements with FOR XML. The work-around for a GROUP BY is to use a nested SELECT statement to achieve the results you want, but this will have some performance implications. When using joins, you’ll find that AUTO will nest the result set, which is most likely what you want to happen. If you don’t want this to happen, you’ll need to use the EXPLICIT mode to shape your XML. For example, if we join publishers and titles, and we want all titles nested under their publisher in our XML, we would run the following code:

USE pubs
GO
SELECT Publishers.Pub_Name, Titles.Title, Titles.Price
FROM Titles, Publishers WHERE Publishers.Pub_ID = Titles.Pub_ID
FOR XML AUTO

The following are the results (truncated):

<Publishers Pub_Name="Algodata Infosystems">
<Titles Title="The Busy Executive&apos;s Database Guide" Price="19.9900"/>
<Titles Title="Cooking with Computers:
Surreptitious Balance Sheets"
Price="11.9500"/>
</Publishers>
<Publishers Pub_Name="New Moon Books">
<Titles Title="You Can Combat Computer Stress!" Price="2.9900"/>
</Publishers>
<Publishers Pub_Name="Algodata Infosystems">
<Titles Title="Straight Talk About Computers" Price="19.9900"/>
</Publishers>

You can also use the ELEMENTS option with FOR XML AUTO. If you are more of an element than attribute person, you can have AUTO return back element-centric syntax rather than attribute-centric syntax. Personally, we find that element-centric syntax, while making the XML larger in text size because of all the opening and closing tags, results in XML that is easier to read and understand.

Explicit Mode

As the name implies, EXPLICIT mode allows you to completely control the way that your XML is generated. You describe what you want your XML document to look like, and SQL Server fills in that document with the correct information. You use a universal table to describe your XML document. This table consists of one table column for each value you want to return as well as two additional tags: one that uniquely identifies the tags in your XML and another that identifies your parent-child relationships. The other columns describe your data. An example of a universal table appears in Table.

A Universal Table

You use directives to describe how to display your data in the table. Directives are just special commands that you use so that SQL Server understands how to parse. The format for these directives is as follows:

Element!Tag!Attribute!Directive

The different pieces of your directive are separated by an exclamation point. Imagine we want to display authors, but make the au_id an attribute on our XML and the rest of our data elements in our output. Well, we can’t do that with FOR XML AUTO or RAW, since neither of them can be split between being attribute- or element-centric. Let’s see what our query would look like to do this:

SELECT 1 AS Tag, NULL AS Parent,
au_id AS [Authors!1!au_id], au_lname as [Authors!1]
FROM authors FOR XML EXPLICIT

The first thing you’ll notice is that in our column list we have the Tag and Parent columns. We need these columns to identify the tag of the current element, which is an integer from 1 to 255, and also the parent of the current element. In this example, we’re not nesting our data, our parent is always null, and our tag is always 1, since we always refer to the same parent. You can see we use the AS clause to rename our data to describe the XML formatting we want to do. The naming for au_id tells SQL Server that we want to use the Authors element, a tag ID of 1, and the name of our attribute. Since we want the other data to be elements, we just rename them to be the element and tag name. At the end, we specify FOR XML EXPLICIT, since we don’t want to get our universal table back, which describes our XML structure, but our actual processed XML structure. The results of this query are shown here:

<Authors au_id="409-56-7008">Bennet</Authors>
<Authors au_id="648-92-1872">Blotchet-Halls</Authors>
<Authors au_id="238-95-7766">Carson</Authors>
<Authors au_id="722-51-5454">DeFrance</Authors>
<Authors au_id="712-45-1867">del Castillo</Authors>
<Authors au_id="427-17-2319">Dull</Authors>
...

You can see that we get the last name returned as element data for our Authors element. We may want to make the last name an element itself nested under the Authors element. To do this, we modify our query slightly to use the element directive, as shown here:

SELECT 1 as Tag, NULL as Parent, au_id as [Authors!1!au_id],
au_lname as [Authors!1!au_lname!element]
FROM Authors FOR XML EXPLICIT

Table lists all the directives you can use with a description for each.

FOR XML EXPLICIT Directives

FOR XML EXPLICIT Directives
When using FOR XML EXPLICIT queries, many people get the error that the parent tag isn’t open yet. To troubleshoot your FOR XML EXPLICIT statements, the easiest way to fix problems is to remove the FOR XML EXPLICIT statement and just render the results. This will return the universal table, and you can track down errors. The easiest way we’ve found to solve the parent tag problem is to make sure to include the tag column in your ORDER BY clause so that you know that no later tags will be rendered before earlier tags, which is the cause of the problem.

Let’s look at a more complex example. If we want to return all our authors with their titles and author royalties, we would generate a UNION ALL query to combine together this data from disparate tables, and we would need to nest the results so that our XML hierarchy appears correctly with authors, royalties, and then titles. Notice that we define a number of parent-child relationships using the Tag and Parent columns. Also notice that we use the ORDER BY trick to make sure that the parent tags are in the XML before we process the children tags.

SELECT 1 AS Tag, NULL AS Parent,
Authors.au_fname AS [Authors!1!au_fname!element],
Authors.au_lname AS [Authors!1!au_lname!element],
NULL AS [Titleauthor!2!Royaltyper],
NULL AS [Titles!3!Title!element]
FROM
Authors
UNION ALL
SELECT 2 AS Tag, 1 AS Parent,
au_fname,
au_lname,
royaltyper,
NULL
FROM Authors INNER JOIN Titleauthor ON
Authors.au_id = Titleauthor.au_id
UNION ALL
SELECT 3 AS Tag, 2 AS Parent,
au_fname,
au_lname,
royaltyper,
title
FROM Authors INNER JOIN Titleauthor ON Authors.au_id = Titleauthor.au_id
INNER JOIN Titles ON Titles.title_id = Titleauthor.title_id
ORDER BY [Authors!1!au_fname!element], [Authors!1!au_lname!element],
[Titleauthor!2!royaltyper], Tag
FOR XML EXPLICIT

Here is the truncated universal table:

truncated universal table

And here are the truncated results:

<Authors>
<au_fname>Abraham</au_fname>
<au_lname>Bennet</au_lname>
<Titleauthor Royaltyper="60">
<Titles>
<Title>The Busy Executive&apos;s Database Guide</Title>
</Titles>
</Titleauthor>
</Authors>
<Authors>
<au_fname>Akiko</au_fname>
<au_lname>Yokomoto</au_lname>
<Titleauthor Royaltyper="40">
<Titles>
<Title>Sushi, Anyone?</Title>
</Titles>
</Titleauthor>
</Authors>
...

As you’ve seen, FOR XML EXPLICIT is powerful, yet it can be hard to master. If you can get away with using FOR XML AUTO or RAW and avoid FOR XML EXPLICIT mode, your coding will be much easier. However, for those situations where FOR XML AUTO or RAW doesn’t meet your needs, you can always fall back to EXPLICIT mode.

Returning Values As XML Types

To support returning values using the XML datatype, FOR XML provides an extra directive called TYPE.By passing this directive to your call, instead of generating the XML and returning it as text, SQL Server returns the result as an XML datatype. This means that you can then use XQuery on that returned value to query for information inside the result set. Also, you can assign the results to a variable or insert into an XML datatype column. Finally, you can nest FOR XML statements to generate a hierarchy, rather than needing to resort to using XML EXPLICIT. This capability allows you to quickly convert your relational data or even your data that uses the XML datatype into an XML datatype value.

The following code shows how to use the new TYPE directive and then pass the result to an XQuery, which you’ll learn about in the next chapter:

SELECT (SELECT * FROM authors FOR XML AUTO, ELEMENTS, TYPE).query( 'count(//author)')

Specifying a Column’s Location in the Hierarchy

The PATH mode allows you to specify where a column’s value should appear in your XML hierarchy by using XPath. By having this capability, you can move away from FOR XML EXPLICIT, which is complexand burdensome, and instead generate complex XML using nested FOR XML statements and the new PATH mode.

An example of the new PATH mode is shown here. This sample renames the root element to AuthorsNew and also makes a new complex type called Names, which stores the first name and thelast name.

SELECT au_fname "Names/FirstName", au_lname "Names/LastName"
FROM authors FOR XML PATH('AuthorsNew')

Generating In-Line XSD

Another enhancement is support for in-line XSD in RAW and AUTO modes. You can optionally pass theXML SCHEMA directive in your code. The following example shows using the XMLSCHEMA directive and the results returned from the server:

SELECT * FROM authors FOR XML RAW('Authors'), XMLSCHEMA('urn:example.com')
<xsd:schema targetNamespace="urn:example.com"
xmlns:xsd=http://www.w3.org/2001/XMLSchema
xmlns:sqltypes="http://schemas.microsoft.com/ sqlserver/2004/sqltypes"
elementFormDefault="qualified">
<xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes"
schemaLocation=
"http://schemas.microsoft.com/sqlserver/2004/ sqltypes/sqltypes.xsd" />
<xsd:element name="Authors">
<xsd:complexType>
...
</xsd:complexType>
</xsd:element>
</xsd:schema>
<Authors xmlns="urn:example.com" au_id="172-32-1176"
au_lname="White" au_fname="Johnson" phone="408 496-7223"
address="10932 Bigge Rd." city="Menlo Park"
state="CA" zip="94025" contract="1" />

Using FOR XML on the Client Side

Up until now, we have been writing our FOR XML code so that it is processed on the server. However, using SQLXML, you can process your FOR XML code on the client side. Rather than sending back the formatted XML results from the server, SQL Server can send back the rowsets to SQLXML, and SQLXML will format the results on the client side. You’ll see an example of this in the upcoming “Programming SQLXML from .NET and COM” section.

Templates to Improve Performance

You can use templates to execute queries against SQL Server with SQLXML. Templates are just encapsulation of the technologies we’ve already looked at in this chapter. These templates can use SQL or XPath queries. You need to use the annotated schema that you create for your XML view with your template. The schema can be in-line or loaded via a file.

To specify your template, create a file that uses the urn:schemas-microsoft-com:xml-sql namespace. Then you can pass in your SQL or XPath queries in the template. SQLXML will cache your templates in order to improve performance. The following template executes a SQL query:

<Root><sql:query xmlns:sql=""urn:schemas-microsoft-com:xml-sql"">
SELECT * FROM Authors FOR XML AUTO</sql:query></Root>

To use an XPath query, we would change the sql:query syntax to sql:xpath-query. The sample that follows queries for all authors:

<Root><sql:xpath-query xmlns:sql="" urn:schemas-microsoft-com:xml-sql"">/Authors</sql:xpath-query></Root>

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

SQL Server 2008 Topics