pureXML in DB2 - IBM DB2

pure XML is the brand new technology introduced in DB2 9. pureXML enables storage of well-formed XML documents in the DB2 column of XML data type. By storing XML data in XML columns, the data is kept in its native hierarchical form, rather than stored as text or mapped to a relational data model. Since the new XML store is 100% integrated with DB2, it leverages the performance, scalability, reliability, and availability of the DB2 engine.

The new XML data type is really a Structured Query Language (SQL) data type, so querying XML data is easy and fast. You can access relational and XML data in the same statement. The new pureXML is also integrated with application programming interfaces (APIs) such as Java Database Connectivity (JDBC), Open Database Connectivity (ODBC), .NET, embedded SQL, and PHP, which open doors for a new breed of Web applications with hybrid access to the relational and XML data. In this section, we discuss the application design considerations and benefits of pureXML technology in DB2 with respect to application development.

When to use DB2 pureXML
DB2 pureXML is suited for any application where some or all of the data processed by the application is represented using XML. DB2 pureXML storage provides for high performance ingestion, storage, processing, and retrieval of XML data. Additionally, it also provides the ability to quickly and easily generate XML from existing relational data.The types of applications for which pureXML is particularly well suited include:

Business-to-business and application-to-application (A2A) integration

  • Internet applications
  • Content-management applications
  • Messaging-based applications
  • Web Services

A typical XML-based application has one or more of the following requirements and characteristics:

  • XML documents need to be processed or generated.
  • High performance querying, both within a document and across a large collection of documents.
  • High levels of security and easy control over them.
  • Usage of languages such as Java that support open standards such as SQL, XML, XQuery, and Extensible Stylesheet Language Transformation (XSLT).
  • Access to information using standard Internet protocols such as File Transfer Protocol (FTP), HTTP/Web, or JDBC.
  • Hybrid access to and ability to query relational and XML data.
  • Validation of XML documents.

Designing pureXML-based applications
This section discusses the preliminary design criteria you can consider when planning your DB2 pureXML-based application. Here are several questions to ask and our recommendations:

  • Will your data be mostly XML, a combination of relational and XML, or non-XML?
  • Deciding how your data storage model will look mostly depends on the kind of application you are developing and the kind of data it will use. Sometimes, it might make more sense to store XML data in a relational table and query it using standard SQL if your XML data is highly regular and queried the same way over time. Often, you would want to store and represent data as XML throughout its life.

  • Will your tables be XML schema-based or non-schema-based?
  • If your data is XML schema-based, you need to register the schema with DB2 XML schema repository, so that applications can use them to validate the data. You can experience performance issues if schema validation is used.

  • How will other applications and users access your XML and other data?
  • The data access model really depends on the programming interface. DB2 supports the common application programming interfaces to access and manage XML data in pureXML.
  • What kind of indexing will best suit your application? Will you need to use XML value index or full text search index? Or both?
  • You need to design indexes carefully for high performance data access. Text-based search or fuzzy search implementation needs to use full text indexes. This will depend on the nature of the application.

  • In which languages will you program your application?
  • Java, .Net, and PHP are the most common languages a Web application uses. However, you might consider the latest community developed frameworks such as Ruby on Rails or Zend Framework to build Web applications that leverage persistent XML store in DB2.

  • Will you need to generate XML from relational data or vice versa?
  • Your application can leverage XML constructor functions to build XML from relational data before applications can process it. In contrast, some use cases require XML data to be shredded into relation columns before the data can be accessed by an existing application.

  • How often will XML documents be accessed, updated, and manipulated? Will you need to update fragments or the whole document? Choosing the right query and interface model depends on whether an application is read only or both read and update. XQuery can limit your application to read only, so you need to carefully decide if full document update is reasonable.
  • Will you need to transform the XML to HTML, WML, or other languages, and how will your application transform the XML?
  • If your application needs to be accessed by different clients, XSLT may be used to transform. While storing data in pureXML, you can make sure the structure complies with the specific XML schema.

DB2 hybrid query engine
DB2 hybrid engine processes SQL or XML and XQuery queries in an integrated manner. DB2 unifies XML storage, indexing, and query processing with existing relational storage, indexing, and query processing. DB2 also provides an XML schema repository (XSR) to register and maintain XML schemas and uses those schemas to validate XML documents. The DB2 utilities such as IMPORT and EXPORT have been enhanced to support XML data and the new graphical XQuery builder lets you construct XQueries.

The hybrid engine allows existing client applications to access DB2 data through relational APIs, yet it offers new SQL or XML APIs to publish relational data in XML format and full document retrieval from pureXML storage. Additionally, the new SQL or XML querying functions provide SQL applications with subdocument level search and extract capabilities by embedding XQuery statements into SQL statements.

An XML application can interact with the DB2 server via the XML interface by using the XQuery language, which is supported as an stand-alone query language independent of SQL. XQueries can optionally contain SQL statements to combine and correlate XML data with relational data. A client application can benefit immensely from this integration of the two languages supported by DB2.

DB2 has a separate parser for SQL and XQuery statements but uses a single integrated query compiler for both languages. No translation from XQuery to SQL is performed. DB2’s compiler and optimizer are extended to handle SQL and XQuery in a single modelling framework. An overview of the hybrid engine.

Integrating XML and relational data

Integrating XML and relational data

pureXML storage overview
In DB2 pureXML, the XML documents are stored on disk pages in tree structures matching the XML data model. This avoids the mapping between XML and relational structures, and the impedance (hindrance) mismatch. XML is a data type in DB2, just like any other SQL type.

XML data type can be used in a CREATE TABLE statement to define one or more columns of type XML. Since XML has no different status than any other types, tables can contain any combination of XML and relational columns. A column of type XML can hold one well-formed XML document for every row of the table. The NULL value is used to indicate the absence of an XML document.

Even though every XML document is logically associated with a row of a table, XML and relational columns are stored differently. XML schema is not required in order to define an XML column or to insert or query XML data. An XML column can hold schema-less documents as well as documents for many different or evolving XML schemas.

Schema validation is optional on a per document basis. Thus, the association between schemas and documents is per document and not per column, which provides maximum flexibility. Unlike a varchar or a character large object (CLOB) type, the XML type has no length associated with it. Currently, only the client-server communication protocol limits XML bind-in and bind-out to two GB per document. With very few exceptions, this is acceptable for all XML applications.

You can use the XML type not only as a column type but also as a data type for host variables in languages such as C, Java, and COBOL. XML type is also allowed for parameters and variables in SQL stored procedures, user defined functions (UDFs), and externally stored procedures written in C and Java.

An application can manage XML documents in the best way suited for the application. These ways include shredding XML data into relational tables or just storing them as CLOBs. In DB2 pureXML, XML data is stored in a parsed, annotated tree form, similar to (but distinct from) the Document Object Model (DOM). The XML data is formatted to data pages, which are buffered. The benefits of this format include faster navigation, which results in faster query execution as well as simpler indexing of data. Below it shows a DB2 CREATE TABLE statement with XML data type, and how relational and XML data is stored in DB2.

Creating table with XML data type

Creating table with XML data type

Throughout this chapter, we have used examples of XML data, relational data, XQueries, SQL queries, and so on. These examples are based on Movie data. The examples might not be entirely consistent because they were produced to illustrate specific points made in the text and they might not be always complete. The actual XML data and steps to create them in your environment have been discussed in detail in Appendix A, “Setup procedure and sample data”. In summary, we have used two database tables..

Movie database table

Movie database table

Moviereview database table

Moviereview database table

The sample XML data for Movie is shown below

Movie.xml

The sample data for Movie review is shown below

Moviereview.xml

You can find the complete data , “Setup procedure and sample data” . All the example queries in this chapter have been tested against the movie and moviereview data in the DB2 9 database.

SQL support for XML data (INSERT, SELECT)
In DB2 9, SML is a new supported data type. In the CREATE TABLE statement, the XML data type can be specified as a column data type. Just like other data types, while inserting an XML value in the column of type XML, DB2 makes sure that the data type of a value is XML.

XML value is made of nodes of different data types and can be represented and stored in any of the character types, such as CHAR, VARCHAR, CLOB, BLOB, and so on. Before inserting a value in the XML columns, DB2 makes sure that the string representation contains a valid XML document.

Similarly, when selecting the XML value, it should be converted to any of the character data types so that it can be presented or used in the application. DB2 provides functions to explicitly check the correctness of the XML value before inserting and converting the XML value to a string representation while selecting. An XML document can have the encoding information mentioned in the document itself in the declaration section. Such documents are called internally encoded documents. The encoding used for the documents, which do not contain this information, depends on the application code page.

XMLPARSE
You can optionally use the XMLPARSE function to explicitly check the correctness of the XML document in an SQL statement. The input to this function can be any of the character types: CHAR, VARCHAR, and CLOB, or BLOB. The function returns the parsed XML value if it is a valid XML document; otherwise, it throws an error.

As only the valid XML values are allowed in an XML column, the XMLPARSE function is mandatory to use when inserting the XML value either externally or internally. If the XMLPARSE function is not used in an insert statement explicitly, it is used implicitly to make sure that the value is correct XML. This type of parsing is called implicit parsing.

Whenever a character type is passed to the XMLPARSE function, it has an external encoding associated with it. It can be possible that the XML value stored in the character type variable is also internally encoded. The internal code page of the XML value in the character type variable and the external code page associated with the variable should match. If these code pages do not match, a run-time error is thrown. We recommend you not use the encoding attribute whenever a character type is used to represent XML data. For binary data types, there is no code page associated with the type, the XML value is treated as encoded in the internal code page specified in the XML declaration. In case there is not internal encoding associated with the value, XML is treated as unicode.

Tip: Implicit parsing is the recommended way to work with XML data because it takes care of any encoding conversion in the best possible manner.

Below it gives an example of an INSERT statement where XMLPARSE is used explicitly to make sure that the value is well formed XML.

XMLParse function

A host variable or parameter of type XML cannot be given as input to the XMLPARSE function. They are implicitly parsed by the database. XMLPARSE does not perform validation against a schema and only provides the parsing functionality to check the correctness of the XML value and how well-formed the XML value is. The full syntax for the XMLPARSE is:

XMLPARSE (DOCUMENT <String Value> PRESERVE/STRIP WHITESPACE

XMLPARSE provides two options for extra whitespace processing while parsing the string value:

  • PRESERVE WHITESPACE: Preserve the whitespace in the string value.
  • STRIP WHITEPACE: Remove the extra whitespace. This is the default action if no option is provided.

The DOCUMENT keyword specifies that the string value is a well-formed XML document which conforms to XML Version 1.0.

XMLVALIDATE
You can use the XMLVALIDATE function to validate the XML value against a schema. The schema for an XML value defines the structure of the XML values including node data types, occurrences, default values, and so on. The schema document should be registered to the database before using it for the validation.

Below gives an example of the XMLVALIDATE function. This example assumes that the schema is registered with the name review.

XMLVALIDATE function

XMLVALIDATE takes a well-formed XML value as an input. If a value of character type is passed to this function, the value is implicitly parsed before validation.

XMLSERIALIZE
When selecting an XML column from the database table, it should be serialized to CHAR, VARCHAR, CLOB, or BLOB data types so that the value can be used by the application. Use the XMLSERIALIZE function for this same purpose. It serializes an XML value to its textual representation. XMLSERIALIZE takes an XML value or XML column name as input in an SQL statement.

When serializing the XML value, the code page associated with the serialized data depends on the target data type used. For character data types, the encoding is application code page. For binary data types, the encoding associated with the serialized data is UTF-8.

Below it gives an example of how to use the XMLSERIALIZE function when selecting an XML column.

Using XMLSERIALIZE function

If the SERIALIZE function is not used when selecting an XML column, the XML value is serialized to BLOB value implicitly.

Schema support
XML schema defines the structure of the XML document, including the element definition, attribute definition, namespaces, code pages, and so on. These schemas can be used to validate the XML value before inserting these values into the database table. Generally these schemas are stored in the file system and need to be registered to the database before using them for validation. Registering a schema to the database makes the schema a database entity and removes the dependency of reading the content of the schema document from the operating system. DB2 provides two ways of registering the schema to the database.

Registration using CLP commands
DB2 introduced new CLP commands to register a schema to the database. To register the database using the CLP commands, follow these steps:

  1. Register the primary schema document.
  2. The primary schema document is the one at the top of the import/include hierarchy. Register the main schema document using the following command:

    The terms in the command are as follows:

    – Namespace Uniform Resource Identifier (URI): Namespace of the XML document which will be referenced from XML documents.
    – Location URI: Physical location of the schema file in the file system.
    – Property URI: Physical location of the property document associated with the schema document in the file system.

  3. Add the secondary schemas document.
  4. Add the XML schema document referenced (using the include or import tag) in the main schema document or any of the already added secondary schema documents:
  5. Complete the registration using:

The complete XMLschema command throws an error in the case where all the schema documents referenced in the primary or secondary schema documents are not added to the main schema. For the include tag, the namespace of the primary schema and the included schema should be the same. For the import tag, it is different and should be mentioned in the add xmlschema command.

All the three steps can be combined in one command, register xmlschema, by adding the add and complete clause.

Below it gives the schema example for the REVIEW column in the MOVIEREVIEW table.

Schema for review column

Below it gives the command to register this schema to the database. Because we are not importing any schema within the main schema, there is no need to use the add xmlschema CLP command.

Registering schema

register xmlschema http://movies.org from review.xsd as review complete xmlschema review

Registration using stored procedures
You cannot use CLP commands in the higher programming languages. To register an XML schema using an application program in one of the higher programming languages, DB2 provides three stored procedures equivalent to the three CLP commands

  • SYSPROC.XSR_REGISTER
  • This stored procedure can be used to register the primary XML schema:
    The terms in the command are explained as follows:
    – rschema: Relational schema name for the XML schema. This parameter is of the type VARCHAR(128). NULL value for this argument specifies the CURRENT SCHEMA.
    – name: Relational identifier for the XML schema. This parameter is of the type VARCHAR(128). The fully qualified name for the schema ID becomes rschema.name. – schemalocation: Namespace of the primary XML schema. This parameter is of the type VARCHAR(1000). – content: Content of the XML schema document. This parameter is of the type BLOB(30M). This is a non-nullable parameter. – docproperty: Content of the property document associated with the schema. This parameter is of the type BLOB(5M).
  • SYSPROC.XSR_ADDSCHEMADOC
  • Use this stored procedure to add the XML Schema documents to the primary schema:
    The explanations for the terms in the command are:
    – rschema: Relational schema for XML schema
    – name: Relational Schema name for primary XML schema
    – schemalocation: Namespace for the XML Document
    – content: Content of the XML document as a BLOB variable
    – docproperty: Property document associated with the XML schema document
  • SYSPROC.XSR_COMPLETE
  • Use this procedure to complete the registration:
    The terms in the command are explained as follows:
    – rschema: Relational schema for the XML schema.
    – name: Relational identifier of the primary XML schema.
    – schemaproperty: An input argument of type BLOB (5M) that specifies properties, if any, associated with the XML schema. The value for this argument is either NULL, if there are no associated properties, or an XML document representing the properties for the XML schema.
    – isusedforshred: The value for this parameter is true if the schema is used for decomposition, otherwise, the value is false.

Annotated XML schema decomposition
XML decomposition or shredding is the process of breaking down an XML document into columns of relational tables. Consider that your organization can have an existing relational database repository and business applications built on top of this data, and you want to capture new data coming as XML and use it with existing applications. Consider another case of a need to shred XML documents conforming to different XML schemas in a single relational table. Using the new XML decomposition feature of DB2 Express-C V9 is the right choice for both of these situations.

XML decomposition
XML decomposition can be helpful because:

  • It can break one large XML document into fragments that can be stored in column types such as XML, CLOB, VARCHAR, BLOB, and so on.
  • It can shred multiple items into the same tables-column pair.
  • It can automatically detect one-to-many relationships between columns mapped to the same table.
  • It performs type conversions from an XML schema type to DB2 type.
  • It can validate during shredding.
  • It can specify custom expression on contents before insertion.
  • It can specify path conditions, for example, shred <b> only if the element is in path /a/b.
  • It can specify conditions on tuples.

Performing decomposition
To decompose an XML data, follow these steps:

  1. Create the tables (relational schema) into which you decompose the XML data.
  2. Annotate your XML Schema with mapping information:
  3. – Annotate manually.
    – Use DB2 Developer Workbench.
    – Migrate existing DADs to Annotated XML Schema using tool.
  4. Register it in the XML Schema Repository via a CLP command.
  5. REGISTER XMLSCHEMA ’http://myOrderSchema/order.xsd’ FROM ’file://c:/temp/order.xsd’ AS user1.myOrderSchema COMPLETE ENABLE DECOMPOSITION
  6. Register via the stored procedure.
  7. a. Set the value of the “isusedforshred” to true.
  8. Register via the JDBC function.
  9. Decompose XML documents via the CLP command.
  10. DECOMPOSE XML DOCUMENT c:mydoc.xml XMLSCHEMA john.myschema
  11. Decompose documents by calling the stored procedure.
  12. Choose SP depending on the size of the document to be shredded:
<1MB,<10MB,<25MB,50MB,<75 MB,<100MB XDBDECOMPXML XDBDECOMPXML10MB XDBDECOMPXML25MB XDBDECOMPXML50MB XDBDECOMPXML75MB XDBDECOMPXML100MB

We recommend the new Annotated schema decomposition feature of DB2 9 over the old XML Extender shipped with previous releases of DB2 because the new technique is much faster if XML schema is used, and it has more mapping constructs and other features.

This article also provides DAD mapping techniques and a schema conversion utility. The DAD to annotated XML schema converter utility helps users convert their XML schemas to annotated XML schemas based on the mapping rules described in the DAD. The XML schema and DAD must describe the same set of XML documents, although it is possible and acceptable that the XML schema can describe a super set of documents described by the DAD. Users who do not have an XML schema can easily generate XML schemas from corresponding DTDs and even XML documents by using tools available freely on the Internet. The DAD to annotated XML schema converter tool can then take the XML schema and the DAD as input to produce an annotated XML schema.

The tool supports import, include, and redefine construct of the XML schema. In other words, if an XML schema is spread across many XML schema documents through import, include, or redefine, only the path for the primary schema document, the document through which all XML schema documents can be reached through either import, include, or redefine, is needed. The tool annotates element or attribute declarations across schema documents. Note that since DADs do not support namespaces, it is impossible to have the use of an import construct in this scenario.

Tip: The tables and columns must preexist before the annotated XML schema is enabled for decomposition. All missing objects (tables or columns) to which the XML schema refers are reported as errors. Also the CLP command automatically calls the correctly sized stored procedure depending on the size of the XML document. If a stored procedure with 100 MB is called for documents smaller than 1 MB, then the application’s performance might be adversely affected due to high memory consumption.

XML query support
For querying XML values, DB2 supports XQuery language, which you can use in conjunction with SQL to retrieve data from the XML columns. So, along with the SQL interface, DB2 provides another interface, XQuery, to query the XML values. These interfaces interact with each other using SQL or XML functions.

Using these functions, XQuery can be embedded into an SQL statement to fetch the part of the XML document from each row of the SQL statement result. To apply the pure XQuery on a sequence of XML documents fetched from the database, DB2 provides resource functions to fetch the XML values.

This section discusses the following:

  • Functions used to embed SQL statement in an XQuery (XML source functions for XQuery)
  • Writing XQuery
  • Function used to embed XQuery statements in SQL (SQL or XML functions)

XML source function
DB2 XQuery can be applied either on the data stored in the XML column of the tables or an XML-created document using the constructor functions. For applying XQuery on XML column data, XML values must be fetched from the column to create a sequence of XML values. DB2 provides the following function to create a sequence of XML values by fetching the data from XML columns.

xmlcolumn
The xmlcolumn function takes the qualified name of the XML column and returns the sequence of XML documents stored in the XML column.

The syntax for the xmlcolumn function is:

Default value for the schema is the current schema. Below gives the example for an xmlcolumn applied on the info column of the movie table.

xmlcolumn applied on info column of the movie table

sqlquery
Instead of querying all the XML values in an XML column (as in the case of xmlcolumn functions), we can only query a set of XML values based on a condition in a SELECT statement using the sqlquery function, for example, querying only the information regarding the movie with ID value 123. For this purpose, the sqlquery function provides you with the option to give the SQL full select as an input to the function instead of just the column name. If you give the full select as an input to this function, you should select XML values only. The function returns the concatenation of all the values selected by the full select. Below shows how to use the sqlquery functions to select the XML values.

sqlquery

Writing XQuery
The XQuery language queries the XML data. XQuery works on a sequence of XML documents generated by either using the resource functions.constructor function or created as an intermediate result of another XQuery. Use XQuery, along with SQL/XML functions, to query the XML documents stored in the DB2 9 database.

Tip: DB2 is case insensitive and treats all the table and column names in capital letters while XML and XQuery are case sensitive. The resource functions previously discussed are XQuery interface functions so all the table names and column names should be passed to these functions using capital characters or letters. Passing the object names in lower case letters can result in an undefined object name error.

Tip: xmlcolumn and sqlquery are xquery interface functions and are case sensitive. Trying to use these functions in upper case letters results in an error.

Paths and predicates
XQuery queries the XML data based on the path of the nodes in the XML hierarchical structure. The nodes in the XML document relate to each other. The relationships that are supported between two nodes are:

  • Child
  • Attribute
  • Descendant
  • Parent
  • Self
  • Descendant or self

You can query an XML node based on these relationships by providing the path of the node in the XML value.

Below it shows a small query, which select all movies from the MOVIES.INFO table. A path based on the previous relationship is used to reach the title node in the XML value.

Axis and path in XQuery

The forward slash (/) in the query indicates the next step in the path followed by the axis (relationship) to move and the node name.

The query uses the relationship name to indicate the next step or direction in the path. This format is called the abbreviated version. There is also the unabbreviated format, which uses the name of the relationship to indicate the next step.compares the syntax between the unabbreviated and abbreviated formats. The abbreviated format that is defined for each relationship reduces the size of the query and is the commonly used format.

Abbreviated and unabbreviated syntax

Abbreviated and unabbreviated syntax

So the default axis is the child axis. The attribute axis can be represented by the at sign (@). Similarly, the descendant axis can be represented by two forward slashes (//).

You can filter the values selected by the path in the Query by a condition in the same way we filter the value using the WHERE clause of the SELECT statement. You do this by using the predicate in XQuery. Let us select all the US-based movies from our movies database. Example gives you the query to get the required result.

Predicate in XQuery

The predicate in an XPath starts with a bracket ([) and ends with the closing bracket (]). A predicate is evaluated for the given condition and returns true if it satisfies the condition. The nodes for which the predicate returns true are returned as a result of the current step. The predicate can start from the root node or from the relative path with respect to the current node. Putting (/) at the start of the predicate makes it start from the root node. A predicate can go up to any level in the XML value using the paths and return back to the original node after evaluating the condition. The following comparison expressions are supported in the predicate:

  • Value comparison (ne, eq, gt, ge, lt, le)
  • These operators are used to compare the values of the same data types. For example:
    A ne B
    A and B should be of the same data type.
  • General comparison (!=, =, >, <, <=, >=)
  • These operators are used to compare the sequence. If any of the sequence members satisfy the operator with the corresponding value in the other sequence, the operator returns true:
    – (1,2)=(2,3) returns true.
    – (1,2)=(3,4) returns false.
    Similarly:
    – (1,2)!=(2,3) returns true because at least one set is satisfying the operator.
    – (1,1)!=(1,1) returns false.
    Apart from these, you can use the logical operators AND and OR.

Constructors and FLWOR expressions
DB2 XQuery allows us to create our own XML structure from the existing XML values. To do so, it allows XML style constructors to create the XML document.Below it shows a simple query that will get all of the five star movie titles (with rating as “*****”) from the table and put into a new tag movie-list

Constructors in XQuery

DB2 supports FLWOR expressions to allow you more flexibility to restructure the existing XML values. FLWOR means for, let, where, order by, and return.

FLWOR expression syntax is:

The for and let clauses in a FLWOR expression generate an ordered sequence of tuples of bound variables. While for iterates over the different values in the sequence, the let clause makes one binding with the value of full sequence. The optional where clause serves to filter the tuple stream, retaining some tuples and discarding others.

Use the optional order by clause to reorder the tuple stream. The return clause constructs the result of the FLWOR expression. The return clause is evaluated once for every tuple in the tuple stream, after filtering by the where clause and ordered by order by clause, using the variable bindings in the respective tuples.

Tip: When putting an XQuery inside the XML tag, enclose it in braces ({}) to indicate that DB2 puts the result of the XQuery in the output; otherwise, DB2 takes the XQuery as a normal string and we see the normal XQuery text in the output.

Below it shows a simple FLWOR expression, which selects the five star (rating is “*****”) movie’s name.

Using FLWOR expression

You can nest the for and let clauses of the FLWOR expression to any level. The nesting of these clauses lets us combine different parts of the XML value. Example shows the nesting of these clauses in conjunction with the constructors. The query selects the movies based on their MPAA rating, MPAArating.

Nesting FLOWOR expression


Use the order by clause to order the tuples based on the path expression. These ordered tuples are then processed by the return clause.Below it gives an example of the order by clause. The query selects the movie name with the production studio order by the rating and the year of production.

Using order by clause

Tip: When nesting the for and let clauses, keep in mind that the for clause creates one binding for each XML value in the sequence, and the let clause creates only one binding for the full sequence.

The FLWOR expression can be used to join XML values from different columns, too. Below it gives an example query, which selects the reviews from the MOVIEREVIEW table for all the movies from RR Pictures production.

Joining two XML column values

SQL/XML functions
The XQuery and SQL interfaces interact with each other using SQL/XML functions. Use these functions to embed the XQuery in the SQL/XML statement or SQL in the XQuery statement.

xmlquery function
You typically use this function in the column list of the SELECT statement to select a part of the document instead of the full value. This function is useful when you need to select the part of the XML value based on the condition on a rational column. Another significant advantage of this function is that it allows passing the relational column to the XQuery via its passing by clause. The table name for the column, to which the passing by clause refers, should be present in the FROM clause of the SELECT query.

Below it gives an example for xmlquery function. The function selects the movie title with id=123.

Using xmlquery function

xmlquery function returns the single XML value for each row selected by the SELECT statement. The result of the xmlquery function should be a single XML value per row of the result set (xmlquery function throws an error when the result is a sequence of XML values for a single row in the result set).

xmltable function
Use this function generally in the FROM clause of the SELECT statement to create a table from the XML value. Similar to the xmlquery function, you can use the passing by clause here to pass any XML value or relational value to the function. Below it shows an example that uses the xmltable function to create the relational table.

Creating relational table using xmltable

The xmltable function is useful when you want to move the data from an XML document to the relational table and you do not have the schema for the documents (or when you do not have annotated schema to do the same).

xmlexists predicate
xmlexists tests whether an XQuery expression returns a sequence of one or more items. xmlexists is similar to the xmlquery function except the return type. When xmlquery function returns the XML value, xmlexists returns the Boolean value. It returns true if the result of the xquery used in the function is not empty,

Tip: Use xmlcolumn and sqlquery resource functions inside the xmlquery function. But because these functions are applied to all the rows to concatenate the values first before applying the xquery path expression, the result of the xmlquery function might remain the same for all the rows selected by the SELECT clause.

To apply the XQuery on each individual row selected by the SELECT statement separately, use the passing by clause. The passing by clause passes the XML column value to the xquery for each row and returns the value along with other column values selected. All the tables to which the passing by clause refers should be there in the FROM clause of the SELECT statement. otherwise, it returns false. xmlexists also uses the passing by clause to pass the relational and XML column to the xquery and is useful when the filter condition is based on the comparison of the relational column with the XML node value. xmlexists is generally used in the WHERE clause of the SELECT statement.shows you a query, which returns the IDs for all the movies with genres Drama, Romance, and Remake.

Using xmlexists

Use xmlexists in conjuction with the xmlquery function to select the part of the XML document based on a comparison between a node value and relation column value.

below it shows you a query, which selects the movie name and the corresponding reviews as two separate columns of the result set.

Using xmlexists in conjuction with xmlquery

xmlcast function
The xmlcast function is used to cast the XML value to other relational data types. It is equivalent to the existing cast function for relational data. xmlcast first resolves the XML value to an XQuery atomic data type and then does the conversion to the relational data type. Example shows the usage of the xmlcast function. The resulting columns are of XML type. Using xmlcast function, the values can be casted to other data types.

Using xmlcast

Constructor function (publishing functions)
An XML value to be inserted into the XML type column can either be retrieved from the application or can be created from the other relational columns. DB2 provides publishing functions to create a new XML value from the existing relational columns. Functions provided by DB2 to create a single node for an XML value include:

  • XMLELEMENT
  • XMLELEMENT function can be used to create a new element node for an XML value. This function takes inputs, the name of the element, any attribute or namespace declaration, and the value of the node. The value of the node can be a complex type (concatenation of other elements) or a value of a relational column.

  • XMLATTRIBUTES
  • XMLATTRIBUTES function is used to create the attributes of an element node. Because an attribute is always tied with the element, this function can be used only inside the XMLELEMENT function to create the attribute. This function takes input, the name of the attribute, and the value (any relational column).

  • XMLNAMESPACE
  • XMLNAMESPACE is used to declare a namespace for an XML value.
  • XMLCOMMENT
  • XMLCOMMENT function is used to create the comment node for the XML value. As a comment node can occur anywhere in the XML document, this function can be used anywhere to create the comment node.
  • XMLDOCUMENT
  • XMLDOCUMENT function creates a document node for an XML value. Each XML value stored in the XML column must have a document node. A document node is associated with the XML value with a single root node. It is a logical way of making sure that the XML value has a single root node and not a sequence of XML nodes.
  • XMLPI
  • XMLPI function creates a processing instruction node for the XML value.
  • XMLTEXT
  • XMLTEXT function can be used to create a text node in an XML value. The input to this function is the value of the text node. Functions to concatenate or aggregate more than one node include:
  • XMLCONCAT
  • XMLCONCAT function concatenates the multiple XML nodes and returns the sequence of the nodes as a result. XMLCONCAT is a scalar function and only returns one XML value per row of the input expression. It takes only the existing XML value or column as input.
  • XMLFOREST
  • XMLFOREST function creates a new list of element nodes. It takes the values of any data type as input and creates an element for each value given.
  • XMLAGG
  • XMLAGG is an aggregate function. It takes a set of rows of XML values and produces a single XML result.

Below it shows how to use these functions to create an XML value from relational and XML columns. The example combines the information and reviews for a movie in a single XML value.

Using publishing functions

XML indexing
Indexes over XML data can improve the performance of queries on XML columns. Similar to relational index, an XML index over XML data indexes an entire column; however, there are some externally visible differences between relational indexes and XML indexes as follows:

  • Indexes are created on columns of type XML based on path expressions (xml pattern): a subset of XPath that does not contain predicates among other things.
  • When creating an index, it is possible to specify what paths to index and what types. Use the type that you want to use in your queries.
  • You can only index on a single XML column, composite indexes are not allowed at this time. Elements and attributes inside the document frequently used in predicates and cross-document joins can be indexed.
  • If a node matches the xmlpattern but fails to cast to the specified index type, then no index entry is created for the node without raising an error.
  • There could be zero, one, or multiple index entries per document (row) based on how many nodes match an xmlpattern.
  • A single document can contain zero, one, or multiple nodes that match the xmlpattern. Thus, there can be zero, one, or multiple index entries for a single row in a table (significantly different than indexes on relational columns).
  • Any nodes that match the path expression or the set of path expressions in XML that is stored in that column are indexed, and the index points to the node in storage that is linked to its parent and children for fast navigation.

XML values index
XML values index is a new type of index that can be created when users want efficient evaluation of xmlpattern expressions to improve performance during queries on XML documents. Unlike the traditional relational indexes where index keys are composed of one or more table columns specified by the user, the XML values index uses a particular XML pattern expression to index paths and values in XML documents stored in a single XML column.

The following example shows the statement to create an XML index on XML data, which defines an index on all movie titles in all documents in the XML column movieDetails as shown:

The xmlpattern is a path, which identifies the XML nodes to be indexed. It is called xmlpattern and not xpath because only a subset of the XPath language is allowed in index definitions. (Wildcards //,*, and namespaces are allowed, but XPath predicates such as /a/b[c=5] are not supported).

Since we do not require a single XML schema for all documents in an XML column, DB2 might not know which data type to use in the index for a given xmlpattern. Thus, you must specify the data type explicitly in the as sql <type> clause. The following types can be used:

  • VARCHAR(n)
  • For nodes with values of a known maximum length.
  • VARCHAR HASHED
  • For nodes with values of arbitrary length. In this case, the index contains hash values of the actual strings. Such an index can be u

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

IBM DB2 Topics