Oracle® XML DB Developer's Guide 11g Release 2 (11.2) Part Number E23094-02 |
|
|
PDF · Mobi · ePub |
This chapter explains the fundamentals of XPath rewrite for structured (object-relational) storage in Oracle XML DB. It details the rewriting of XPath-expression arguments to various SQL functions.
This chapter contains these topics:
See Also:
"Performance Tuning for XQuery"Oracle XML DB can often optimize queries that use XPath expressions — for example, queries involving SQL functions such as XMLQuery
, XMLTable
, XMLExists
, and updateXML
, which take XPath (XQuery) expressions as arguments. The XPath expression is, in effect, evaluated against the XML document without ever constructing the XML document in memory.
This optimization is called XPath rewrite. It is a proper subset of XML query optimization, which also involves optimization of XQuery expressions, such as FLWOR expressions, that are not XPath expressions. XPath rewrite also enables indexes, if present on the column, to be used in query evaluation by the Optimizer.
The XPath expressions that can be rewritten by Oracle XML DB are a proper subset of those that are supported by Oracle XML DB. Whenever you can do so without losing functionality, use XPath expressions that can be rewritten.
XPath rewrite can occur in these contexts (or combinations thereof):
When XMLType
data is stored in an object-relational column or table (structured storage) or when an XMLType
view is built on relational data.
When you use an XMLIndex
index. See "XMLIndex".
When XMLType
data is stored as binary XML. See "How Oracle XML DB Processes XMLType Methods and SQL Functions" for information about streaming evaluation.
This chapter covers the first case: rewriting queries that use structured XML data or XMLType
views. The XMLType
views can be XML schema-based or not. Structured storage of XMLType
data is always XML schema-based. Examples in this chapter are related to XML schema-based tables.
Example 8-1 illustrates XPath rewrite for a simple query that uses an XPath expression.
SELECT po.OBJECT_VALUE FROM purchaseorder po WHERE XMLCast(XMLQuery('$p/PurchaseOrder/Requestor' PASSING po.OBJECT_VALUE AS "p" RETURNING CONTENT) AS VARCHAR2(128)) = 'Sarah J. Bell';
The XMLCast(XMLQuery...))
expression here is rewritten to the underlying relational column that stores the requestor information for the purchase order. The query is rewritten to something like the following:Foot 1
SELECT OBJECT_VALUE FROM purchaseorder p WHERE CAST (p."XMLDATA"."REQUESTOR" AS VARCHAR2(128)) = 'Sarah J. Bell';
Table 8-1 describes some XPath expressions that are rewritten during XPath rewrite.
Table 8-1 Sample of XPath Expressions that Are Rewritten to Underlying SQL Constructs
XPath Expression for Translation | Description |
---|---|
Simple XPath expressions (expressions with
|
Involves traversals over object type attributes only, where the attributes are simple scalar or object types themselves. |
Collection traversal expressions:
|
Involves traversal of collection expressions. The only axes supported are child and attribute axes. Collection traversal is not supported if the SQL function is used during a |
Predicates:
|
Predicates in the XPath are rewritten into SQL predicates. |
List index (positional predicate):
|
Indexes are rewritten to access the nth item in a collection. |
Wildcard traversals:
|
If the wildcard can be translated to one or more simple XPath expressions, then it is rewritten. |
Descendant axis (XML schema-based data only), without recursion:
|
Similar to a wildcard expression. The |
Descendant axis (XML schema-based data only), with recursion:
|
The
|
XPath functions |
Some XPath functions are rewritten. These functions include |
See Also:
"Performance Tuning for XQuery" for information about rewrite of XQuery expressionsThis section presents some guidelines for using execution plans to do the following, for queries that use XPath expressions:
Analyze query execution, to determine whether XPath rewrite occurs.
Optimize query execution, by using secondary indexes.
Use these guidelines together, taking all that apply into consideration.
As is true also for the rest of this chapter, this section is applicable only to XMLType
data that is stored object-relationally (structured storage).
XPath rewrite for object-relational storage means that a query that selects XML fragments defined by an XPath expression is rewritten to a SQL SELECT
statement on the underlying object-relational tables and columns. These underlying tables can include out-of-line tables.
See Also:
"XPath Rewrite for Out-Of-Line Tables"The execution plan of a query that has been rewritten refers to the object-relational tables and columns that underlie the queried XMLType
data.
The names of the underlying tables can be meaningful to you, if they are derived from XML element or attribute names or if the governing XML schema explicitly names them by using annotation xdb:defaultTable
. Otherwise, these names are system-generated and have no obvious meaning. In particular, they do not reflect the corresponding XML element or attribute names. Also, some system-generated columns are hidden. You do not see them if you use the SQL describe
command. They nevertheless show up in execution plans.
The plan of a query that has not been rewritten shows only the base table names, and it typically refers to user-level XML functions, such as XMLExists
. Look for this difference to determine whether a query has been optimized. The XML function name shown in an execution plan is actually the internal name (for example, XMLEXISTS2
), which is sometimes slightly different from the user-level name.
Example 8-2 shows the kind of execution plan output that is generated when Oracle XML DB cannot perform XPath rewrite. The plan here is for a query that uses SQL/XML function XMLExists
. The corresponding internal function XMLExists2
appears in the plan output, indicating that the query is not rewritten.
Example 8-2 Execution Plan Generated When XPath Rewrite Does Not Occur
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(XMLEXISTS2('$p/PurchaseOrder[User="SBELL"]' PASSING BY VALUE
SYS_MAKEXML('61687B202644E297E040578C8A175C1D',4215,"PO"."XMLEXTRA","PO"."X
MLDATA") AS "p")=1)
In this situation, Oracle XML DB constructs a pre-filtered result set based on any other conditions specified in the query WHERE
clause. It then filters the rows in this potential result set to determine which rows belong in the result set. The filtering is performed by constructing a DOM on each document and performing a functional evaluation (using the methods defined by the DOM API) to determine whether or not each document is a member of the result set.
When designing an XML schema, use annotation xdb:defaultTable
to name the underlying tables that correspond to elements that you select in queries where performance is important. This lets you easily recognize them in an execution plan, indicating by their presence or absence whether the query has been rewritten.
A query resulting from XPath rewrite sometimes includes a SQL predicate (WHERE
clause). This can happen even if the original query does not use an XPath predicate, and it can happen even if the original query does not have a SQL WHERE
clause.
When this happens, you can sometimes improve performance by creating an index on the column that is targeted by the SQL predicate, or by creating an index on a function application to that column. Example 8-1 illustrates XPath rewrite for a query that includes a WHERE
clause. Example 8-3 shows the predicate information from an execution plan for this query.
Example 8-3 Analyzing an Execution Plan to Determine a Column to Index
Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(CAST("PURCHASEORDER"."SYS_NC00021$" AS VARCHAR2(128))='Sarah J. Bell' AND SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege xmlns="http://xmlns.oracle.com/xdb/acl.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd http://xmlns.oracle.com/xdb/acl.xsd DAV:http://xmlns.oracle.com/xdb/dav.xsd "><read-properties/><read-contents/></privilege>'))=1)
The predicate information indicates that the expression XMLCast(XMLQuery...))
is rewritten to an application of SQL function cast
to the underlying relational column that stores the requestor information for the purchase order, SYS_NC0021$
. This column name is system-generated. The execution plan refers to this system-generated name, in spite of the fact that the governing XML schema uses annotation SQLName
to name this column REQUESTOR
.
Because these two names (user-defined and system-generated) refer to the same column, you can create a B-tree index on this column using either name. Alternatively, you can use the extractValue
shortcut to create the index, by specifying an XPath expression that targets the purchase-order requestor data. Example 8-4 shows these three equivalent ways to create the B-tree index on the predicate-targeted column.
Example 8-4 Creating an Index on a Column Targeted by a Predicate
CREATE INDEX requestor_index ON purchaseorder ("SYS_NC00021$"); CREATE INDEX requestor_index ON purchaseorder ("XMLDATA"."REQUESTOR"); CREATE INDEX requestor_index ON purchaseorder (extractvalue(OBJECT_VALUE, '/PurchaseOrder/Requestor'));
However, for this particular query it makes sense to create a function-based index, using a functional expression that matches the one in the rewritten query. Example 8-5 illustrates this.
Example 8-5 Creating a Function-Based Index for a Column Targeted by a Predicate
CREATE INDEX requestor_index ON purchaseorder (cast("XMLDATA"."REQUESTOR" AS VARCHAR2(128)));
Example 8-6 shows an execution plan that indicates that the index is picked up.
Example 8-6 Execution Plan Showing that Index Is Picked Up
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 524 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| PURCHASEORDER | 1 | 524 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | REQUESTOR_INDEX | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
xmlns="http://xmlns.oracle.com/xdb/acl.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd
http://xmlns.oracle.com/xdb/acl.xsd
DAV:http://xmlns.oracle.com/xdb/dav.xsd">
<read-properties/><read-contents/></privilege>'))=1)
2 - access(CAST("SYS_NC00021$" AS VARCHAR2(128))='Sarah J. Bell')
In the particular case of this query, the original functional expression applies XMLCast
to XMLQuery
to target a singleton element, Requestor
. This is a special case, where you can as a shortcut use such a functional expression directly in the CREATE INDEX
statement. That statement is rewritten to create an index on the underlying scalar data. Example 8-7, which targets an XPath expression, thus has the same effect as Example 8-5, which targets the corresponding object-relational column.
Example 8-7 Creating a Function-Based Index for a Column Targeted by a Predicate
CREATE INDEX requestor_index ON purchaseorder po (XMLCast(XMLQuery('$p/PurchaseOrder/Requestor' PASSING po.OBJECT_VALUE AS "p" RETURNING CONTENT) AS VARCHAR2(128)));
See Also:
"Indexing Non-Repeating text() Nodes or Attribute Values" for information about using the shortcut ofXMLCast
applied to XMLQuery
and the extractValue
shortcut to index singleton dataIf a collection is stored as an ordered collection table or an XMLType
instance, then you can directly access members of the collection. Each member of the collection becomes a row in a table, so you can access it directly with SQL.
You can often improve performance by indexing such collection members. You do this by creating a composite index on (a) the object attribute that corresponds to the collection XML element or its attribute and (b) pseudocolumn NESTED_TABLE_ID
.
Example 8-8 shows the execution plan for a query to find the Reference
elements in documents that contain an order for part number 717951002372 (Part
element with an Id
attribute of value 717951002372
). The collection of LineItem
elements is stored as rows in the ordered collection table lineitem_table
.
Note:
Example 8-8 does not use thepurchaseorder
table from sample database schema OE
. It uses the purchaseorder
table defined in Example 3-13. This table uses an ordered collection table (OCT) named lineitem_table
for the collection element LineItem
.Example 8-8 Execution Plan for a Selection of Collection Elements
SELECT XMLCast(XMLQuery('$p/PurchaseOrder/Reference' PASSING OBJECT_VALUE AS "p" RETURNING CONTENT) AS VARCHAR2(4000)) "Reference" FROM purchaseorder WHERE XMLExists('$p/PurchaseOrder/LineItems/LineItem/Part[@Id="717951002372"]' PASSING OBJECT_VALUE AS "p"); ------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 122 | 16 (13)| 00:00:01 | | 1 | NESTED LOOPS | | | | | | | 2 | NESTED LOOPS | | 1 | 122 | 16 (13)| 00:00:01 | | 3 | SORT UNIQUE | | 1 | 50 | 14 (8)| 00:00:01 | |* 4 | TABLE ACCESS FULL | LINEITEM_TABLE | 1 | 50 | 14 (8)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | LINEITEM_TABLE_MEMBERS | 1 | | 0 (0)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID| PURCHASEORDER | 1 | 72 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - filter("SYS_NC00009$" IS NOT NULL AND "SYS_NC00011$"='717951002372') 5 - access("NESTED_TABLE_ID"="PURCHASEORDER"."SYS_NC0003400035$")
The execution plan shows a full scan of ordered collection table lineitem_table
. This could be acceptable if there were only a few hundred documents in the purchaseorder
table, but it would be unacceptable if there were thousands or millions of documents in the table.
To improve the performance of such a query, you can create an index that provides direct access to pseudocolumn NESTED_TABLE_ID
, given the value of attribute Id
. Unfortunately, Oracle XML DB does not allow indexes on collections to be created using XPath expressions directly. To create the index, you must understand the structure of the SQL object that is used to manage the LineItem
elements. Given this information, you can create the required index using conventional object-relational SQL.
In this case, element LineItem
is stored as an instance of object type lineitem_t
. Element Part
is stored as an instance of SQL data type part_t
. XML attribute Id
is mapped to object attribute part_number
. Given this information, you can create a composite index on attribute part_number
and pseudocolumn NESTED_TABLE_ID
, as shown in Example 8-9. This index provides direct access to those purchase-order documents that have LineItem
elements that reference the required part.
Footnote Legend
Footnote 1: This example uses sample database schemaOE
and its table purchaseorder
. The XML schema for this table is annotated with attribute SQLName
to specify SQL object attribute names such as REQUESTOR
— see Example 3-10. Without such annotations, this example would use p."XMLDATA"."
Requestor
"
, not p."XMLDATA"."
.REQUESTOR
"
.