source: trunk/doc/src/xml-processing/xquery-introduction.qdoc@ 968

Last change on this file since 968 was 846, checked in by Dmitry A. Kuminov, 15 years ago

trunk: Merged in qt 4.7.2 sources from branches/vendor/nokia/qt.

/****************************************************************************
**
** Copyright (C) 2011 Nokia Corporation and/or its subsidiary(-ies).
** All rights reserved.
** Contact: Nokia Corporation ([email protected])
**
** This file is part of the documentation of the Qt Toolkit.
**
** $QT_BEGIN_LICENSE:FDL$
** Commercial Usage
** Licensees holding valid Qt Commercial licenses may use this file in
** accordance with the Qt Commercial License Agreement provided with the
** Software or, alternatively, in accordance with the terms contained in a
** written agreement between you and Nokia.
**
** GNU Free Documentation License
** Alternatively, this file may be used under the terms of the GNU Free
** Documentation License version 1.3 as published by the Free Software
** Foundation and appearing in the file included in the packaging of this
** file.
**
** If you have questions regarding the use of this file, please contact
** Nokia at [email protected].
** $QT_END_LICENSE$
**
****************************************************************************/

/*!
\page xquery-introduction.html
\title A Short Path to XQuery

\pagekeywords XPath XQuery
\startpage XQuery
\target XQuery-introduction

XQuery is a language for querying XML data or non-XML data that can be
modeled as XML. XQuery is specified by the \l{http://www.w3.org}{W3C}.

\tableofcontents

\section1 Introduction

Where Java and C++ are \e{statement-based} languages, the XQuery
language is \e{expression-based}. The simplest XQuery expression is an
XML element constructor:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 20

This \c{<recipe/>} element is an XQuery expression that forms a
complete XQuery. In fact, this XQuery doesn't actually query
anything. It just creates an empty \c{<recipe/>} element in the
output. But \l{Constructing Elements} {constructing new elements in an
XQuery} is often necessary.

An XQuery expression can also be enclosed in curly braces and embedded
in another XQuery expression. This XQuery has a document expression
embedded in a node expression:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 21

It creates a new \c{<html>} element in the output and sets its \c{id}
attribute to be the \c{id} attribute from an \c{<html>} element in the
\c{other.html} file.

\section1 Using Path Expressions To Match And Select Items

In C++ and Java, we write nested \c{for} loops and recursive functions
to traverse XML trees in search of elements of interest. In XQuery, we
write these iterative and recursive algorithms with \e{path
expressions}.

A path expression looks somewhat like a typical \e{file pathname} for
locating a file in a hierarchical file system. It is a sequence of one
or more \e{steps} separated by slash '/' or double slash '//'.
Path expressions are meant for traversing XML trees, not file
systems, but QtXmlPatterns can model a file system to look like an
XML tree, so XQuery can also be used to traverse a file
system. See the \l {File System Example} {file system example}.

Think of a path expression as an algorithm for traversing an XML tree
to find and collect items of interest. This algorithm is evaluated by
evaluating each step moving from left to right through the sequence. A
step is evaluated with a set of input items (nodes and atomic values),
sometimes called the \e focus. The step is evaluated for each item in
the focus. These evaluations produce a new set of items, called the \e
result, which then becomes the focus that is passed to the next step.
Evaluation of the final step produces the final result, which is the
result of the XQuery. The items in the result set are presented in
\l{http://www.w3.org/TR/xquery/#id-document-order} {document order}
and without duplicates.

With QtXmlPatterns, a standard way to present the initial focus to a
query is to call QXmlQuery::setFocus(). Another common way is to let
the XQuery itself create the initial focus by using the first step of
the path expression to call the XQuery \c{doc()} function. The
\c{doc()} function loads an XML document and returns the \e {document
node}. Note that the document node is \e{not} the same as the
\e{document element}. The \e{document node} is a node constructed in
memory, when the document is loaded. It represents the entire XML
document, not the document element. The \e{document element} is the
single, top-level XML element in the file. The \c{doc()} function
returns the document node, which becomes the singleton node in the
initial focus set. The document node will have one child node, and
that child node will represent the document element. Consider the
following XQuery:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 18

The \c{doc()} function loads the \c{cookbook.xml} file and returns the
document node. The document node then becomes the focus for the next
step \c{//recipe}. Here the double slash means select all \c{<recipe>}
elements found below the document node, regardless of where they
appear in the document tree. The query selects all \c{<recipe>}
elements in the cookbook. See \l{Running The Cookbook Examples} for
instructions on how to run this query (and most of the ones that
follow) from the command line.
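The distinction between the document node and the document element can
also be observed outside XQuery. The following is a minimal sketch in
Python, using the standard library's \c{xml.dom.minidom} module rather
than QtXmlPatterns. The inline XML is a hypothetical stand-in for
\c{cookbook.xml}, and the DOM \c{Document} object corresponds to the
XQuery document node only by analogy.

```python
from xml.dom import minidom

# Hypothetical stand-in for cookbook.xml (not the real example file).
doc = minidom.parseString(
    "<cookbook><recipe><title>Quick Soup</title></recipe></cookbook>"
)

# The object returned by the parser represents the whole document.
is_document_node = doc.nodeType == minidom.Node.DOCUMENT_NODE

# The document element is the single top-level element, a child of the document.
document_element = doc.documentElement
element_name = document_element.tagName
child_count = len(doc.childNodes)

print(is_document_node, element_name, child_count)
```

In this sketch the document has exactly one child, and that child is the
\c{<cookbook>} document element.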

Conceptually, evaluation of the steps of a path expression is similar
to iterating through the same number of nested \e{for} loops. Consider
the following XQuery, which builds on the previous one:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 19

This XQuery is a single path expression composed of three steps. The
first step creates the initial focus by calling the \c{doc()}
function. We can paraphrase what the query engine does at each step:

\list 1
    \o for each node in the initial focus (the document node)...
    \o for each descendant node that is a \c{<recipe>} element...
    \o collect the child nodes that are \c{<title>} elements.
\endlist

Again the double slash means select all the \c{<recipe>} elements in the
document. The single slash before the \c{<title>} element means select
only those \c{<title>} elements that are \e{child} elements of a
\c{<recipe>} element (i.e., not grandchildren, etc.). The XQuery evaluates
to a final result set containing the \c{<title>} element of each
\c{<recipe>} element in the cookbook.
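The nested-loop paraphrase above can be written out literally. The
following is a rough sketch in Python with the standard library's
\c{xml.etree.ElementTree}, not QtXmlPatterns. The sample document is
hypothetical, and ElementTree starts from the root element rather than
from a document node, so the outermost loop over the initial focus is
omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook; the nested <note><title> is a grandchild of <recipe>.
COOKBOOK = """
<cookbook>
  <recipe><title>Quick Soup</title></recipe>
  <section>
    <recipe>
      <title>Slow Bread</title>
      <note><title>Not a recipe title</title></note>
    </recipe>
  </section>
</cookbook>
"""

root = ET.fromstring(COOKBOOK)

titles = []
for recipe in root.iter("recipe"):   # every <recipe>, at any depth
    for child in recipe:             # direct children only
        if child.tag == "title":     # keep the <title> children
            titles.append(child.text)

# The same selection written as an ElementTree path (a limited XPath subset).
path_titles = [t.text for t in root.findall(".//recipe/title")]

print(titles)
```

Both forms skip the \c{<title>} inside \c{<note>}, because it is a
grandchild of its \c{<recipe>}, not a child.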

\section2 Axis Steps

The most common kind of path step is called an \e{axis step}, which
tells the query engine which way to navigate from the context node,
and which test to perform when it encounters nodes along the way. An
axis step has two parts, an \e{axis specifier}, and a \e{node test}.
Conceptually, evaluation of an axis step proceeds as follows: For each
node in the focus set, the query engine navigates out from the node
along the specified axis and applies the node test to each node it
encounters. The nodes selected by the node test are collected in the
result set, which becomes the focus set for the next step.

In the example XQuery above, the second and third steps are both axis
steps. Both apply the \c{element(name)} node test to nodes encountered
while traversing along some axis. But in this example, the two axis
steps are written in a \l{Shorthand Form} {shorthand form}, where the
axis specifier and the node test are not written explicitly but are
implied. XQueries are normally written in this shorthand form, but
they can also be written in the longhand form. If we rewrite the
XQuery in the longhand form, it looks like this:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 22

The two axis steps have been expanded. The first axis step (\c{//recipe})
has been rewritten as \c{/descendant-or-self::element(recipe)}, where
\c{descendant-or-self::} is the axis specifier and \c{element(recipe)}
is the node test. The second axis step (\c{title}) has been rewritten as
\c{/child::element(title)}, where \c{child::} is the axis specifier
and \c{element(title)} is the node test. The output of the expanded
XQuery will be exactly the same as the output of the shorthand form.
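The axis-plus-node-test structure of the longhand form maps naturally
onto code. The sketch below is illustrative Python, not how
QtXmlPatterns is implemented. The helper names \c{axis_step},
\c{child_axis}, \c{descendant_or_self_axis} and \c{element} are
invented for this example, the sample XML is hypothetical, and the
function removes duplicates but does not re-sort results into document
order.

```python
import xml.etree.ElementTree as ET

def child_axis(node):
    """child:: axis: the direct children of the context node."""
    return list(node)

def descendant_or_self_axis(node):
    """descendant-or-self:: axis: the context node and everything below it."""
    return list(node.iter())

def element(name=None):
    """Build an element(name) node test; element() with no name matches any element."""
    def test(node):
        return isinstance(node.tag, str) and (name is None or node.tag == name)
    return test

def axis_step(focus, axis, node_test):
    """For each node in the focus, walk the axis and keep nodes passing the test."""
    result, seen = [], set()
    for context_node in focus:
        for candidate in axis(context_node):
            if node_test(candidate) and id(candidate) not in seen:
                seen.add(id(candidate))   # drop duplicates
                result.append(candidate)
    return result

root = ET.fromstring(
    "<cookbook>"
    "<recipe><title>Quick Soup</title></recipe>"
    "<recipe><title>Slow Bread</title></recipe>"
    "</cookbook>"
)

# /descendant-or-self::element(recipe)/child::element(title)
recipes = axis_step([root], descendant_or_self_axis, element("recipe"))
titles = axis_step(recipes, child_axis, element("title"))
title_texts = [t.text for t in titles]

print(title_texts)
```

Each call to \c{axis_step} takes the previous result as its focus, which
mirrors how the result of one step becomes the focus of the next.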

To create an axis step, concatenate an axis specifier and a node
test. The following sections list the axis specifiers and node tests
that are available.

\section2 Axis Specifiers

An axis specifier defines the direction you want the query engine to
take, when it navigates away from the context node. QtXmlPatterns
supports the following axes.

\table
\header
    \o Axis Specifier
    \o refers to the axis containing...
    \row
    \o \c{self::}
    \o the context node itself
    \row
    \o \c{attribute::}
    \o all attribute nodes of the context node
    \row
    \o \c{child::}
    \o all child nodes of the context node (not attributes)
    \row
    \o \c{descendant::}
    \o all descendants of the context node (children, grandchildren, etc.)
    \row
    \o \c{descendant-or-self::}
    \o all nodes in \c{descendant} + \c{self}
    \row
    \o \c{parent::}
    \o the parent node of the context node, or empty if there is no parent
    \row
    \o \c{ancestor::}
    \o all ancestors of the context node (parent, grandparent, etc.)
    \row
    \o \c{ancestor-or-self::}
    \o all nodes in \c{ancestor} + \c{self}
    \row
    \o \c{following::}
    \o all nodes in the tree containing the context node, \e not
    including \c{descendant}, \e and that follow the context node
    in the document
    \row
    \o \c{preceding::}
    \o all nodes in the tree containing the context node, \e not
    including \c{ancestor}, \e and that precede the context node in
    the document
    \row
    \o \c{following-sibling::}
    \o all children of the context node's \c{parent} that follow the
    context node in the document
    \row
    \o \c{preceding-sibling::}
    \o all children of the context node's \c{parent} that precede the
    context node in the document
\endtable

\section2 Node Tests

A node test is a conditional expression that must be true for a node
if the node is to be selected by the axis step. The conditional
expression can test just the \e kind of node, or it can test the \e
kind of node and the \e name of the node. The XQuery specification for
\l{http://www.w3.org/TR/xquery/#node-tests} {node tests} also defines
a third condition, the node's \e {Schema Type}, but schema type tests
are not supported in QtXmlPatterns.

QtXmlPatterns supports the following node tests. The tests that have a
\c{name} parameter test the node's name in addition to its \e{kind}
and are often called the \l{Name Tests}.

\table
\header
    \o Node Test
    \o matches all...
    \row
    \o \c{node()}
    \o nodes of any kind
    \row
    \o \c{text()}
    \o text nodes
    \row
    \o \c{comment()}
    \o comment nodes
    \row
    \o \c{element()}
    \o element nodes (same as star: *)
    \row
    \o \c{element(name)}
    \o element nodes named \c{name}
    \row
    \o \c{attribute()}
    \o attribute nodes
    \row
    \o \c{attribute(name)}
    \o attribute nodes named \c{name}
    \row
    \o \c{processing-instruction()}
    \o processing-instructions
    \row
    \o \c{processing-instruction(name)}
    \o processing-instructions named \c{name}
    \row
    \o \c{document-node()}
    \o document nodes (there is only one)
    \row
    \o \c{document-node(element(name))}
    \o document node with document element \c{name}
\endtable

\target Shorthand Form
\section2 Shorthand Form

Writing axis steps using the longhand form with axis specifiers and
node tests is semantically clear but syntactically verbose. The
shorthand form is easy to learn and, once you learn it, just as easy
to read. In the shorthand form, the axis specifier and node test are
implied by the syntax. XQueries are normally written in the shorthand
form. Here is a table of some frequently used shorthand forms:

\table
\header
    \o Shorthand syntax
    \o Short for...
    \o matches all...
    \row
    \o \c{name}
    \o \c{child::element(name)}
    \o child nodes that are \c{name} elements

    \row
    \o \c{*}
    \o \c{child::element()}
    \o child nodes that are elements (\c{node()} matches
    \e all child nodes)

    \row
    \o \c{..}
    \o \c{parent::node()}
    \o parent nodes (there is only one)

    \row
    \o \c{@*}
    \o \c{attribute::attribute()}
    \o attribute nodes

    \row
    \o \c{@name}
    \o \c{attribute::attribute(name)}
    \o \c{name} attributes

    \row
    \o \c{//}
    \o \c{descendant-or-self::node()}
    \o descendant nodes (when used instead of '/')

\endtable

The \l{http://www.w3.org/TR/xquery/}{XQuery language specification}
has a more detailed section on the shorthand form, which it calls the
\l{http://www.w3.org/TR/xquery/#abbrev} {abbreviated syntax}. More
examples of path expressions written in the shorthand form are found
there. There is also a section listing examples of path expressions
written in the \l{http://www.w3.org/TR/xquery/#unabbrev} {longhand
form}.
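Several of these shorthand forms can be tried without an XQuery engine.
The sketch below uses Python's \c{xml.etree.ElementTree}, which
implements only a small XPath subset and is not QtXmlPatterns: paths
must be relative to an element, and \c{@name} is accepted only inside a
predicate, not as a step of its own. The sample XML, including the
\c{id} and \c{unit} attributes, is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook with attributes, for illustration only.
cookbook = ET.fromstring(
    "<cookbook>"
    "<recipe id='r1'><title>Quick Soup</title><time unit='min'>30</time></recipe>"
    "<recipe id='r2'><title>Slow Bread</title></recipe>"
    "</cookbook>"
)

# name  ->  child elements with that name
first_recipe = cookbook.find("recipe")

# *  ->  all child elements
child_tags = [e.tag for e in first_recipe.findall("*")]

# //  ->  descendants at any depth (written .// relative to an element)
all_titles = [e.text for e in cookbook.findall(".//title")]

# ..  ->  the parent of each selected node
title_parents = [e.get("id") for e in cookbook.findall(".//title/..")]

# @name is only available in a predicate here
bread_title = cookbook.find(".//recipe[@id='r2']/title").text

print(child_tags, all_titles, title_parents, bread_title)
```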

\target Name Tests
\section2 Name Tests

The name tests are the \l{Node Tests} that have the \c{name}
parameter. A name test must match the node \e name in addition to the
node \e kind. We have already seen name tests used:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 19

In this path expression, both \c{recipe} and \c{title} are name tests
written in the shorthand form. XQuery resolves these names
(\l{http://www.w3.org/TR/xquery/#id-basics}{QNames}) to their expanded
form using whatever
\l{http://www.w3.org/TR/xquery/#dt-namespace-declaration} {namespace
declarations} it knows about. Resolving a name to its expanded form
means replacing its namespace prefix, if one is present (there aren't
any present in the example), with a namespace URI. The expanded name
then consists of the namespace URI and the local name.

But the names in the example above don't have namespace prefixes,
because we didn't include a namespace declaration in our
\c{cookbook.xml} file. However, we will often use XQuery to query XML
documents that use namespaces. Forgetting to declare the correct
namespace(s) in an XQuery is a common cause of XQuery failures. Let's
add a \e{default} namespace to \c{cookbook.xml} now. Change the
\e{document element} in \c{cookbook.xml} from:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 23

to...

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 24

This is called a \e{default namespace} declaration because it doesn't
include a namespace prefix. By including this default namespace
declaration in the document element, we mean that all unprefixed
\e{element} names in the document, including the document element
itself (\c{cookbook}), are automatically in the default namespace
\c{http://cookbook/namespace}. Note that unprefixed \e{attribute}
names are not affected by the default namespace declaration. They are
always considered to be in \e{no namespace}. Note also that the URL
we choose as our namespace URI need not refer to an actual location,
and doesn't refer to one in this case. Some namespace URIs do resolve,
however: \l{http://www.w3.org/XML/1998/namespace}, the namespace URI
for elements and attributes prefixed with \c{xml:}, is one example.

Now when we try to run the previous XQuery example, no output is
produced! The path expression no longer matches anything in the
cookbook file because our XQuery doesn't yet know about the namespace
declaration we added to the cookbook document. There are two ways we
can declare the namespace in the XQuery. We can give it a \e{namespace
prefix} (e.g. \c{c} for cookbook) and prefix each name test with the
namespace prefix:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 3

Or we can declare the namespace to be the \e{default element
namespace}, and then we can still run the original XQuery:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 4

Both methods will work and produce the same output, all the
\c{<title>} elements:

\snippet snippets/code/doc_src_qtxmlpatterns.qdoc 5

But note how the output is slightly different from the output we saw
before we added the default namespace declaration to the cookbook file.
QtXmlPatterns automatically includes the correct namespace declaration
in each \c{<title>} element in the output. When QtXmlPatterns loads a
document and expands a QName, it creates an instance of QXmlName,
which retains the namespace prefix along with the namespace URI and
the local name. See QXmlName for further details.
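The same namespace pitfall shows up in other XML tools, which makes it
easy to reproduce in isolation. The following sketch uses Python's
\c{xml.etree.ElementTree} as a stand-in for an XQuery engine. It covers
only the prefix approach, and the inline document is a hypothetical
cookbook that carries the \c{http://cookbook/namespace} default
namespace discussed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook with the default namespace declaration added.
COOKBOOK = """
<cookbook xmlns="http://cookbook/namespace">
  <recipe><title>Quick Soup</title></recipe>
  <recipe><title>Slow Bread</title></recipe>
</cookbook>
"""

root = ET.fromstring(COOKBOOK)

# Unqualified names are looked up in no namespace, so nothing matches.
unqualified = root.findall(".//recipe/title")

# Bind a prefix (here "c") to the namespace URI and use it in every name test.
namespaces = {"c": "http://cookbook/namespace"}
qualified = root.findall(".//c:recipe/c:title", namespaces)
titles = [t.text for t in qualified]

# ElementTree stores the expanded name as {namespace-uri}local-name.
expanded_name = qualified[0].tag

print(len(unqualified), titles, expanded_name)
```

The unprefixed path selects nothing, while the prefixed path finds both
titles, and each matched element carries an expanded name made of the
namespace URI and the local name.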

One thing to keep in mind from this namespace discussion, whether you
run XQueries in a Qt program using QtXmlPatterns, or you run them from
the command line using xmlpatterns, is that if you don't get the
output you expect, it might be because the data you are querying uses
namespaces, but you didn't declare those namespaces in your XQuery.