© 2005 James Bailey, François Bry, Tim Furche, and Sebastian Schaffert
The teaching material made available here is exclusively for private use and for use within REASE, the Repository of EASE for Learning Units. No part of it may be distributed in classes, or in publications, reproduced, stored in a retrieval system, or published, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the authors.
The Semantic Web aims at enriching Web data with meta-data and the processing of Web data with reasoning capabilities.
"The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users." [37].
"For the Semantic Web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning." [37].
RDF [217], Topic Maps [155], OWL (formerly known as DAML-OIL) [23, 151] give rise to express relationships such as concept hierarchies or ontology, like that of the Figure 1, and application dependent relations between concepts.
Figure 1: A Concept Hierarchy or Ontology
Expressible in RDF and Topic Maps: "Book A is authored by person B."
RDFS [51] and OWL give rise to express: "No 'person' can be a 'text'."
Enabling requirement for the Semantic Web: Integrated access to both Semantic Web data (i.e. RDF, Topic Maps, and OWL data) and Web data (i.e. (X)HTML and XML data).
Access to (Semantic) Web data: Objective of (Semantic) Web query languages.
(Semantic) Web query languages range:
This presentation introduces into some of query languages proposed for the major representation formalisms of the (Semantic) Web: XML, RDF, and Topic Maps.
Query languages for OWL are not addressed in this presentation, because
The article "Web and Semantic Web Query Languages: A Survey" upon which this presentation is based introduces in more details in approximately 50 (Semantic) Web query languages.
This presentation does not aim to be a comprehensive tutorial for each of the languages it introduces into.
An XML documents consist of a document prologue and a document tree containing elements, character data and attributes, with a distinguished root element.
An element consists in an opening tag and a closing tag that enclose the element content.
An opening tag has the form
<label ...> and contains the
element's label or type and optionally a set of attributes. A
closing tag has the form
</label>, i.e. it contains only the
label.
The content of an element consists
of other elements, character data, or both (mixed content). An
element with label 'label' and without content, i.e. a so-called
empty element may be
abbreviated as <label/>.
XML documents have to be well-formed, i.e.
an interleaving of the opening and closing tags with different labels
(e.g. <b><i>Text</b></i>)
is forbidden.
The order of elements is significant. It induces the so-called document order.
Opening tags may contain key/value pairs of the form
name="value"
called attributes.
References of various kinds, especially ID/IDREF attributes and hypertext links, make possible to refer to an element instead of including it.
An XML document can be seen as a rooted, unranked, and ordered tree called document tree, if one does not consider the various referencing or linking mechanisms of XML. This interpretation is that of the data model of XML, of XPath and XQuery, and of most XML query languages. It is however too simplistic.
RDF data are sets of triples, or statements, (Subject, Property, Object). In RDF, all relations, called properties, are binary.
Nodes, i.e. subjects and objects, either
RDF data can be seen as a directed graph: Nodes are the statements' subjects and objects; arcs are the statements' properties. The URIs make graph traversals possible.
RDF data can be represented in XML or other textual formats in various manners, called serialisations.
Specific properties are predefined in the RDF and RDFS specifications [51, 148, 172, 194], e.g.:
rdf:type for specifying the type,
or class membership, of resources,
rdfs:subClassOf for specifying
class-subclass relationships between subjects/objects,
rdfs:subPropertyOf for specifying
property-subproperty relationships between properties.
rdfs:subClassOf and
rdfs:subPropertyOf are transitive and need not
be orders (or hierarchies). Inheritance in RDFS is peculiar:
rdfs:subClassOf relation,
rdfs:subClassOf relation
can be cyclic,
RDFS has meta-classes, e.g.
rdfs:Class,
the class of all classes, and rdfs:Property,
the class of all properties.
RDF allows one to define instances; RDFS allows one to define RDF Schemas or ontologies.
The main concepts of Topic Maps are topics, associations, and occurrences.
Topics might have types, what expresses memberships in classes, that are topics as well, called topic types. A topic can have one or more names.
Associations are n-ary relations (with n ≥ 2) between topics. Associations might have role and roles types.
Occurences are "information resources relevant to a topic", i.e. instances of a class. An occurrence might have one or several types characterising its "relevance to a topic", i.e. memberships into classes.
The types, i.e. classes, of an occurence are expressed by
Like RDF data, Topic Map data, short topic maps, can be seen as oriented graphs with labeled nodes and edges.
Figure 2: Sample data as a (simplified) RDF graph
Figure 2 gives an ontology with the book categories Writing, Novel, Essay, Historical Novel, and Historical Essay.
Figure 2 also gives two instances, the books The First Man in Rome (a Historical Novel authored by Colleen McCullough) and Bellum Civile (a Historical Essay authored by Julius Caesar and Aulus Hirtius, and translated by J.M. Carter).
Historical Novel is both, a Novel and an Essay.
A book may optionally have a translator, as is the case of Bellum Civile.
The RDF representation of the book recommender is given here in the Turtle serialisation [24].
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.org/foaf/0.1/> .
:Writing a rdfs:Class ;
rdfs:label "Writing" .
:Novel a rdfs:Class ;
rdfs:label "Novel" ;
rdfs:subClassOf :Writing .
:Essay a rdfs:Class ;
rdfs:label "Essay" ;
rdfs:subClassOf :Writing .
:Historical_Essay a rdfs:Class ;
rdfs:label "Historical Essay" ;
rdfs:subClassOf :Essay .
:Historical_Novel a rdfs:Class ;
rdfs:label "Historical Novel" ;
rdfs:subClassOf :Novel ;
rdfs:subClassOf :Essay .
:author a rdfs:Property ;
rdfs:domain :Writing ;
rdfs:range foaf:Person .
:translator a rdfs:Property ;
rdfs:domain :Writing ;
rdfs:range foaf:Person .
_:b1 a :Historical_Novel ;
:title "The First Man in Rome" ;
:year "1990"^^xsd:gYear ;
:author [foaf:name "Colleen McCullough"] .
_:b2 a :Historical_Essay ;
:title "Bellum Civile" ;
:author [foaf:name "Julius Caesar"] ;
:author [foaf:name "Aulus Hirtius"] ;
:translator [foaf:name "J. M. Carter"] .
Books, authors, and translators are represented by blank nodes (without identifiers), or with temporary identifiers (prefix "_:")
The Topic Map representation of the book recommender is in the Linear Topic Maps syntax [125].
/* Association and topic types for subclass-superclass hierarchy */
[superclass-subclass = "Superclass-Subclass Association Type"
@ "http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass" ]
[superclass = "Superclass Role Type"
@ "http://www.topicmaps.org/xtm/1.0/core.xtm#superclass" ]
[subclass = "Subclass Role Type"
@ "http://www.topicmaps.org/xtm/1.0/core.xtm#subclass" ]
/* Topic types */
[Writing = "Writing Topic Type" @ "http://example.org/books#Writing" ]
[Novel = "Novel Topic Type" @ "http://example.org/books#Novel" ]
[Essay = "Essay Topic Type" @ "http://example.org/books#Essay" ]
[Historical_Essay = "Historical Essay Topic Type"
@ "http://example.org/books#Historical_Essay" ]
[Historical_Novel = "Historical Novel Topic Type"
@ "http://example.org/books#Historical_Novel" ]
[year = "Topic Type for a Gregorian year following ISO 8601"
@ "http://www.w3.org/2001/XMLSchema#gYear" ]
[Person = "Person Topic Type" @ "http://xmlns.org/foaf/0.1/Person"]
[Author @ "http://example.org/books#author" ]
[Translator @ "http://example.org/books#translator" ]
/* Associations among the topic types */
superclass-subclass(Writing: superclass, Novel: subclass)
superclass-subclass(Writing: superclass, Essay: subclass)
superclass-subclass(Novel: superclass, Historical_Novel: subclass)
superclass-subclass(Essay: superclass, Historical_Essay: subclass)
superclass-subclass(Essay: superclass, Historical_Novel: subclass)
superclass-subclass(Person: superclass, Author: subclass)
superclass-subclass(Person: superclass, Translator: subclass)
/* Occurrence types */
[title = "Occurrence Type for Titles" @ "http://example.org/books#title" ]
/* Association types */
[author-for-book = "Association Type associating authors to books"]
[translator-for-book =
"Association Type associating translators to books"]
[publication-year-for-book =
"Association Type associating translators to books"]
/* Topics, associations, and occurrences */
[p1: Person = "Colleen McCullough"]
[p2: Person = "Julius Caesar"]
[p3: Person = "Aulus Hirtius"]
[p4: Person = "J. M. Carter"]
[b1: Historical_Essay = "Topic representing the book 'First Man in Rome'"]
author-for-book(b1, p1: author)
publication-year-for-book(b1, y1990)
{b1, title, [[The First Man in Rome]]}
[b2: Historical_Novel = "Topic representing the book 'Bellum Civile'"]
author-for-book(b2, p2: author)
author-for-book(b2, p3: author)
translator-for-book(b2, p4: translator)
{b2, title, [[Bellum Civile]]}
Subclass-superclass associations are identified using the subject identifiers of XTM [232].
For illustration purposes, the title of a book is represented as an occurrence of that book/topic.
<bookdata xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
<book type="Historical_Novel">
<title>The First Man in Rome</title>
<year type="xsd:gYear">1990</year>
<author> <name>Colleen McCullough</name> </author>
</book>
<book type="Historical_Essay">
<title>Bellum Civile</title>
<author> <name>Julius Caesar</name> </author>
<author> <name>Aulus Hirtius</name> </author>
<translator> <name>J. M. Carter</name> </translator>
</book>
<category id="Writing">
<label>Writing</label>
<category id="Novel">
<label>Novel</label>
<category id="Historical_Novel">
<label>Historical Novel</label>
</category>
</category>
<category id="Essay">
<label>Essay</label>
<category id="Historical_Essay">
<label>Historical Essay</label>
</category>
<category idref="Historical_Novel" />
</category>
</category>
</bookdata>
For the sake of brevity, the above representation does not express that authors and translators are persons.
Note the use of ID/IDREF references for expressing the book types (e.g. "Novel", "Historical_Novel").
"Selection queries" illustrates the format of answers and how related information can be selected and delivered:
Query 1: "Select all Essays together with their authors (i.e. author items and corresponding names)."
"Extraction queries" extract substructures from data graphs or tree:
Query 2: "Select all data items with any relation to the book titled 'Bellum Civile'."
"Reduction queries" sprecify what parts of the data not to include in the answer:
Query 3: "Select all data items except ontology information and translators."
It is often desirable to restructure data, possibly into different formats or different serialisations:
Query 4: "Invert the relation author (from a book to an author) into a relation authored (from an author to a book)."
RDF requires restructuring in particular for reification, which in turn is needed in RDF for making statements on statements. For example, the statement
S: "Julius Caesar is author of Bellum Civile"
is reified by the following three statements:
One distinguishes between value
aggregation like SQL's max(·),
sum(·), ... and structural
aggregation like
"how many nodes?". Query 5 uses value
aggregation, Query 6 uses structural aggregation:
Query 5: "Return the last year in which an author with name 'Julius Caesar' published something."
Query 6: "Return each of the subclasses of 'Writing', together with the average number of authors per publication of that subclass."
Related to aggregation are grouping and sorting.
It is often necessary to combine information that is not explicitly connected:
Query 7:
"Combine the information about the book
titled 'The Civil War' and authored by 'Julius Caesar'
with the information about the book with identifier
'bellum_civile'."
Combination queries are related to inference, e.g.:
"If the books entitled 'Bellum Civile' and 'The Civil War' are the same book, and if 'Julius Caesar' is an author of 'Bellum Civile', then 'Julius Caesar' is also an author of 'The Civil War' ".
Query 8:
"Return the transitive closure of the
subClassOf relation."
Not all inference queries are combination queries:
Query 9: "Return the co-author relationship between two persons that stand in author relationships with the same book."
Most query and transformation languages for XML specify the structure of the XML data to retrieve using one of:
Language already standardized, or currently in the process of standardisation by the W3C are navigational.
Many research languages are positional.
This presentation does not consider:
Figure 3: History of the XML Query Languages
The navigational languages for XML are inspired from path-based query languages designed for (relational or object-oriented) databases.
Such database query languages (e.g., GEM [286], an extension of QUEL, and OQL [73]) require fully specified paths, i.e., paths with explicitly named nodes following only parent-child connections.
OQL expresses paths with the extended dot notation introduced with GEM [286]:
SELECT b.translator.name FROM Books b
selects the name, or component, of the translator of books. There must be at most one translator per book for this expression to be legal.
Generalized (or regular) path expressions [83, 119], allow more powerful constructs, e.g., the Kleene closure operator, *.
As a consequence, generalized path expressions do not require explicit naming of all nodes along a path.
Lorel [3] is an early proposal for semistructured data, a data model introduced with the Object Exchange Model (OEM) [127, 230] and a precursor of XML.
Lorel's syntax resembles that of SQL and OQL, extending OQL's extended dot notation to generalized path expressions. Lorel provides a means for expressing:
SELECT b.translator.name FROM Books b
b.author.name
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") against the XML sample data in Lorel (attributes are treated as sub-elements since OEM has no attributes):
select xml(results:(
select xml(result:(
select B, B.author
from bookdata.book B
where B.type = bookdata.(category.id)+
) ) ) )
Lines 1 and 2 are constructors wrapping the selected books and their authors into XML elements.
Assume that
Books having 'Julius Caesar' either as author or translator can be selected in Lorel by adding the following expression to the query after line 5:
B.(author|translator).name? = "Julius Caesar".
XPath [86, 258, 269] has been defined as part of XSL, a stylesheet language for XML (defined in replacement of SGML's stylesheet language DSSSL).
XPath provides expressions inspired from file systems for selecting data in terms of a navigation through an XML document.
XPath's Data Model: An XML document is seen
as an ordered tree with nodes for elements, attributes, character data,
namespaces declaration, comments and processing instructions.
The root of this tree is a special node which
has the node for the XML document element as (single) child.
Nodes for character data, for comments, and for processing instructions
are children of the node of the element in which they occur, or of
the root node if they are outside the document element.
A character node is always maximal, ie., it
is not allowed that two character data nodes are immediate siblings
This model resembles the XML Information Set
recommendation [94] and has become the
foundation for most activities of the W3C related to query languages.
XPath expressions: The core expressions location steps specifying where to navigate from the context node. A location step consists of three parts: an axis, a node-test, and an optional predicate.
Base Axes:
self
child
following-sibling
following
Transitive and
Transitive-Reflexive Closure Axes
of the base axis child:
descendant
descendant-or-self
Reverse Axes of the preceding axes:
parent
preceding-sibling
preceding
ancestor
ancestor-or-self
Additional Axes:
attributes
namespace
Node-tests and predicates restrict the nodes selected by an axis. They either restrict the label of the node (in case of element and attribute nodes), or the type of the node (e.g., restrict to comment children of an element).
Predicates restrict elements to some neighborhood or using functions (e.g., arithmetic or boolean operators).
Successive location steps are separated by "/" .
The XPath expression
child::book/descendant::name
expresses: "For each book child of the context
node, select its name descendants".
Comparison of XPath and Generalized Path Expressions (cf. supra 3.1.2):
It has been shown in [225] that reverse axes do not increase the expressive power of path navigations.
Without closure of arbitrary path expressions, XPath cannot express regular path expressions such as [199, 200]:
a.(b.c)*.d
meaning select d's that are reached via one
a and then arbitrary many repetitions of one
b followed by one c and
a.b*.c
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") against the XML sample data can in XPath only be approximated as follows:
/descendant::book[attribute::type =
/descendant::category[attribute::id = "Essay"]/
descendant-or-self::category/attribute::id]
XPath always returns a single set of nodes and provides no construction. Therefore, it is not possible to return authors and their names together with the book.
The above XPath query can be expressed using XPath's abbreviated syntax:
//book[//category[//category[@::id = "Essay"]/
descendant-or-self::category/@id]
Query 2 ("Select all data items with any relation to the book titled 'Bellum Civile'.") can be expressed in (abbreviated) XPath as:
//book[title="Bellum Civile"]
XPath returns a set of nodes as result of a query, the serialization being left to the host language. Most host languages consider as results the sub-trees rooted at the selected nodes, as desired by this query. The link to the category is not expressed by means of the XML hierarchy and therefore not included in the result.
Query 3 ("Select all data items except ontology information and translators.") can be approximated in XPath (assuming 'ontology information' is identified with 'category elements'):
/bookdata//*[name(.) != "translator" and name(.) != "category"]
This query returns all descendants of the document element
bookdata the labels of which (returned by the XPath function
name) are neither translator nor category.
While this seems a convenient solution for Query 3 (the
set of nodes returned by the expression indeed does not contain
translators and categories), the link between selected book
nodes
and the excluded translators is not removed. In most host
languages translators would be included as part of
their book parent.
Queries 4 and 7-9 cannot be expressed in XPath because they require some form of construction.
Aggregations, needed by Query 5, are provided by XPath. Query 5 ("Return the last year in which an author with name 'Julius Caesar' published something.") can be expressed as follows:
max(//book[author/name="Julius Caesar"]/year)
The aggregation in Query 6 can be expressed analogously. However, Query 6 like Query 1 cannot be expressed in XPath properly due to the lack of construction.
XPathLog [203] is syntactically an extension of XPath.
XPathLog's semantics and evaluation are based on logic programming, more specifically F-Logic and FLORID [188].
XPathLog extends the syntax of XPath as follows:
//book[name ! N] ! B
B to books and N
to the names of the books.
The data model of XPathLog deviates considerably from XPath's data model: XML documents are viewed in XPathLog as edge-labeled graphs with implicit dereferencing of ID/IDREF references.
XPathLog provides means for constructing new data and for updating existing data, as well as more advanced reactive features for processing integrity constraints.
XSLT [85], the Extensible Stylesheet Language, has been conceived for transforming XML documents.
The distinction between querying and transformation has become increasingly blurred with the increase in expressiveness of query and transformation languages.
XSLT uses an XML syntax with embedded XPath expressions.
An XSLT program, called stylesheet is composed of one or more transformation rules called templates.
Templates recursively operate on a single input document.
A template specifies:
XSLT allows also recursive templates but recursion is limited: Except for templates constructing strings only, the result of a template is immutable (a so-called result tree fragment) and, except for literal copies, cannot be input for further templates.
As a consequence, no views can be defined in XSLT.
Thanks to its string processing capabilities, XSLT is Turing complete [169].
XSLT 2.0 [167] aims at overcoming limitations of XSLT (e.g. single input document, restricted recursion, no support for XML Schema, limited support for namespaces, lack of specific grouping constructs)
All sample queries can be expressed in XSLT.
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed in XSLT as follows:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<results>
<xsl:apply-templates select="//book[@type =
//category[@id = 'Essay']/descendant-or-self::category/@id]"/>
</results>
</xsl:template>
<xsl:template match="book">
<result>
<xsl:copy select = "."/>
<xsl:apply-templates select="author|author/name" />
</result>
</xsl:template>
<xsl:template match="author|author/name">
<xsl:copy-of select="." />
</xsl:template>
</xsl:stylesheet>
This stylesheet can be evaluated as follows:
<results> element and within it
try to recursively apply the templates to all nodes matched
by the XPath expression in the select attribute of the
apply-templates statement in line 5.
book elements matched by the second
template which creates a <result> element,
makes a shallow copy of itself and recursively applies the
rules to the bookN's author children and their
name children.
author or name of an
author, copy the complete input to the result.
Query 2 and 5 to 8 are omitted as their implementations in XSLT are similar to that of other queries.
Aside from templates, XSLT also provides
explicit iteration,
selection, and
assignment:
xsl:for-each, xsl:if,
xsl:variable among others.
Using these constructs one can also formulate
Query 1
("Select all Essays together with
their authors, i.e. author items and corresponding
names.")
as follows:
<results xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:for-each select="//book[@type = //category[@id = 'Essay']/
descendant-or-self::category/@id]">
<result>
<xsl:copy select = "."/>
<xsl:for-each select = "author|author/name">
<xsl:copy-of select="." />
</xsl:for-each>
</result>
</xsl:for-each>
</results>
The xsl:for-each expressions iterate over the
elements selected by the XPath expression in their select
attribute. Aside from the expressions for copying input, this
realisation of Query 1 in XSLT closely resembles its
implementation in XQuery given in the following section.
The programming style of the first implementation of Query 1 is often called rule-based, that of the second implementation of Query 1, "fill-in-the-blanks".
Query 3 ("Select all data items except ontology information and translators.") can be expressed in XSLT as follows:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="translator | category" />
</xsl:stylesheet>
The first template specifies that for all attributes and nodes, the node itself is copied and their (attribute and node) children are processed recursively.
The second template specifies that for translators and category elements, nothing is generated (and their children are not processed).
The first template also matches translator and category elements. For such a case where multiple templates match, XSLT uses detailed conflict resolution policies [85]: The second template is chosen as it is more specific than the first one.
Query 4 ("Invert the relation author (from a book to an author) into a relation authored (from an author to a book).") can be expressed in XSLT as follows:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<bookdata>
<xsl:apply-template
select="//author[not(name = preceding::author/name)]" />
</bookdata>
</xsl:template>
<xsl:template match="author">
<person>
<name><xsl:value-of select="name" /></name>
<authored>
<xsl:apply-templates
select="//book[author/name=current()/name]" />
</authored>
</person>
</xsl:template>
<xsl:template match="book">
<book>
<xsl:copy-of select="@*" />
<xsl:copy-of select="*[name() != 'author']" />
</book>
</xsl:template>
</xsl:stylesheet>
The preceding axis from XPath is used to avoid
duplicates in the result. The function current()
(in the second template) returns the current node, here,
the author element last matched by the second
template.
(This function is essentially syntactic sugar to limit the use
of variables, cf. implementation of Query 9).
Query 9 ("Return the co-author relationship between two persons that stand in author relationships with the same book.") can be expressed in XSLT as follows:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<results>
<xsl:for-each select="//author">
<xsl:variable name="author" select="." />
<xsl:for-each select="$author/following-sibling::author">
<co-authors>
<name> <xsl:value-of select="$author/name" /> </name>
<name> <xsl:value-of select="current()/name" /> </name>
</co-authors>
</xsl:for-each>
</xsl:for-each>
</results>
</xsl:template>
</xsl:stylesheet>
This implementation of Query 9 in XSLT is very similar of the implementation of Query 9 in XQuery given below.
The implementation of Query 9 given above uses xsl:for-each,
as the inner and the outer author are processed differently.
A solution without xsl:for-each is possible but requires
parameterized templates and named or grouped templates:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<results>
<xsl:apply-template select="//author" />
</results>
</xsl:template>
<xsl:template match="author">
<xsl:apply-template select="following-sibling::author"
mode="co-author">
<xsl:with-param name="first-co-author" select="." />
</xsl:apply-templates>
</xsl:template>
<xsl:template match="author" mode="co-author">
<xsl:param name="first-co-author" />
<co-authors>
<name> <xsl:value-of select="$first-co-author/name" /> </name>
<name> <xsl:value-of select="name" /> </name>
</co-authors>
</xsl:template>
</xsl:stylesheet>
For simplicity, none of the two implementations of Query 9 given above eliminate duplicates (occurring if if two persons are co-authors of several books).
An activity towards specifying an XML query language, now XQuery, has been launched by W3C shortly before the publication of the final XPath 1.0 and XSLT 1.0 recommendations.
Requirements and use cases have been given in [76, 77, 192]. Proposals, e.g., XQL and Quilt, have been made that have influenced XQuery [42]:
XQuery's developement is not completed. It represents the state-of-the-art of navigational XML query languages.
XQuery Principles: XQuery is an extension of XPath 2.0 adding:
for clauses, an optional
where clause, an optional
order by, and a return
clause.
in part:
for $book in //book
books selected by the path expression
//book.
where clause
specifies conditions on the selected
data items.
order by clause allows
the items to be
processed in a certain order.
return clause specifies
the result of the entire FLWOR expression (often using
constructors as shown above).
for clauses,
let clauses that also
bind variables but without iterating over the individual data
items in the sequence bound to the variable.
unordered keyword indicating that the
order of elements in
sequences that are constructed or returned as result of
XQuery expressions is not relevant. E.g.,
unordered{for $book in //book return $book/name}
//book may be processed in any order in the
for
clause and the order of the resulting name nodes also can be
arbitrary (i.e. implementation dependent). Within
unordered query parts, the result of expressions querying
the order of elements in sequences such as fn:position or
fn:last is non-deterministic.
validate expression.
Not all of XPath's axes are mandatory in XQuery implementations:
ancestor, ancestor-or-self, following,
following-sibling, preceding, and
preceding-sibling do not have to be supported.
This is no restriction to XQuery's expressiveness, as expressions using
reverse axes can be rewritten [225], and the
"horizontal axes", e.g., following and
following-sibling, can be replaced by FLWOR expressions using the
<< and >> operators that compare two nodes with
respect to their position in a sequence.
Sample Queries: All nine sample queries can be expressed in XQuery. Query 2 is omitted because it can be expressed as a simplification of of Query 1. Query 5 can be expressed in XPath. Query 8 and 9 are similar. Sincve Query 9 in XQuery exhibits an interesting anomaly, it is given below and Query 8 is not given.
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed in XQuery as follows (interpreting the phrase an essay as a book with type attribute equal to the id of the category Essay or one of its sub-categories represented as descendants in the XML structure):
<results> {
let $doc := doc("http://example.org/books")/bookdata
let $sub-of-essay :=
$doc//category[@id="Essay"]/descendant-or-self::category
for $book in $doc//book
where $book/@type = $sub-of-essay/@id
return
<result>
{ $book }
{ $book/author }
{ $book/author/name }
</result> }
</results>
Note the use of the let clause in line 2: the sequence of all
sub-categories of the category with id Essay
including that category itself (we use the reflexive transitive axis
descendant-or-self) is bound to the variable. However,
in contrast to a for expression, this
sequence is not iterated over. Instead of the
where clause in line 4 a predicate could be
added to the path expression in line 3 resulting in the expression
$doc//book[@type = $sub-of-essay/@id].
Query 3 ("Select
all data items except ontology information and
translators.") requires
structural recursion while constructing
new elements that are identical to the
ones encountered, except omitting translator and category
nodes. The following implementation shows the use of a
user-defined, recursive function that copies the tree rooted at
its first parameter $e, except all nodes in the sequence given
as second parameter.
declare function
local:tree-except($e as element(),
$exceptions as node()*) as element()*
{
element {fn:node-name($e)} {
$e/@* except $exceptions, (: copy the attributes :)
for $child in $element/node() except $exceptions
return
if $child instance of element()
(: for elements process them recursively :)
local:tree-except($section)
else (: others (text, comments, etc. copy :)
$child
}
};
document {
let $doc := doc("http://example.org/books")/bookdata
let $exceptions := $doc//translator union $doc//category
local:tree-except($doc, $exceptions)
}
Note the typing of the parameters: the first parameter is a single element, the second, a sequence of nodes and the function returns a sequence of elements. In the main part of the query, the document constructor is used to indicate that its content is to be the document element of the constructed tree.
Query 4 ("Invert the relation author (from a book to an author) into a relation authored (from an author to a book).")
<bookdata> {
let $a := doc("http://example.org/books")//author
for $name in distinct-values($a/name)
return
<person>
<name> { $name } </name>
<authored
{
for $b in doc("http://example.org/books")//book
where some $ba in $b/author
satisfies $ba/name = $name
return
<book> { $b/@*, $b/* except $b/author } </book>
}
</authored
</person>
}
</bookdata>
This implementation is in fact similar to the implementation of use case XMP-Q4 in [76]. It exhibits two noteworthy functionalities:
distinct-value in line 3 to avoid duplication
in the result, if an author occurs several times in
the document.
books where some (i.e. at least one) of the
authors have the same name as the
currently considered author.
Using aggregation expressions (see lines 8 and 10), Query 6 ("Return each of the subclasses of 'Writing', together with the average number of authors per publication of that subclass.") can be expressed in XQuery as follows:
<results> {
let $doc := doc("http://example.org/books")/bookdata
for $category in $doc//category[@id="Essay"]//category
return
<category>
{ $category/@id }
<average-number-of-authors>{
fn:avg(for $book in $doc//book
where @type = $category/@id
return fn:count($book/author))
}
</average-number-of-authors>
</category>
}
</results>
Combining data can be expressed in a very compact manner in
XQuery, as the following expression of
Query 7
("Combine the information about the
book titled 'The Civil War' and authored by 'Julius Caesar'
with the information about the book with identifier
'bellum_civile'.")
shows:
<book>
{ for $book in doc("http://example.org/books")//book
where title="Bellum Civile" and author/name="Julius Caesar"
return ($book/@*, $book/*)
}
{
for $book in doc("http://example.org/books")//book
where @id="bellum_civile"
return ($book/@*, $book/*)
}
</book>
Query 9 ("Return the co-author relationship between two persons that stand in author relationships with the same book.") can be expressed in XQuery as follows:
<results>
{ let $doc := doc("http://example.org/books")
for $book in doc("http://example.org/books")//book
for $author in $book/author
for $co-author in $book/author
where $author << $co-author
return
<co-authors>
<name> { $author/name } </name>
<name> { $co-author/name } </name>
</co-authors>
}
</results>
This implementation does not treat the case where two authors
co-authored multiple books. In this case, duplicates are created
by the above solution. To avoid this the following refinement
uses the before operator << in combination with a negated
condition, for specifying that only such pairs of authors should
be considered, where there is no book that occurs prior to the
currently considered one and which is also co-authored by the
current pair of authors:
<results>
{ let $doc := doc("http://example.org/books")
for $book in doc("http://example.org/books")//book
for $author in $book/author
for $co-author in $book/author
where $author << $co-author and not(
some $pb in doc("http://example.org/books")//book
satisfies ($pb << $book and
$pb//author/name = $author/name and
$pb//author/name = $co-author/name))
return
<co-authors>
<name> { $author/name } </name>
<name> { $co-author/name } </name>
</co-authors>
}
</results>
It is intuitively clear that XQuery is Turing complete since it provides recursive functions and conditional expressions. A proof of the Turing-completeness of XQuery is given in [169].
The positional approach consists in using patterns to specify the position of data within larger structures.
Positional queries mimic the data to be queried. Languages using this query-by-example style for queries mostly fall into two categories:
UnQL [64, 66], the Unstructured Query Language, has been originally developed for querying semistructured data and nested relational databases with cyclic structures. UnQL has later been adapted to querying XML, but the origins are still apparent. For example, UnQL has a non-XML syntax very similar to OEM's syntax and does not support querying or construction of ordered data.
The evaluation model and core language of UnQL is based upon structural recursion over labeled trees.
The following expression uses functional style pattern matching for selecting all books in a tree.
fun f1(T1 ∪ T2) = f1(T1) ∪ f1(T2)
| f1({ L => T }) = if L = book
then {result => book => T}
else f1(T)
| f1({}) = {}
| f1(V) = {}
UnQL's surface syntax uses
query patterns
and construction patterns: a query
consists of a single select ... where ...
or traverse rule that separate
construction from querying.
Queries may be nested, in which case the separation of querying and construction is abandoned.
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed in UnQL as:
select { results => {
select { result => { Book,
select { author => {
author => Author,
authorName => Name
} }
where { author => \Author } ← Book,
{ name => \Name } ← Author
where { book => \Book } ← Bib
where bookdata => Bib ← DB
A ← scopes a query pattern, i.e. it specifies that the left-hand query pattern is to be found in bindings for the right-hand variable.
The => operator is the direct edge traversal operator. E.g.
book => author
specifies that author is a direct child of
book.
Recursive traversals can be specified using regular path expressions including regular expressions over labels. E.g.
_*
traverses over arbitrary many elements with any label,
[^book]**
over arbitrary many elements with any label except book.
UnQL also provides traverse clauses for reduction and restructuring queries like Query 3 ("Select all data items except ontology information and translators."):
traverse DB given X
case translator => _ then X := {}
case category => _ then X := {}
case \L => _ then X := {l => X}
This query is evaluated by traversing the tree in the database and matching recursively each element against the three case expressions. All elements except translators and categories are copied to the newly constructed tree, structured as in the input data.
UnQL s evaluation is founded in graph simulation [66].
[64] shows that all queries expressible in UnQL can be evaluated in PTIME.
XML-QL [98, 99] is a pattern-based and rule-based query language for XML developed specifically to address the W3C's call for an XML query language (that resulted in the development of XQuery).
Like UnQL, XML-QL uses query patterns called
element patterns in [98]
in a WHERE clause.
Such patterns can be augmented by variables for selecting data.
The result of a query is specified as a
construction pattern in the
CONSTRUCT clause.
An XML-QL query always consists of a single
WHERE-CONSTRUCT rule which may be divided
into several nested subqueries.
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed in XML-QL as follows:
WHERE
<bookdata>
<book>
</> ELEMENT_AS $b
</>
CONSTRUCT
<results>
<result>
$b
WHERE <author>
<name> $n </>
</> ELEMENT_AS $a
CONSTRUCT $a
$n
</>
</>
Variables are preceded in XML-QL by
$. Note how the grouping
of authors with their books is expressed
using a
nested query.
Also note the tag minimization (end tags
abbreviated by </> as in SGML).
In line 4, the variable $b is restricted to data
matching the pattern in
lines 3 and 4. Such pattern restrictions are indicated in XML-QL
using the ELEMENT AS keyword.
A characteristics of XML-QL is that it uses query patterns containing several variables that may select several data items at a time instead of path selections that may only select one data item at a time.
XML-QL's variables are similar to the variables of logic programming, i.e. "joins" can be expressed by variable name equality.
Since XML-QL does not allow one to use more than one rule, one often has to employ nested subqueries to express complex queries.
Query 6 ("Return each of the subclasses of 'Writing', together with the average number of authors per publication of that subclass.") cannot be expressed in XML-QL due to lack of aggregation, in particular structural aggregation (e.g., counting the number of children of an element).
The following query returns all books classified in a sub-category of Novel:
WHERE
<book type=$Sub>
</> ELEMENT_AS $b,
<category id='Novel'>
<category* id=$Sub>
</>
</>
CONSTRUCT $b
Joins are expressed by repeated occurrences of the same variable (lines 2 and 5). In line 5 a further feature of XML-QL is shown: regular path expressions for element labels in patterns.
Transformation queries such as Query 2, except for rather local changes (e.g., omission of elements or changing labels), cannot be expressed in XML-QL.
XML-QL has no means for testing the non-existence of elements and therefore cannot express queries such as "Return all books that have no translator.".
No results on the complexity or expressiveness of XML-QL have been published.
XMAS [189], the XML Matching And Structuring language, is an XML query language developed as part of MIX [18] and builds upon XML-QL.
Like XML-QL, XMAS uses query patterns,
construction patterns, and
rules of
the form CONSTRUCT ... WHERE ....
XMAS extends XML-QL in that it provides a powerful grouping construct, instead of relying on subqueries for grouping data items within an element.
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed in XMAS as follows:
WHERE
<bookdata>
$B: <book>
$A: <author>
<name> $N </name>
</>
</>
</>
CONSTRUCT
<results>
<result>
$B
<book-author>
$A
<name> $N </name>
</> {$A,$N}
</> {$B}
</>
Here, one can observe the two main syntactic differences to XML-QL:
result element is
created for every instance of $B (indicated by
{$B} after the closing tag of the element
result). Within every such result
element, all authors of a
book (indicated by {$A}) are collected nested in
book-author elements (the book-author
element is necessary because grouped variables are allowed only
after closing tags or single variables in XMAS).
$B in front of the subpattern instead of XML-QL's
ELEMENT_AS $B at the end).
Grouping queries can be specified more concisely by using implicit collection labels: instead of specifying the grouping variables explicitly, all variables nested inside square brackets are considered grouping variables for that grouping, unless there is another grouping (i.e., block enclosed by square brackets) closer to the variable occurrence. Using implicit collection labels, Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed as:
WHERE
<bookdata>
$B: <book>
$A: <author>
<name> $N </name>
</>
</>
</>
CONSTRUCT
<results>
[<result>
$B
[<book-author>
$A
<name> $N </name>
</book-author>]
</>]
</>
No results on complexity or expressiveness of XMAS have been published.
BBQ [215] is a visual interface for XMAS that allows browsing of XML data as well as authoring of XMAS queries based on a DTD of the data to be queried.
Xcerpt http://xcerpt.org [28, 60, 61, 247, 248] is a query language designed after principles given in [57] for querying both data on the "standard Web" (e.g., XML and HTML data) and data on the Semantic Web (e.g., RDF, Topic Maps, etc. data). This Section addresses using Xcerpt on the standard Web, Section 4.5.?, on the Semantic Web.
Xcerpt's Principles:
All sample queries can be expressed in Xcerpt. In the following, Query 2, 5, 7, and 8 are omitted as they are similar to other solutions shown.
Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed in Xcerpt as follows:
GOAL
results [
all result [
var Book,
all var Author,
all var AuthorName
]
]
FROM
bookdata {{
var Book → book {{
var Author → author {{
name [ var AuthorName ] }}
}}
}}
END
In the query part (enclosed by FROM and
END), a pattern of the requested data
is specified: a bookdata element with a book
child (associated with the variable Book using the
pattern restriction operator →)
that in turn has an author child (bound to the variable
Author) with a name child whose content
is bound to the Variable AuthorName.
Double curly braces (line 10) indicate an incomplete, unordered query pattern. A matching
bookdata element may have additional children not specified in
the query and the order among the children is irrelevant for the
query. Incomplete query patterns might result in
several alternative variable bindings.
Square brackets (e.g. line 13) and in the construct part
(between GOAL and
FROM) specify that the order of the children
matters. Single brackets specify that the pattern is
complete.
Similar to XMAS, Xcerpt allows to group answers
using the constructs all and some:
all t collects all possible different instances of the
subexpression t that might result from alternative variable
bindings. Grouping may be nested.
The construct term creates a result subterm for each
alternative binding of Book, and within each such
result subterm, it groups all authors and
authornames associated with that particular book.
An Xcerpt program may contain several rules, as shown in the following solution for Query 3 ("Select all data items except ontology information and translators."):
GOAL
var Result
FROM
transform [ bookdata {{ }}, result [ var Result ] ]
END
CONSTRUCT
transform [ var Element, result [ ] ]
FROM
desc var Element → /translator|category/
END
CONSTRUCT
transform [ var Element, result [ var Label [ all var Child ] ] ]
FROM
and {
desc var Element → var Label [[ var Child ]]
where {
and { var Label != "translator", var Label != "category }
},
transform [ var Child, result [ var ChildTransformed ] ]
}
END
Xcerpt rules come in two flavors:
GOAL ... FROM ... END and
CONSTRUCT ... FROM ... END.
The first may only occur once in a program,
specifies the ultimate result of the entire program. The latter form is
used for all other rules.
Here, the two lower rules transform (recursively) an input element as
specified in the query: if it is a translator or a
category the result of the transformation is empty,
otherwise the children of the element are recursively transformed and
the result of these transformations is used to reconstruct the structure
of the input data.
The desc operator (lines 10 and 17) indicates a
pattern
that is incomplete in depth.
The where clause (line 18) restricts matches
to elements that are neither translators nor
categorys.
In line 17, a label variable is used: whereas
the variable Element is bound to the entire element matched
by the pattern, Label is bound to the label of the element,
i.e. a string such as "book".
Query 4 ("Invert the relation author (from a book to an author) into a relation authored (from an author to a book).") can be expressed in Xcerpt as follows:
GOAL
bookdata [
all person [
name [ var Name ],
authored [
all book [
all var NonAuthorChildren
] group by { var Book }
]
]
]
FROM
bookdata {{
desc var Book → book [[
author {{ name [ var Name ] }},
var NonAuthorChildren → !/author/ {{ }}
]]
}}
END
All books (at any depth) are selected together
with the names of their authors and
non-author children (a negated regular expression on the label
is used for the non-author children).
For each name of an author, a
person element is constructed (note the position of
the all in line 3) containing the name
and an authored element. In the author
element all books for that author are
nested again using all with a group by
clause naming the grouping variable.
Query 6 ("Return each of the subclasses of 'Writing', together with the average number of authors per publication of that subclass.") can be expressed in Xcerpt as follows:
GOAL
results [
all category [
attributes [ id [ var ID ] ],
average-number-of-authors [
div( count( all var Author ), count( all var Book ) )
]
]
]
FROM
bookdata {{
desc category {{ attributes {{ id [ var ID ] }} }},
desc var Book → book {{
attributes {{ type [ var ID ] }},
desc var Author → author {{ }}
}}
}}
END
The average number of authors is computed in line 6
using the structural aggregation function
count over all books and authors
for a category. The join
between the id
attribute of categorys and the type
attribute of books is expressed by repeating
the same variable.
Query 9 ("Return the co-author relationship between two persons that stand in author relationships with the same book.") can be expressed in Xcerpt as follows:
GOAL
results [
all co-authors [
name [ var Author ],
name [ var CoAuthor ]
]
]
FROM
bookdata {{
desc book {{
author {{ name {{ var Author }} }},
author {{ name {{ var CoAuthor }} }}
}}
END
This query benefits from two features of Xcerpt:
book element in line 10 are
different without requiring to explicit state that the
author and the co-author must be
different.
A visual language, called visXcerpt [29, 30], has been conceived as a visual rendering of textual Xcerpt programs.
Static type checking methods have been developed for Xcerpt [55, 283].
A declarative semantics for Xcerpt has been proposed in [63, 247].
A procedural semantics for Xcerpt has been proposed in [63] in the form of a a proof procedure. An implementation of this semantic in Haskell has been realized using Constraint Programming techniques [247].
The XQuery use case [76] has been worked out in Xcerpt (cf. [174] (in German) and [45]).
Based on Xcerpt and extending it, a reactive language called XChange [58, 59] for updates and events on the Web is currently being developed.
Figure 3: History of the RDF Query Languages
The SPARQL family consists of the four query languages SquishQL, RDQL, SPARQL, and TriQL. Common to all four languages in this family is that they "regard RDF as triple data without schema or ontology information unless explicitly included in the RDF source".
SquishQL [212, 213] aims at ease-of-use and similarity to SQL.
SquishQL relies on a query model for RDF influenced by [141].
SquishQL offers so-called triple patterns and conjunctions between triple patterns for specifying parts of RDF graphs to retrieved. "This results in quite a weak pattern language but it does ensure that in a result all variables are bound." [213].
SquishQL queries have the following form:
SELECT ?essay, ?author, ?authorName
FROM http://example.org/books
WHERE (?essay, <rdf:type>, <books:Essay>),
(?essay, <books:author>, ?author),
(?author, <books:name>, ?authorName)
USING books FOR http://example.org/books#,
rdf FOR http://www.w3.org/1999/02/22-rdf-syntax-ns#
In SquishQL, Query 1 ("Select all Essays together with their authors, i.e. author items and corresponding names.") can be expressed as follows:
SELECT ?property, ?propertyValue
FROM http://example.org/books
WHERE (?essay, <books:book-title>, "Bellum Civile")
(?essay, ?property, ?propertyValue),
USING books FOR http://example.org/books#
In SquishQL, Query 2 ("Select all data items with any relation to the book titled 'Bellum Civile'.") can (almost) be expressed as follows:
SELECT ?property, ?propertyValue
FROM http://example.org/books
WHERE (?essay, <books:book-title>, "Bellum Civile")
(?essay, ?property, ?propertyValue),
USING books FOR http://example.org/books#
A property value can be a node with other properties that an answer to Query 2 should return. Since SquishQL has no means to express recursion, such indirect properties cannot be returned by the above query if the schema of the data is unknown or recursive.
Other sample queries cannot be expressed in SquishQL.
In a SquishQL query, the AND clause
serves to express constraints on variable values. E.g. the following
query returns the URIs of persons that have authored a
book with title 'Bellum Civile':
SELECT ?person
FROM http://example.org/books
WHERE (?book, <books:author>, ?person)
(?book, <books:title>, ?title)
AND ?title = 'Bellum Civile'
An answer to an SquishQL query is a set of bindings for the variables occurring in the query.
SquishQL does not support RDFS.
RDQL, a "RDF Data Query Language", is an evolution of the SquishQL versions SquishQL [213], and Inkling [212] influenced by rdfDB [140].
RDQL has been recently submitted to the W3C for standardisation [213, 251, 253].
RDQL queries have the same form as SquishQL queries. As with SquishQL, an answer to an RDQL query is a set of bindings for the variables occurring in the query. Like SquishQL, RDQL supports only selection and extraction queries.
RDQL's authors see inferencing as a possible feature of an "RDF implementation", not of the query language RDQL: "if a graph implementation provides inferencing to appear as virtual triples (i.e. triples that appear in the graph but are not in the ground facts), then an RDQL query will include those triples as possible matches in triple patterns." [251]. As a consequence, queries referring to RDF/S relations such as type, set or class are cumbersome and/or complex.
The RDQLPlus (http://rdqlplus.sourceforge.net/) implementation of RDQL provides a language extension, called RIDIQL [284]. RIDIQL supports updates and a transparent use of the inference abilities of the Jena Toolkit [136].
SquishQL and RDQL queries cannot be composed.
Negation can be used in filters, or AND clauses
but not in WHERE clauses.
Disjunctions and optional matching cannot be expressed.
Although a variable in SquishQL and RDQL queries can be bound to blank nodes, there is no way to specify blank nodes in SquishQL's and RDQL's triple patterns. As a consequence, a query returning the blank nodes of a graph cannot be expressed in SquishQL and RDQL.
SquishQL and RDQL have no recursion and no iteration.
Only selection and extraction queries can be expressed in SquishQL and RDQL i.e. of the sample queries only Query 1 and (an approximation of) Query 2.
Like SquishQL, RDQL does not support RDFS concepts, although its
implementation in the Jena Toolkit [136]
supports the transitive closures of the
RDFS relations rdfs:subClassOf and
rdfs:subPropertyOf.
No formal semantics has been defined for SquishQL or RDQL.
The complexity of SquishQL and RDQL has not been investigated so far.
SPARQL [239], a "Query Language for RDF" formerly called SBrQL [238], has been developed by members of the W3C RDF Data Access Working Group. SPARQL is an extension of RDQL designed according to requirements and use cases [87] and is still under development.
SPARQL extends RDQL with facilities to:
CONSTRUCT clauses,
one new RDF graph with data from the RDF graph queried.
Like RDQL queries, the new graph can be specified with
triple patterns
or graph patterns.
DESCRIBE clauses,
"descriptions" of the resources
matching the query part. The exact meaning of "description" is not
yet defined.
OPTIONAL triple or graph query
patterns.
SPARQL queries have the following form:
PREFIX |
Specification of a name for a URI (like RDQL's USING) |
SELECT |
Returns all or some of the variables bound in the WHERE clause. |
CONSTRUCT |
Returns a RDF graph with all or some of the variable bindings. |
DESCRIBE |
Returns a description of the resources found. |
ASK |
Returns whether a query pattern matches or not |
WHERE |
list, i.e. conjunction, of query (triple or graph) patterns |
OPTIONAL |
list, i.e. conjunction, of optional (triple or graph) patterns |
AND |
boolean expression (the filter to be applied to the result) |
An extension of Query 1
("Select all Essays together with
their authors, i.e. author items and corresponding
names.") returning the translators
of a book, if there are some, can be expressed
in SPARQL as follows:
PREFIX books: http://example.org/books#
PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
SELECT ?essay, ?author, ?authorName, ?translator
FROM http://example.org/books
WHERE (?essay books:author ?author),
(?author books:authorName ?authorName)
OPTIONAL (?essay books:translator ?translator)
Using the CONSTRUCT clause,
restructuring and
non-recursive inference queries can
be expressed in SPARQL.
Query 4 ("Invert the relation author (from a book to an author) into a relation authored (from an author to a book).") can be expressed in SPARQL as follows:
PREFIX books: http://example.org/books# CONSTRUCT (?y books:authored ?x) FROM http://example.org/books WHERE (?x books:author ?y)
Query 9 ("Return the co-author relationship between two persons that stand in author relationships with the same book.") can be expressed in SPARQL as follows:
PREFIX books: http://example.org/books#
CONSTRUCT (?x books:co-author ?y)
FROM http://example.org/books
WHERE (?book books:author ?x)
(?book books:author ?y)
AND (?x neq ?y)
TriQL extends RDQL by constructs supporting querying of named graphs [72], as introduced in TriG [40]. Named graphs allow one to filter RDF statements after their sources or authors, like in the following query: "Return the books with rating above a threshold of 5, using only information asserted by Marcus Tullius Cicero." This can be expressed in TriQL as follows:
SELECT ?books
WHERE ?graph ( ?books books:rating ?rating )
(?graph swp:assertedBy ?warrant)
(?warrant swp:authority <http://people.net/cicero>)
USING books FOR http://example.org/books#,
swp FOR <http://www.w3.org/2004/03/trix/swp-1/>
The RQL family consists of RQL, SeRQL, and eRQL. Common to these languages is that they support combining data and schema querying. Furthermore, their RDF data model deviates from the standard RDF(S) data model:
This ensure a clear separation of the 3 abstraction layers of RDF and RDFS:
rdfs:Class and rdfs:Property.
RQL, the "RDF Query Language" [84, 158, 161] is the basis for SeRQL and eRQL.
Basic schema queries. A salient feature of RQL is the use of the types from RDFS schemas. The query
subClassOf(books:Writing)
returns the sub-classes of the class books:Writing
(assuming:
USING NAMESPACE books = &http://example.org/books-rdfs#).
A similar query, using subPropertyOf instead of
subClassOf,
returns the the sub-properties of a property .
The following query returns the domain ($C1) and
range ($C2) of the property author
defined at the URI named book (The prefix
$ indicates a class
variable, i.e., a variable ranging on schema classes). It
can be expressed in RQL in three different manners:
SELECT $C1, $C2 FROM {$C1}books:author{$C2}
USING NAMESPACE books = &http://example.org/books#
SELECT C1, C2 FROM Class{C1}, Class{C2}, {;C1}books:author{;C2}
USING NAMESPACE books = &http://example.org/books#
SELECT C1, C2 FROM subClassOf(domain(book:author)){C1},
subClassOf(range(books:author)){C2}
USING NAMESPACE books = &http://example.org/books#
The query
topclass(books:Historical Essay)
returns the top of the subclass-class relation, i.e.
books:Writing.
The query
nca(books:Historical Essay, books:Historical Novel)
returns the nearest common ancestor of the classes of