Copyright © 2017 the Contributors to the ShapeMap Structure and Language Specification, published by the Shape Expressions Community Group under the W3C Community Contributor License Agreement (CLA). A human-readable summary is available.
The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A node constraint describes an RDF node (IRI, blank node or literal) and a shape describes the triples involving nodes in an RDF graph. The ShapeMap language associates RDF nodes with ShEx shapes. These associations can be used to state candidate shape maps as an input to the validation process. They can be the output of a validation process, where the ShEx engine reports the conformance of RDF nodes with respect to ShEx shapes.
This document defines the ShapeMap language. See the Shape Expressions Primer for an introduction to ShEx validation and the Shape Expressions Language for a formal definition of ShEx.
This specification was published by the Shape Expressions Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.
This document has been developed by the Shape Expressions Community Group.
This version is an initial editor's proposal to the CG.
If you wish to make comments regarding this document, please send them to public-shex@w3.org (subscribe, archives).
This document assumes an understanding of the ShEx notation and terminology.
ShapeMap uses the following terms from RDF semantics [rdf11-mt]:
Conformance criteria are relevant to authors and authoring tool implementers. As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
ShapeMap: a set of shape associations. Each shape association has at least two members, a nodeSelector and a shapeLabel, and, when used for the result of validation, may also have any of status, reason, or appInfo. A nodeSelector is an RDF node or a triple pattern. A shapeLabel is a ShEx shape expression label or the string "START" for the start shape expression.
In this document, these members can be addressed with a '.' operator. For instance, a shape association A would have an A.nodeSelector member.
If the status member is absent, the status is assumed to be "conformant". The reason and appInfo members may also be absent but have no default value.
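The members of a shape association can be pictured as a small record. The following is a minimal sketch, not part of the ShapeMap definition: the class name and Python field names are illustrative, and plain strings stand in for RDF nodes and shape labels.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ShapeAssociation:
    """Illustrative record for one shape association (names are assumptions)."""
    node_selector: Any            # an RDF node or a triple pattern
    shape_label: str              # a shape expression label, or "START"
    status: str = "conformant"    # an absent status is read as "conformant"
    reason: Optional[str] = None  # may be absent; no default value
    app_info: Any = None          # may be absent; no default value


# Only the two required members are given; status falls back to its default.
a = ShapeAssociation("http://data.example/node1", "START")
```

Here `a.node_selector` plays the role of A.nodeSelector in the '.' notation used in this document.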
A triple pattern has a subject pattern, predicate IRI and object pattern.
A focus selector identifies the slot (subject or object) to be validated. A wildcard indicates that the slot may hold any value. A triple pattern has exactly one focus selector. A triple pattern maps to a SPARQL triple pattern with the following restrictions:
- A term drawn from V (the set of variables) is either a fresh variable or a known token to identify the focus node.
- The predicate is an IRI (I in the SPARQL definitions).
A query ShapeMap is a ShapeMap in which each shape association has only the members nodeSelector and shapeLabel.
A fixed ShapeMap is a query ShapeMap in which each nodeSelector is an RDF node. The ShEx validation process takes as input a fixed ShapeMap.
A result ShapeMap is a fixed ShapeMap with the addition of optional members status, reason and appInfo.
No two shape associations in a ShapeMap may have the same combination of nodeSelector and shapeLabel.
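The distinctions between query and fixed ShapeMaps, and the uniqueness rule above, can be checked mechanically. This is a sketch under stated assumptions: the class and function names are hypothetical, RDF nodes are modelled as plain strings, and `None` models an absent member.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class TriplePattern:
    subject: str
    predicate: str
    object: str


@dataclass(frozen=True)
class ShapeAssociation:
    node_selector: Any             # str (an RDF node) or TriplePattern
    shape_label: str
    status: Optional[str] = None   # None models an absent member
    reason: Optional[str] = None
    app_info: Any = None


def is_query_shape_map(shape_map: List[ShapeAssociation]) -> bool:
    """Every association carries only nodeSelector and shapeLabel."""
    return all(a.status is None and a.reason is None and a.app_info is None
               for a in shape_map)


def is_fixed_shape_map(shape_map: List[ShapeAssociation]) -> bool:
    """A query ShapeMap in which every nodeSelector is an RDF node."""
    return is_query_shape_map(shape_map) and all(
        not isinstance(a.node_selector, TriplePattern) for a in shape_map)


def has_unique_pairs(shape_map: List[ShapeAssociation]) -> bool:
    """No two associations share the same (nodeSelector, shapeLabel) pair."""
    pairs = [(a.node_selector, a.shape_label) for a in shape_map]
    return len(pairs) == len(set(pairs))


query = [
    ShapeAssociation("ex:node1", "ex:Shape1"),
    ShapeAssociation(TriplePattern("FOCUS", "ex:p", "_"), "ex:Shape1"),
]
```

The list `query` is a query ShapeMap but not a fixed one, because its second nodeSelector is a triple pattern.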
ShapeMaps are designed to express the goal or the result of validating an RDF node against a ShEx schema:
A query ShapeMap is converted to a fixed ShapeMap to be used as the input to the validation process. This process takes as input a query ShapeMap and a graph and produces a fixed ShapeMap. For a query ShapeMap Q and a graph G, for each shape association A in Q:
- If A.nodeSelector is an RDF node, A is in the fixed ShapeMap.
- If A.nodeSelector is a triple pattern, let P be a SPARQL Triple Pattern where
  - If A's subject is a focus selector or a wildcard, P's subject is a fresh variable, otherwise P's subject is A's subject.
  - P's predicate is A's predicate.
  - If A's object is a focus selector or a wildcard, P's object is a fresh variable, otherwise P's object is A's object.
  For each triple T in G which matches P, the fixed ShapeMap has a shape association F where F.shapeLabel = A.shapeLabel and
  - If A's subject is a focus selector, F.nodeSelector is T's subject.
  - If A's object is a focus selector, F.nodeSelector is T's object.
ShapeMaps can be easily transmitted and understood with a specialized syntax.
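The conversion from a query ShapeMap to a fixed ShapeMap described above can be sketched as follows. This is an illustration under assumptions rather than a reference implementation: the graph is a list of string triples, the tokens `FOCUS` and `WILDCARD` and all type and function names are hypothetical, and duplicate associations are dropped so that the result respects the rule that no two shape associations share a nodeSelector and shapeLabel.

```python
from typing import Iterable, List, NamedTuple, Tuple, Union

FOCUS = "FOCUS"    # hypothetical token standing for the focus selector
WILDCARD = "_"     # hypothetical token standing for the wildcard

Triple = Tuple[str, str, str]


class TriplePattern(NamedTuple):
    subject: str
    predicate: str
    object: str


class ShapeAssociation(NamedTuple):
    node_selector: Union[str, TriplePattern]
    shape_label: str


def to_fixed(query: Iterable[ShapeAssociation],
             graph: Iterable[Triple]) -> List[ShapeAssociation]:
    graph = list(graph)
    fixed: List[ShapeAssociation] = []

    def add(node: str, label: str) -> None:
        assoc = ShapeAssociation(node, label)
        if assoc not in fixed:          # keep (nodeSelector, shapeLabel) unique
            fixed.append(assoc)

    for a in query:
        sel = a.node_selector
        if not isinstance(sel, TriplePattern):
            add(sel, a.shape_label)     # an RDF node is carried over unchanged
            continue
        for s, p, o in graph:
            # A focus selector or wildcard acts as a fresh variable: it matches anything.
            if p != sel.predicate:
                continue
            if sel.subject not in (FOCUS, WILDCARD) and s != sel.subject:
                continue
            if sel.object not in (FOCUS, WILDCARD) and o != sel.object:
                continue
            if sel.subject == FOCUS:
                add(s, a.shape_label)
            if sel.object == FOCUS:
                add(o, a.shape_label)
    return fixed


graph = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:alice", "ex:knows", "ex:bob"),
]
query = [
    ShapeAssociation("ex:bob", "ex:S0"),
    ShapeAssociation(TriplePattern(FOCUS, "rdf:type", "ex:Person"), "ex:S1"),
    ShapeAssociation(TriplePattern("ex:alice", "ex:knows", FOCUS), "ex:S2"),
]
fixed = to_fixed(query, graph)
```

With this sample data, `fixed` associates `ex:bob` with `ex:S0`, both `ex:alice` and `ex:bob` with `ex:S1`, and `ex:bob` with `ex:S2`.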
A query ShapeMap can include shape associations with both RDF nodes and triple patterns.
rdf:type (<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>) property.
Production numbers followed by a letter correspond to productions in other grammars:
[1]    shapeMap          ::= shapeAssociation (',' shapeAssociation)*
[2]    shapeAssociation  ::= nodeSelector shapeLabel
[3]    nodeSelector      ::= objectTerm | triplePattern
[4]    subjectTerm       ::= iri | 'a'
[5]    objectTerm        ::= subjectTerm | literal
[6]    triplePattern     ::= '{' "FOCUS" iri (objectTerm | '_') '}'
[7]    shapeLabel        ::= '@' (iri | "START") | AT_START
[13t]  literal           ::= rdfLiteral | numericLiteral | booleanLiteral
[16t]  numericLiteral    ::= INTEGER | DECIMAL | DOUBLE
[65x]  rdfLiteral        ::= langString | string ("^^" iri)?
[134s] booleanLiteral    ::= "true" | "false"
[135s] string            ::= STRING_LITERAL1 | STRING_LITERAL_LONG1
[66x]  langString        ::= LANG_STRING_LITERAL1 | LANG_STRING_LITERAL_LONG1
[136s] iri               ::= IRIREF
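To make productions [1], [2], [6] and [7] concrete, here is a small recursive-descent sketch. It is deliberately partial and all function names are hypothetical: terms are limited to IRIREF (no literals and no 'a'), the IRI pattern is simplified, and a triple pattern is returned as a plain tuple.

```python
import re

# Simplified tokens: IRIREF, "@START", the keywords FOCUS/START, and punctuation.
TOKEN = re.compile(r"""
    \s*(?:
        (?P<iri><[^<>\s]*>)
      | (?P<at_start>@START)
      | (?P<keyword>FOCUS|START)
      | (?P<punct>[{}@,_])
    )""", re.VERBOSE)


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(m.group(m.lastgroup))
        pos = m.end()
    return tokens


def _expect(tokens, value):
    if not tokens or tokens[0] != value:
        raise ValueError(f"expected {value!r}")
    return tokens.pop(0)


def _iri(tokens):
    if not tokens or not tokens[0].startswith("<"):
        raise ValueError("expected an IRI")
    return tokens.pop(0)[1:-1]          # strip the angle brackets


def _node_selector(tokens):
    if tokens and tokens[0] == "{":     # [6] triplePattern
        tokens.pop(0)
        _expect(tokens, "FOCUS")
        predicate = _iri(tokens)
        obj = tokens.pop(0) if tokens and tokens[0] == "_" else _iri(tokens)
        _expect(tokens, "}")
        return ("FOCUS", predicate, obj)
    return _iri(tokens)                 # objectTerm, restricted to IRIs here


def _shape_label(tokens):               # [7] shapeLabel
    if tokens and tokens[0] == "@START":
        tokens.pop(0)
        return "START"
    _expect(tokens, "@")
    if tokens and tokens[0] == "START":
        tokens.pop(0)
        return "START"
    return _iri(tokens)


def parse_shape_map(text):              # [1] shapeMap, [2] shapeAssociation
    tokens = tokenize(text)
    associations = []
    while True:
        associations.append((_node_selector(tokens), _shape_label(tokens)))
        if not tokens:
            return associations
        _expect(tokens, ",")


parsed = parse_shape_map(
    "<http://a.example/n1>@START, "
    "{FOCUS <http://a.example/p> _}@<http://a.example/S>"
)
```

The result is a list of (nodeSelector, shapeLabel) pairs; a fuller parser would also need the literal productions and the complete terminal definitions.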
Terminals
Text is matched against the longest matching terminal. The PASSED TOKENS below may appear between any terminals or literal strings which appear in the grammar above.
[18t]  <IRIREF>                    ::= "<" ([^#0000- <>\"{}|^`\\] | UCHAR)* ">"
[142s] <BLANK_NODE_LABEL>          ::= "_:" (PN_CHARS_U | [0-9]) ((PN_CHARS | ".")* PN_CHARS)?
[17]   <AT_START>                  ::= "@START"
The <AT_START> terminal has precedence over LANGTAG.
[145s] <LANGTAG>                   ::= "@" ([a-zA-Z])+ ("-" ([a-zA-Z0-9])+)*
[19t]  <INTEGER>                   ::= [+-]? [0-9]+
[20t]  <DECIMAL>                   ::= [+-]? [0-9]* "." [0-9]+
[21t]  <DOUBLE>                    ::= [+-]? ([0-9]+ "." [0-9]* EXPONENT | "."? [0-9]+ EXPONENT)
[155s] <EXPONENT>                  ::= [eE] [+-]? [0-9]+
[156s] <STRING_LITERAL1>           ::= "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'"
[157s] <STRING_LITERAL2>           ::= '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"'
[158s] <STRING_LITERAL_LONG1>      ::= "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''"
[159s] <STRING_LITERAL_LONG2>      ::= '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""'
[73x]  <LANG_STRING_LITERAL1>      ::= "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'" LANGTAG
[74x]  <LANG_STRING_LITERAL2>      ::= '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"' LANGTAG
[75x]  <LANG_STRING_LITERAL_LONG1> ::= "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''" LANGTAG
[76x]  <LANG_STRING_LITERAL_LONG2> ::= '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""' LANGTAG
[26t]  <UCHAR>                     ::= "\\u" HEX HEX HEX HEX
[160s] <ECHAR>                     ::= "\\" [tbnrf\\\"\\']
[164s] <PN_CHARS_BASE>             ::= [A-Z] | [a-z]
[165s] <PN_CHARS_U>                ::= PN_CHARS_BASE | "_"
[167s] <PN_CHARS>                  ::= PN_CHARS_U | "-" | [0-9]
[168s] <PN_PREFIX>                 ::= PN_CHARS_BASE ( (PN_CHARS | ".")* PN_CHARS )?
[169s] <PN_LOCAL>                  ::= (PN_CHARS_U | ":" | [0-9] | PLX) ( (PN_CHARS | "." | ":" | PLX)* (PN_CHARS | ":" | PLX) )?
[170s] <PLX>                       ::= PERCENT | PN_LOCAL_ESC
[171s] <PERCENT>                   ::= "%" HEX HEX
[172s] <HEX>                       ::= [0-9] | [A-F] | [a-f]
[173s] <PN_LOCAL_ESC>              ::= "\\" ( "_" | "~" | "." | "-" | "!" | "$" | "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | "/" | "?" | "#" | "@" | "%" )
       PASSED TOKENS               ::= [ \t\r\n]+
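The longest-match rule and the precedence of AT_START over LANGTAG can be illustrated with a handful of the terminals transcribed as Python regular expressions. This is a sketch only: it covers just five terminals, the function name is hypothetical, and breaking ties by list order is one possible way to realise the stated precedence.

```python
import re

# Listed in precedence order: on a tie in match length, the earlier entry wins.
TERMINALS = [
    ("AT_START", r"@START"),
    ("LANGTAG", r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*"),
    ("INTEGER", r"[+-]?[0-9]+"),
    ("DECIMAL", r"[+-]?[0-9]*\.[0-9]+"),
    ("DOUBLE", r"[+-]?(?:[0-9]+\.[0-9]*[eE][+-]?[0-9]+|\.?[0-9]+[eE][+-]?[0-9]+)"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TERMINALS]


def longest_terminal(text, pos=0):
    """Return (terminal name, matched text) for the longest match at pos, or None."""
    best = None
    for name, regex in COMPILED:
        m = regex.match(text, pos)
        # Strict '>' keeps the earlier (higher-precedence) terminal on ties.
        if m and (best is None or m.end() > best[1]):
            best = (name, m.end())
    if best is None:
        return None
    return best[0], text[pos:best[1]]
```

For the input `@START` both AT_START and LANGTAG match six characters, so the tie is resolved in favour of AT_START, while `@en-US` can only be a LANGTAG.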