The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A |node constraint| describes an RDF node (IRI, blank node or literal) and a |shape| describes the triples involving nodes in an RDF graph. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.
This document defines the ShEx language. See the Shape Expressions Primer for a non-normative description of ShEx.
This is an editor's draft of the Shape Expressions specification. ShEx 2.x differs significantly from the W3C ShEx Submission. The July 2017 publication included a definition of validation which implied infinite recursion. This version explicitly includes recursion checks. No tests changed as a result of this and no implementations or applications are known to have been affected.
If you wish to make comments regarding this document, please raise them as GitHub issues. There are separate interfaces for specification, language and test issues. Only send comments to public-shex@w3.org (subscribe, archives) if you are unable to raise issues on GitHub. All comments are welcome.
The Shape Expressions (ShEx) language provides a structural schema for RDF data. This can be used to document APIs or datasets, aid in development of API-conformant messages, minimize defensive programming, guide user interfaces, or anything else that involves a machine-readable description of data organization and typing requirements.
ShEx describes RDF graph [[RDF11-CONCEPTS]] structures as sets of potentially connected Shapes.
These constrain the triples involving nodes in an RDF graph.
Node Constraints constrain RDF nodes by constraining their node kind (IRI, blank node or Literal), enumerating permissible values in value sets, specifying their datatype, and constraining value ranges of Literals.
Additionally, they constrain lexical forms of Literals, IRIs and labeled blank nodes.
Shape Expressions schemas share blank nodes with the constrained RDF graphs in the same way that graphs in RDF datasets [[!rdf11-concepts]] share blank nodes.
ShEx can be represented in JSON structures (ShExJ) or a compact syntax (ShExC). The compact syntax is intended for human consumption; the JSON structure for machine processing. This document defines ShEx in terms of ShExJ and includes a section on the ShEx Compact Syntax (ShExC).
The JSON [[!rfc7159]] Syntax serves as a serializable proxy for an abstract syntax.
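RDF terms are represented as JSON-LD nodes.
IRIs are represented as strings, e.g. "http://example.org/resource".
Blank nodes are represented as "_:" and a blank node identifier, e.g. "_:blank3".
Literals with the datatype http://www.w3.org/2001/XMLSchema#string are represented with the value property, e.g. { "value": "abc" }.
Language-tagged strings add the language tag, e.g. { "value": "hello world", "language": "en-US" }.
Other literals add their datatype, e.g. { "value": "123", "datatype": "http://www.w3.org/2001/XMLSchema#integer" }.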
This specification uses a JSON grammar to describe the set of JSON documents that can be interpreted as a ShEx schema.
ShEx data structures are represented as JSON objects with a member with the name "type" (i.e. an object with a type attribute):
{ "type": "typeName", member0…n }
These are expressed in JSON grammar as typeName { member* }. RFC7159 Section 2 provides syntactic constraints for JSON; the grammar constraining those to valid ShExJ constructs is composed of:
typeName
is the name of the typed data structure. Types are referenced in the definitions of object members and in the definitions of the semantics for those data structures.
member*
is a list of zero or more terminals or references to other typeExpressions.
typeExpression
is one of:
typeName: an object of the corresponding type.
array: [ typeExpression+ ], an array of one or more JSON values matching the typeExpression.
choice: typeExpression1 | typeExpression2 | …, a choice between two or more typeExpressions.
?, +, *: following the notation in the XML specification[[!XML]], or {m,} to indicate that at least m elements are required.
The following examples are excerpts from the definitions below. In the JSON notation,
signifies that a Schema has four optional components called startActs, start, imports and shapes:
signifies that a shapeExpr is one of seven object types: ShapeOr | ShapeAnd | ….
signifies that a NodeConstraint has a nodeKind of one of the four literals followed by any number of xsFacet and an xsFacet is either a stringFacet or a numericFacet.
ShExJ is a dialect of JSON-LD [[!JSON-LD]] and the member id is used as a node identifier. An object may be represented inline or referenced by its id which may be either a blank node or an IRI.
The JSON structure may include references to shape expressions and triple expressions:
An object with a circular reference must be referenced by an id. This example uses a nested shape reference on a value expression (defined below).
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#related", "valueExpr": "http://schema.example/#IssueShape", "min": 0 } } ] }
Not captured in this JSON syntax definition is the rule that every shapeExpr nested in a schema's shapes must have an id and no other shapeExpr may have an id. The JSON syntax definition simplifies this by adding id:shapeExprLabel? to every shapeExpr. This example includes a nested shape. Nested shapes are not permitted to have ids.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#submittedBy", "valueExpr": { "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#name", "valueExpr": { "type": "NodeConstraint", "nodeKind": "literal" } } } } } ] }
The shape expressions in a schema are expressed as a list of shape expressions with id attributes because JSON-LD 1.0 has no provision for defining a JSON object as representing a map of URL→object. Drafts of JSON-LD 1.1 include this expressivity and, once JSON-LD 1.1 exits Candidate Recommendation, future versions of ShExJ will likely adopt it while requiring backward compatibility with the list of shape expressions with id attributes.
JSON examples are rendered in a .json CSS style. Partial examples include ranges in a .subst CSS style to indicate text which would be substituted in a complete example. For example { "type": "ShapeAnd", "shapeExprs": [ SE1, … ] } indicates that both SE1 and … would be substituted in a complete example.
In javascript-enabled browsers, schemas with a button can be converted between the JSON representation and the compact syntax by clicking the button. The button text indicates the currently shown representation. Selecting the example and pressing "j" or "c" converts the example to the JSON (ShExJ) or compact form (ShExC). Pressing "shift J" or "shift C" converts all such examples to ShExJ or ShExC.
The validation process defined in this document relies on matching triple patterns in the form (subject, predicate, object) where each position may be supplied by a constant, a previously defined term, or the underscore "_", which represents a previously undefined element or wildcard.
This corresponds to a SPARQL Triple Pattern where each "_" is replaced by a unique blank node.
Matching such a triple pattern against a graph is defined by SPARQL Basic Graph Pattern Matching (BGP) with a BGP containing only that triple pattern.
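As an informal illustration of the triple patterns described above, the following Python sketch matches a single pattern against a graph held as a list of tuples. The function name `match_pattern`, the use of plain strings for RDF terms, and the sample triples are assumptions made for this example; a real implementation would use a proper RDF library and SPARQL BGP matching.

```python
WILDCARD = "_"  # stands for a previously undefined element, as in the text above

def match_pattern(graph, pattern):
    """Return the triples of graph that match a (subject, predicate, object) pattern.

    Each "_" position matches any term, independently of the other positions,
    which mirrors replacing every "_" with a distinct blank node.
    """
    return [
        triple for triple in graph
        if all(want == WILDCARD or want == have
               for want, have in zip(pattern, triple))
    ]

# Hypothetical sample data; terms are plain strings for brevity.
graph = [
    ("inst:Issue1", "ex:state", "ex:unassigned"),
    ("inst:Issue1", "ex:reportedBy", "_:User2"),
    ("_:User2", "foaf:name", "Bob Smith"),
]

issue_arcs = match_pattern(graph, ("inst:Issue1", "_", "_"))
```

Here `issue_arcs` holds the two triples whose subject is `inst:Issue1`.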
ShEx validation is defined in this document by the isValid function. This process takes as input a shapes schema, an RDF graph, and a fixed ShapeMap (abbreviated as "ShapeMap" in this document). ShEx validation results may be reported as a result ShapeMap [[!shape-map]]. For illustration purposes in this specification, both the fixed ShapeMap input and the result ShapeMap output are represented in a table with four columns: node: a ShapeMap nodeSelector, shape: a ShapeMap shapeLabel, result: "pass" (for "conformant" status) or "fail" ("nonconformant" status), and an optional reason: an informal, human-readable explanation.
node | shape | result | reason |
---|---|---|---|
<node1> | <Shape1> | pass | |
<node2> | <Shape1> | fail | no ex:state supplied. |
Shape expressions are defined using terms from RDF semantics [[!rdf11-mt]]:
This specification makes use of the following namespaces:
foaf: http://xmlns.com/foaf/0.1/
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
shex: http://www.w3.org/ns/shex#
xsd: http://www.w3.org/2001/XMLSchema#
The following functions access the elements of an RDF graph |G| containing a node |n|:
Consider the RDF graph |G| represented in Turtle:
PREFIX ex: <http://schema.example/#>
PREFIX inst: <http://inst.example/#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

inst:Issue1 ex:state ex:unassigned ;
  ex:reportedBy _:User2 .
_:User2 foaf:name "Bob Smith" ;
  foaf:mbox <mailto:bob@example.org> .
There are two arcs out of _:User2; arcsOut(|G|, _:User2):
_:User2 foaf:name "Bob Smith" .
_:User2 foaf:mbox <mailto:bob@example.org> .
There is one arc into _:User2; arcsIn(|G|, _:User2):
inst:Issue1 ex:reportedBy _:User2 .
There are three arcs in the neighbourhood of _:User2; neigh(|G|, _:User2):
_:User2 foaf:name "Bob Smith" .
_:User2 foaf:mbox <mailto:bob@example.org> .
inst:Issue1 ex:reportedBy _:User2 .
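The three access functions illustrated above can be sketched in Python as follows. This is only an illustration: the graph is assumed to be a set of (subject, predicate, object) tuples of strings, and the snake_case function names are chosen for this example.

```python
def arcs_out(graph, n):
    """Triples in graph whose subject is n."""
    return {t for t in graph if t[0] == n}

def arcs_in(graph, n):
    """Triples in graph whose object is n."""
    return {t for t in graph if t[2] == n}

def neigh(graph, n):
    """Neighbourhood of n: the union of its outgoing and incoming arcs."""
    return arcs_out(graph, n) | arcs_in(graph, n)

# The example graph from the text, with terms written as plain strings.
G = {
    ("inst:Issue1", "ex:state", "ex:unassigned"),
    ("inst:Issue1", "ex:reportedBy", "_:User2"),
    ("_:User2", "foaf:name", "Bob Smith"),
    ("_:User2", "foaf:mbox", "mailto:bob@example.org"),
}

counts = (len(arcs_out(G, "_:User2")), len(arcs_in(G, "_:User2")), len(neigh(G, "_:User2")))
```

For the example graph, `counts` is `(2, 1, 3)`, matching the arcs listed above.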
Conformance criteria are relevant to authors and authoring tool implementers. As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
All ShEx documents MUST conform to the Schema Requirements. Additional constraints for the specific types of ShEx documents (ShExC, ShExJ, and ShExR) follow:
A Shape Expressions (ShEx) schema is a collection of labeled Shapes and Node Constraints. These can be used to describe or test nodes in RDF graphs. ShEx does not prescribe a language for associating nodes with shapes but several approaches are described in the ShEx Primer.
A shapes schema is captured in a Schema object:
where shapes is a mapping from shape label to shape expression.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", … }, { "id": "_:UserShape", … }, { "id": "http://schema.example/#EmployeeShape", … } ] }
ex:IssueShape { … } _:UserShape { … } ex:EmployeeShape { … }
isValid: For a graph |G|, a schema |Sch| and a fixed ShapeMap |ism|, isValid(|G|, |Sch|, |ism|) indicates that for every (RDFnode, shapeExprLabel) pair (|n|, |s|) in |ism|, the node |n| satisfies |s|.
The latter is captured by the expression satisfies(|n|, |s|, |G|, |Sch|, completeTyping(|G|, |Sch|), |ext|).
(|ext| is an optional list of triples (or "neighborhood") used for recursive validation of extended Shapes.)
The function satisfies is defined for every kind of shape expression, including |s| as a shapeExprLabel.
The validation of an RDF graph |G| against a ShEx schema |Sch| is based on the existence of completeTyping(|G|, |Sch|).
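The relationship between isValid and a result ShapeMap can be sketched as below. This is a hedged illustration only: `validate` is a hypothetical name, and the `satisfies` argument is assumed to be a callable that already has the graph, schema and typing bound, so that it takes just a node and a shape label.

```python
def validate(shape_map, satisfies):
    """Check every (node, shape label) pair of a fixed ShapeMap.

    Returns (is_valid, results) where results lists (node, shape, status)
    rows with status "pass" or "fail", as in the tables of this document.
    """
    results = [
        (node, shape, "pass" if satisfies(node, shape) else "fail")
        for node, shape in shape_map
    ]
    is_valid = all(status == "pass" for _, _, status in results)
    return is_valid, results

# Stand-in for the real satisfies function: only <node1> conforms.
def stub_satisfies(node, shape):
    return node == "<node1>"

ok, rows = validate([("<node1>", "<Shape1>"), ("<node2>", "<Shape1>")], stub_satisfies)
```

In this sketch isValid is simply the conjunction of the per-pair results, so a single failing pair makes `ok` false while `rows` still reports every pair.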
For an RDF graph |G| and a shapes schema |Sch|, a typing is a set of pairs of the form (|n|, |s|) where |n| is a node in |G| and |s| is a Shape that appears in some shape expression in the shapes mapping of |Sch|.
A correct typing is a |typing| such that for every RDFnode/shape pair (|n|, |s|) in |typing|, matchesShape(|n|, |s|, |G|, |Sch|, |typing|, |ext|) holds.
completeTyping(|G|, |Sch|) is a unique correct typing that exists for every graph and every ShEx schema that satisfies the schema requirements.
completeTyping: the definition of completeTyping(|G|, |Sch|) is based on a stratification of |Sch|.
The number of strata of |Sch| is the number of maximal strongly connected components of the dependency graph of |Sch|.
A stratification of a schema |Sch| with |k| strata is a function stratum that associates with every Shape in |Sch| a natural number between |1| and |k| such that:
stratum(|s1|) = stratum(|s2|).
stratum(|s2|) < stratum(|s1|).
The existence of a stratification for every schema is guaranteed by the negation requirement.
Given a stratification stratum of |Sch| with |k| strata, define inductively the series of |k| typings completeTypingOn(|1|, |G|, |Sch|) … completeTypingOn(|k|, |G|, |Sch|).
completeTypingOn(|1|, |G|, |Sch|) is the union of all correct typings that contain only RDFnode/shape pairs (|n|, |s|) with stratum(|s|) = |1|;
completeTypingOn(|i|, |G|, |Sch|) is the union of all correct typings that:
stratum(|s|) ≤ |i|
completeTypingOn(|i|-1, |G|, |Sch|) when restricted to their RDFnode/shape pairs (|n1|, |s1|) for which stratum(|s1|) < |i|.
Then completeTyping(|G|, |Sch|) = completeTypingOn(|k|, |G|, |Sch|).
The schema |Sch| might have several different stratifications but completeTyping(|G|, |Sch|) is the same for all these stratifications.
This property is reminiscent of the use of stratified negation in Datalog.
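One way to obtain a stratification is to number the strongly connected components of the dependency graph. The sketch below uses Tarjan's algorithm for this. It rests on stated assumptions: the dependency graph is given as a dict from shape label to the labels it depends on, shapes in the same component share a stratum, and a shape's stratum is higher than the strata of shapes it depends on in other components. The name `stratify` is hypothetical.

```python
def stratify(deps):
    """Map each shape label to a stratum number in 1..k.

    deps: dict mapping a shape label to an iterable of labels it depends on
    (an assumed encoding of the dependency graph). Components are numbered
    in the order Tarjan's algorithm completes them, so dependencies in other
    components always receive a lower number.
    """
    index, low, stratum = {}, {}, {}
    stack, on_stack = [], set()
    next_index = 0
    k = 0

    def visit(v):
        nonlocal next_index, k
        index[v] = low[v] = next_index
        next_index += 1
        stack.append(v)
        on_stack.add(v)
        for w in deps.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            k += 1
            while True:
                w = stack.pop()
                on_stack.discard(w)
                stratum[w] = k
                if w == v:
                    break

    for v in deps:
        if v not in index:
            visit(v)
    return stratum

# A and B depend on each other; B also depends on C.
strata = stratify({"A": ["B"], "B": ["A", "C"], "C": []})
```

The number of strata is `max(strata.values())`. The recursive form is kept for readability; very deep schemas would call for an iterative variant.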
In order to decide isValid(|G|, |Sch|, |ism|), it is sufficient to compute only a portion of completeTyping using an appropriate algorithm.
Popular methods for constructing the input fixed ShapeMaps can be found on https://www.w3.org/2001/sw/wiki/ShEx/ShapeMap.
A shape expression is composed of four kinds of objects combined with the algebraic operators And, Or and Not:
shapeExpr | = | ShapeOr | ShapeAnd | ShapeNot | NodeConstraint | Shape | ShapeExternal | shapeExprRef ; |
---|---|---|
ShapeOr | { | id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] } |
ShapeAnd | { | id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] } |
ShapeNot | { | id:shapeExprLabel? shapeExpr:shapeExpr } |
ShapeExternal | { | id:shapeExprLabel? } |
shapeExprRef | = | shapeExprLabel ; |
shapeExprLabel | = | IRIREF | BNODE ; |
Examples of shape expressions:
{ "type": "Shape", … }
{ … }
{ "type": "ShapeAnd", "shapeExprs": [
{ "type": "NodeConstraint", "nodeKind": "iri" },
{ "type": "ShapeOr", "shapeExprs": [
"http://schema.example/#IssueShape",
{ "type": "ShapeNot", "shapeExpr": { "type": "Shape", … } }
] } ] }
IRI AND ( @<http://schema.example/#IssueShape> OR NOT { … } )
In this ShapeOr's shapeExprs, "http://schema.example/#IssueShape" is a reference to the shape expression with the id "http://schema.example/#IssueShape".
satisfies: The expression satisfies(|n|, |se|, |G|, |Sch|, |t|, |ext|) indicates that a node |n| in |G| satisfies a shape expression |se| in |Sch| with typing |t|.
notSatisfies: Conversely, notSatisfies(|n|, |se|, |G|, |Sch|, |t|, |ext|) indicates that |n| and |G| do not satisfy |se| with the given typing |t|.
|ext| is used when evaluating extended shapes; otherwise it is empty.
satisfies(|n|, |se|, |G|, |Sch|, |t|, |ext|) is true if and only if:
satisfies2(|n|, |se|) as described below in Node Constraints. Note that testing if a node satisfies a node constraint does not require a graph or typing.
satisfies(|n|, |se2|, |G|, |Sch|, |t|, |ext|).
satisfies(|n|, |se2|, |G|, |Sch|, |t|, |ext|).
notSatisfies(|n|, |se2|, |G|, |Sch|, |t|, |ext|).
satisfies(|n|, |se2|, |G|, |Sch|, |t|, |ext|) where |se2| is the shape expression with id = |se| or any non-abstract shape |se3| which directly or transitively extends |se|.
Given the three shape expressions SE1, SE2, SE3 in a Schema |Sch|, such that:
satisfies(|n|, SE1, |G|, |Sch|, |m|, |ext|)
satisfies(|n|, SE2, |G|, |Sch|, |m|, |ext|)
notSatisfies(|n|, SE3, |G|, |Sch|, |m|, |ext|)
the following hold:
{ "type": "ShapeAnd", "shapeExprs": [ SE1, SE2 ] }
{ "type": "ShapeOr", "shapeExprs": [ SE1, SE2, SE3 ] }
{ "type": "ShapeNot", "shapeExpr": { "type": "ShapeOr", "shapeExprs": [ SE1, { "type": "ShapeAnd", "shapeExprs": [ SE2, SE3 ] } ] } }
If |Sch|'s shapes map "http://schema.example/#shape1" to SE1 then the following holds:
"http://schema.example/#shape1"
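The way ShapeAnd, ShapeOr and ShapeNot combine the results for SE1, SE2 and SE3 can be sketched with a small evaluator. This is a simplification under stated assumptions: leaf expressions are replaced by placeholder names whose results are supplied up front in a dict, and notSatisfies is treated as plain boolean negation. The name `evaluate` is hypothetical.

```python
def evaluate(se, leaf_results):
    """Evaluate a ShExJ-style boolean combination of shape expressions.

    se: either a placeholder name (str) or a dict with "type" set to
        "ShapeAnd", "ShapeOr" or "ShapeNot".
    leaf_results: dict from placeholder name to an assumed, precomputed result.
    """
    if isinstance(se, str):
        return leaf_results[se]
    kind = se["type"]
    if kind == "ShapeAnd":
        return all(evaluate(sub, leaf_results) for sub in se["shapeExprs"])
    if kind == "ShapeOr":
        return any(evaluate(sub, leaf_results) for sub in se["shapeExprs"])
    if kind == "ShapeNot":
        return not evaluate(se["shapeExpr"], leaf_results)
    raise ValueError("unsupported shape expression type: " + kind)

# SE1 and SE2 are satisfied, SE3 is not, as in the example above.
leaves = {"SE1": True, "SE2": True, "SE3": False}

and_expr = {"type": "ShapeAnd", "shapeExprs": ["SE1", "SE2"]}
or_expr = {"type": "ShapeOr", "shapeExprs": ["SE1", "SE2", "SE3"]}
not_expr = {"type": "ShapeNot", "shapeExpr": {
    "type": "ShapeOr", "shapeExprs": [
        "SE1", {"type": "ShapeAnd", "shapeExprs": ["SE2", "SE3"]}]}}

outcomes = [evaluate(e, leaves) for e in (and_expr, or_expr, not_expr)]
```

With these leaf results the ShapeAnd and ShapeOr are satisfied, while the ShapeNot is not satisfied because the ShapeOr it negates is satisfied through SE1.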
In this example, EmployeeShape directly extends PersonShape and transitively extends EntityShape:
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#EntityShape", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#entityId" } }, { "id" : "http://schema.example/#PersonShape", "type" : "Shape", "extends" : [ "http://schema.example/#EntityShape" ], "expression" : { "type" : "TripleConstraint", "predicate" : "http://xmlns.com/foaf/0.1/name" } }, { "id" : "http://schema.example/#EmployeeShape", "type" : "Shape", "extends" : [ "http://schema.example/#PersonShape" ], "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#employeeNumber" } } ] }
ex:EntityShape { ex:entityId . } ex:PersonShape EXTENDS @ex:EntityShape { foaf:name . } ex:EmployeeShape EXTENDS @ex:PersonShape { ex:employeeNumber . }
In this example, IssueShape transitively references EngineerShape and ManagerShape (i.e. someone who is both an engineer and a manager):
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#approvedBy", "valueExpr": { "type": "ShapeAnd", "shapeExprs": [ "http://schema.example/#EngineerShape", "http://schema.example/#ManagerShape" ] } } }, { "id": "http://schema.example/#EngineerShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#specialty" } }, { "id": "http://schema.example/#ManagerShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#managagesDepartment" } } ] }
ex:IssueShape { ex:approvedBy @ex:EngineerShape AND @ex:ManagerShape } ex:EngineerShape { ex:specialty . } ex:ManagerShape { ex:managagesDepartment . }
NodeConstraint | { | id:shapeExprLabel? nodeKind:("iri" | "bnode" | "nonliteral" | "literal")? datatype:IRIREF? xsFacet* values:[valueSetValue+]? } |
---|---|---|
xsFacet | = | stringFacet | numericFacet ; |
stringFacet | = | (length|minlength|maxlength):INTEGER | pattern:STRING flags:STRING? ; |
numericFacet | = | (mininclusive|minexclusive|maxinclusive|maxexclusive):numericLiteral |
| | (totaldigits|fractiondigits):INTEGER ; | |
numericLiteral | = | INTEGER | DECIMAL | DOUBLE ; |
valueSetValue | = | objectValue | IriStem | IriStemRange | LiteralStem | LiteralStemRange | Language | LanguageStem | LanguageStemRange ; |
objectValue | = | IRIREF | ObjectLiteral ; |
ObjectLiteral | { | value:STRING language:STRING? type:STRING? } |
IriStem | { | stem:IRIREF } |
IriStemRange | { | stem:(IRIREF | Wildcard) exclusions:[IRIREF|IriStem+]? } |
LiteralStem | { | stem:STRING } |
LiteralStemRange | { | stem:(STRING | Wildcard) exclusions:[STRING|LiteralStem+]? } |
Language | { | languageTag:LANGTAG } |
LanguageStem | { | stem:LANGTAG } |
LanguageStemRange | { | stem:(LANGTAG | Wildcard) exclusions:[LANGTAG|LanguageStem+]? } |
Wildcard | { | /* empty */ } |
For a node |n| and constraint |nc|, satisfies2(|n|, |nc|) if and only if, for every nodeKind, datatype, xsFacet and values constraint value |v| present in |nc|, nodeSatisfies(|n|, |v|).
The following sections define nodeSatisfies for each of these types of constraints:
For a node |n| and constraint value |v|, nodeSatisfies(|n|, |v|) if:
The following examples use a TripleConstraint object described later in the document. The
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#state", "valueExpr": { "type": "NodeConstraint", "nodeKind": "iri" } } } ] }
ex:IssueShape { ex:state IRI }
<issue1> ex:state ex:HunkyDory . <issue2> ex:taste ex:GoodEnough . <issue3> ex:state "just fine" .
node | shape | result | reason |
---|---|---|---|
<issue1> | <IssueShape> | pass | |
<issue2> | <IssueShape> | fail | expected 1 ex:state property. |
<issue3> | <IssueShape> | fail | ex:state expected to be an IRI, literal found. |
Note that <issue2> fails not because of a nodeKind violation but instead because of a Cardinality violation described below.
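A node kind test can be sketched as a lookup from the four nodeKind values to the kinds of RDF node they admit. The sketch assumes that the kind of the node ("iri", "bnode" or "literal") has already been determined by the RDF library in use; `node_kind_satisfies` is a hypothetical name.

```python
# Which concrete node kinds each nodeKind value admits.
ALLOWED_KINDS = {
    "iri": {"iri"},
    "bnode": {"bnode"},
    "literal": {"literal"},
    "nonliteral": {"iri", "bnode"},
}

def node_kind_satisfies(kind_of_n, v):
    """True if a node of kind kind_of_n satisfies the nodeKind constraint v."""
    return kind_of_n in ALLOWED_KINDS[v]

# <issue1>'s ex:state is an IRI; <issue3>'s ex:state is a literal.
issue1_ok = node_kind_satisfies("iri", "iri")
issue3_ok = node_kind_satisfies("literal", "iri")
```

This covers only the node kind test itself; the failure of `<issue2>` in the table above comes from cardinality and is outside this sketch.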
For a node |n| and constraint value |v|, nodeSatisfies(|n|, |v|)
if |n| is a Literal with the datatype |v| and, if |v| is in the set of SPARQL operand data types[[!sparql11-query]], an XML schema string with a value of the lexical form of |n| can be cast to the target type |v| per XPath Functions 3.1 section 19 Casting[[!xpath-functions]].
The lexical form and numeric value (where applicable) of all datatypes required by SPARQL XPath Constructor Functions MUST be tested for conformance with the corresponding XML Schema form.
ShEx extensions MAY add support for other datatypes.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#submittedOn", "valueExpr": { "type": "NodeConstraint", "datatype": "http://www.w3.org/2001/XMLSchema#date" } } } ] }
ex:IssueShape { ex:submittedOn xsd:date }
<issue1> ex:submittedOn "2016-07-08"^^xsd:date . <issue2> ex:submittedOn "2016-07-08T01:23:45Z"^^xsd:dateTime . <issue3> ex:submittedOn "2016-07"^^xsd:date .
node | shape | result | reason |
---|---|---|---|
<issue1> | <IssueShape> | pass | |
<issue2> | <IssueShape> | fail | ex:submittedOn expected to be an xsd:date , xsd:dateTime found. |
<issue3> | <IssueShape> | fail | 2016-07 is not a valid xsd:date . |
In RDF 1.1, language-tagged strings[[!rdf11-concepts]] have the datatype http://www.w3.org/1999/02/22-rdf-syntax-ns#langString.
RDF 1.0 included RDF literals with no datatype or language tag.
These are called "simple literals" in SPARQL11[[!sparql11-query]].
In RDF 1.1, these literals have the datatype http://www.w3.org/2001/XMLSchema#string.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://www.w3.org/2000/01/rdf-schema#label", "valueExpr": { "type": "NodeConstraint", "datatype": "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString" } } } ] }
ex:IssueShape { rdfs:label rdf:langString }
<issue3> rdfs:label "emits dense black smoke"@en .
<issue4> rdfs:label "unexpected odor" .
node | shape | result | reason |
---|---|---|---|
<issue3> | <IssueShape> | pass | |
<issue4> | <IssueShape> | fail | rdfs:label expected to be an rdf:langString , xsd:string found. |
String facet constraints apply to the lexical form of the RDF Literals and IRIs and blank node identifiers (see note below regarding access to blank node identifiers).
Let |lex| = the lexical form of |n| if |n| is a Literal, the IRI string if |n| is an IRI, or the blank node identifier if |n| is a blank node.
Let |len| = the number of unicode codepoints in |lex|
For a node |n| and constraint value |v|, nodeSatisfies(|n|, |v|):
for "length" constraints, v = len,
for "minlength" constraints, v <= len,
for "maxlength" constraints, v >= len,
for "pattern" constraints, |v| is unescaped into a valid XPath 3.1 regular expression[[!xpath-functions-31]] |re| and invoking fn:matches(|lex|, |re|) returns fn:true. If the flags parameter is present, it is passed as a third argument to fn:matches.
The pattern may have XPath 3.1 regular expression escape sequences per the modified production [10] in section 5.6.1.1 as well as numeric escape sequences of the form 'u' HEX HEX HEX HEX or 'U' HEX HEX HEX HEX HEX HEX HEX HEX.
Unescaping replaces numeric escape sequences with the corresponding unicode codepoint.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#submittedBy", "valueExpr": { "type": "NodeConstraint", "minlength": 10 } } } ] }
ex:IssueShape { ex:submittedBy MINLENGTH 10 }
<issue1> ex:submittedBy <http://a.example/bob> . # 20 characters
<issue2> ex:submittedBy "Bob" . # 3 characters
node | shape | result | reason |
---|---|---|---|
<issue1> | <IssueShape> | pass | |
<issue2> | <IssueShape> | fail | ex:submittedBy expected to be >= 10 characters, 3 characters found. |
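The length facets can be sketched in Python, where `len` on a string counts Unicode code points. The sketch follows the MINLENGTH example above, in which a 20-character IRI passes and a 3-character literal fails; the function name `length_facet_satisfies` is an assumption for illustration.

```python
def length_facet_satisfies(lex, facet, v):
    """Check a length, minlength or maxlength facet against a lexical form."""
    n = len(lex)  # Python 3 strings are sequences of Unicode code points
    if facet == "length":
        return n == v
    if facet == "minlength":
        return n >= v
    if facet == "maxlength":
        return n <= v
    raise ValueError("not a length facet: " + facet)

iri_ok = length_facet_satisfies("http://a.example/bob", "minlength", 10)
bob_ok = length_facet_satisfies("Bob", "minlength", 10)
```

Characters outside the Basic Multilingual Plane count once, so `"\U0001D4B8"` has length 1 here, whereas a language that measures UTF-16 code units would report 2.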
Access to blank node identifiers may be impossible or unadvisable for many use cases. For instance, the SPARQL Query and SPARQL Update languages treat blank nodes in the query, labeled or otherwise, as variables. Lexical constraints on blank node identifiers can only be implemented in systems which preserve such labels on data import.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#submittedBy", "valueExpr": { "type": "NodeConstraint", "pattern": "genuser[0-9]+", "flags": "i" } } } ] }
ex:IssueShape { ex:submittedBy /genuser[0-9]+/i }
<issue6> ex:submittedBy _:genUser218 .
<issue7> ex:submittedBy _:genContact817 .
node | shape | result | reason |
---|---|---|---|
<issue6> | <IssueShape> | pass | |
<issue7> | <IssueShape> | fail | _:genContact817 expected to match genuser[0-9]+ . |
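The pattern test above can be approximated with Python's `re` module. This is only an approximation: Python regular expressions are not XPath 3.1 regular expressions, and the flag mapping below is an assumption that covers the "i", "s", "m" and "x" flags but not "q". Like fn:matches, `re.search` succeeds if the pattern matches anywhere in the string.

```python
import re

# Assumed mapping from XPath flag letters to Python re flags.
FLAG_MAP = {
    "i": re.IGNORECASE,
    "s": re.DOTALL,
    "m": re.MULTILINE,
    "x": re.VERBOSE,
}

def pattern_satisfies(lex, pattern, flags=""):
    """Approximate fn:matches(lex, pattern, flags) using Python's re module."""
    combined = 0
    for letter in flags:
        combined |= FLAG_MAP[letter]
    return re.search(pattern, lex, combined) is not None

# Lexical forms are the blank node identifiers from the example above.
user_ok = pattern_satisfies("genUser218", "genuser[0-9]+", "i")
contact_ok = pattern_satisfies("genContact817", "genuser[0-9]+", "i")
```

Without the "i" flag the first identifier would not match either, because the pattern is written in lower case.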
When expressed as JSON strings, regular expressions are subject to the JSON string escaping rules.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#ProductShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#trademark", "valueExpr": { "type": "NodeConstraint", "pattern": "^/\\t\\\\\uD835\uDCB8\\?$" } } } ] }
ex:ProductShape { ex:trademark /^\/\t\\\U0001D4B8\?$/ }
<product6> ex:trademark " \\𝒸?" .
<product7> ex:trademark "\t\\\U0001D4B8?" .
# Turtle literals have escape characters [tbnrf"'\].
<product8> ex:trademark "\t\\\\U0001D4B8?" .
node | shape | result | reason |
---|---|---|---|
<product6> | <ProductShape> | pass | |
<product7> | <ProductShape> | pass | |
<product8> | <ProductShape> | fail | found "\U0001D4B8 " instead of "𝒸 " (codepoint U+1D4B8). |
Numeric facet constraints apply to the numeric value of RDF Literals with datatypes listed in SPARQL 1.1 Operand Data Types[[!sparql11-query]].
Numeric constraints on non-numeric values fail.
totaldigits and fractiondigits constraints on values not derived from xsd:decimal fail.
Let |num| be the numeric value of |n|.
For a node |n| and constraint value |v|, nodeSatisfies(|n|, |v|):
for "mininclusive" constraints, v <= num,
for "minexclusive" constraints, v < num,
for "maxinclusive" constraints, v >= num,
for "maxexclusive" constraints, v > num,
for "totaldigits" constraints, the number of digits in the XML Schema canonical form[[!xmlschema-2]] of the value of |n| is less than or equal to |v|,
for "fractiondigits" constraints, the number of digits to the right of the decimal place in the XML Schema canonical form[[!xmlschema-2]] of the value of |n|, ignoring trailing zeros, is less than or equal to |v|.
The operators <=, <, >= and > are evaluated after performing numeric type promotion[[!xpath20]].
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#confirmations", "valueExpr": { "type": "NodeConstraint", "mininclusive": 1 } } } ] }
ex:IssueShape { ex:confirmations MININCLUSIVE 1 }
<issue1> ex:confirmations 1 .
<issue2> ex:confirmations "2"^^xsd:byte .
<issue3> ex:confirmations 0 .
<issue4> ex:confirmations "ii"^^ex:romanNumeral .
node | shape | result | reason |
---|---|---|---|
<issue1> | <IssueShape> | pass | |
<issue2> | <IssueShape> | pass | |
<issue3> | <IssueShape> | fail | 0 is less than 1 . |
<issue4> | <IssueShape> | fail | ex:romanNumeral is not a numeric datatype. |
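The four range facets can be sketched with Python's `decimal` module. The sketch simplifies in two labeled ways: it decides whether a value is numeric by trying to parse its lexical form rather than by inspecting its datatype, and it leaves out totaldigits, fractiondigits and the special values of xsd:double. The name `range_facet_satisfies` is hypothetical.

```python
from decimal import Decimal, InvalidOperation

# Each comparator receives the node's numeric value and the facet value.
COMPARATORS = {
    "mininclusive": lambda num, v: num >= v,
    "minexclusive": lambda num, v: num > v,
    "maxinclusive": lambda num, v: num <= v,
    "maxexclusive": lambda num, v: num < v,
}

def range_facet_satisfies(lexical, facet, v):
    """Check a range facet against the lexical form of a literal."""
    try:
        num = Decimal(lexical)
    except InvalidOperation:
        return False  # numeric constraints on non-numeric values fail
    return COMPARATORS[facet](num, Decimal(str(v)))

# The ex:confirmations values from the example above, with MININCLUSIVE 1.
checks = [range_facet_satisfies(lex, "mininclusive", 1) for lex in ("1", "2", "0", "ii")]
```

The resulting `checks` list mirrors the pass, pass, fail, fail column of the table above.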
The nodeSatisfies semantics for NodeConstraint values depend on a nodeIn function defined below.
For a node |n| and constraint value |v|, nodeSatisfies(|n|, |v|) if |n| matches some valueSetValue |vsv| in |v|.
A term matches a valueSetValue if:
nodeIn: asserts that an RDF node |n| is equal to an RDF term |s| or is in a set defined by an IriStem, LiteralStem or LanguageStem.
The expression nodeIn(|n|, |s|) is satisfied if:
NoActionIssueShape requires a state of Resolved or Rejected:
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#NoActionIssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#state", "valueExpr": { "type": "NodeConstraint", "values": [ "http://schema.example/#Resolved", "http://schema.example/#Rejected" ] } } } ] }
ex:NoActionIssueShape { ex:state [ ex:Resolved ex:Rejected ] }
<issue1> ex:state ex:Resolved .
<issue2> ex:state ex:Unresolved .
node | shape | result | reason |
---|---|---|---|
<issue1> | <NoActionIssueShape> | pass | |
<issue2> | <NoActionIssueShape> | fail | ex:state expected to be ex:Resolved or ex:Rejected , ex:Unresolved found. |
An employee must have an email address that is the string "N/A" or starts with "engineering-" or "sales-" but not "sales-contacts" or "sales-interns":
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#EmployeeShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/mbox", "valueExpr": { "type": "NodeConstraint", "values": [ {"value": "N/A"}, { "type": "IriStem", "stem": "mailto:engineering-" }, { "type": "IriStemRange", "stem": "mailto:sales-", "exclusions": [ { "type": "IriStem", "stem": "mailto:sales-contacts" }, { "type": "IriStem", "stem": "mailto:sales-interns" } ] } ] } } } ] }
ex:EmployeeShape { foaf:mbox [ "N/A" <mailto:engineering->~ <mailto:sales->~ - <mailto:sales-contacts>~ - <mailto:sales-interns>~ ] }
<issue3> foaf:mbox "N/A" . <issue4> foaf:mbox <mailto:engineering-2112@a.example> . <issue5> foaf:mbox <mailto:sales-835@a.example> . <issue6> foaf:mbox "missing" . <issue7> foaf:mbox <mailto:sales-contacts-999@a.example> .
node | shape | result | reason |
---|---|---|---|
<issue3> | <EmployeeShape> | pass | |
<issue4> | <EmployeeShape> | pass | |
<issue5> | <EmployeeShape> | pass | |
<issue6> | <EmployeeShape> | fail | "missing" is not in value set. |
<issue7> | <EmployeeShape> | fail | <mailto:sales-contacts-999@a.example> is excluded. |
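The value set in the example above can be checked with a short sketch. It makes several simplifying assumptions: IRIs are Python strings, literals are dicts with a "value" key and are compared on that value only, and only the IRIREF, ObjectLiteral, IriStem and IriStemRange forms with a non-wildcard stem are handled. The function names are hypothetical.

```python
def matches_value(n, vsv):
    """True if node n matches one valueSetValue (subset of forms only)."""
    if isinstance(vsv, str):                      # IRIREF: exact IRI
        return n == vsv
    if "value" in vsv:                            # ObjectLiteral
        return isinstance(n, dict) and n.get("value") == vsv["value"]
    if vsv["type"] == "IriStem":                  # IRI starting with the stem
        return isinstance(n, str) and n.startswith(vsv["stem"])
    if vsv["type"] == "IriStemRange":             # stem minus exclusions
        if not (isinstance(n, str) and n.startswith(vsv["stem"])):
            return False
        return not any(matches_value(n, ex) for ex in vsv.get("exclusions", []))
    raise ValueError("unsupported valueSetValue")

def in_value_set(n, values):
    """True if n matches some valueSetValue in values."""
    return any(matches_value(n, vsv) for vsv in values)

# The values list from the EmployeeShape example above.
VALUES = [
    {"value": "N/A"},
    {"type": "IriStem", "stem": "mailto:engineering-"},
    {"type": "IriStemRange", "stem": "mailto:sales-", "exclusions": [
        {"type": "IriStem", "stem": "mailto:sales-contacts"},
        {"type": "IriStem", "stem": "mailto:sales-interns"}]},
]

mboxes = [
    {"value": "N/A"},
    "mailto:engineering-2112@a.example",
    "mailto:sales-835@a.example",
    {"value": "missing"},
    "mailto:sales-contacts-999@a.example",
]
value_results = [in_value_set(m, VALUES) for m in mboxes]
```

The results line up with the table above: the first three mailboxes pass and the last two fail. Wildcard stems, literal stems and language stems would need additional branches.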
An employee must not have an email address that starts with "engineering-" or "sales-":
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#EmployeeShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/mbox", "valueExpr": { "type": "NodeConstraint", "values": [ { "type": "IriStemRange", "stem": {"type": "Wildcard"}, "exclusions": [ { "type": "IriStem", "stem": "mailto:engineering-" }, { "type": "IriStem", "stem": "mailto:sales-" } ] } ] } } } ] }
ex:EmployeeShape { foaf:mbox [ . - <mailto:engineering->~ - <mailto:sales->~ ] }
<issue8> foaf:mbox 123 .
<issue9> foaf:mbox <mailto:core-engineering-2112@a.example> .
<issue10> foaf:mbox <mailto:engineering-2112@a.example> .
node | shape | result | reason |
---|---|---|---|
<issue8> | <EmployeeShape> | pass | |
<issue9> | <EmployeeShape> | pass | |
<issue10> | <EmployeeShape> | fail | <mailto:engineering-2112@a.example> is excluded. |
A value set can have a single value in it. This is used to indicate that a specific value is required, e.g. that an ex:state must be equal to <http://schema.example/#Resolved> or the rdf:type of some node must be foaf:Person.
Triple expressions are used for defining patterns composed of triple constraints. Shapes associate triple expressions with flags indicating whether triples match if they do not correspond to triple constraints in the triple expression. A triple expression is composed of TripleConstraint and tripleExprRef objects composed with grouping and choice operators.
Shape | { | id:shapeExprLabel? closed:BOOL? extra:[IRIREF+]? extends:[shapeExpr+]? expression:tripleExpr? semActs:[SemAct+]? annotations:[Annotation+]? } |
---|---|---|
tripleExpr | = | EachOf | OneOf | TripleConstraint | tripleExprRef ; |
EachOf | { | id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } |
OneOf | { | id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } |
TripleConstraint | { | id:tripleExprLabel? inverse:BOOL? predicate:IRIREF valueExpr:shapeExpr? min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } |
tripleExprRef | = | tripleExprLabel ; |
tripleExprLabel | = | IRIREF | BNODE ; |
The semantics of the matchesShape function are based on the matches function defined below.
Informally, evaluation of a typing for a shape requires finding a partition of the triples in the neighborhood that satisfies an EachOf comprised of the shape's expression and the expressions for each of the Shapes in its extends.
flattenTCs(|expr|) is a function which takes a ShapeExpression or TripleExpression and returns a set of TripleConstraints.
flattenTCs(|expr|) returns the combined set of flattenTCs(|sexpr2|) for each |sexpr2| in |expr|.shapeExprs.
flattenTCs(|expr|) returns flattenTCs(|expr|.shapeExprs).
flattenTCs(|expr|) returns Ø (the empty set).
flattenTCs(|expr|) returns flattenTCs(|refd|).
flattenTCs(|expr|) returns the combined set of flattenTCs(|texpr2|) for each |texpr2| in |expr|.expressions.
flattenTCs(|expr|) returns |expr|.
For a shape |s|, a list of TripleConstraint |tcs| is composed from |s|.expression and |s|.extends as follows:
flattenTCs(|e|) to |tcs|.
flattenTCs(|s|.expression) to |tcs|.
Compare with alternate definition in extends branch.
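The following Python sketch shows one way the flattening could be written over ShExJ objects. It is an illustration under stated assumptions: the mapping from ShExJ `type` values to the return rules is a guess (the rules above list only the return values), references are resolved through a hypothetical `by_id` dictionary, a list is returned instead of a set, and the abbreviated identifiers such as `ex:PersonShape` are placeholders.

```python
def flatten_tcs(expr, by_id):
    """Collect the TripleConstraints reachable from a ShExJ expression."""
    if isinstance(expr, str):
        # A shapeExprRef or tripleExprRef: flatten the referenced object.
        return flatten_tcs(by_id[expr], by_id)
    kind = expr["type"]
    if kind in ("ShapeAnd", "ShapeOr"):
        return [tc for sub in expr["shapeExprs"] for tc in flatten_tcs(sub, by_id)]
    if kind == "ShapeNot":
        return flatten_tcs(expr["shapeExpr"], by_id)
    if kind == "Shape":
        return shape_tcs(expr, by_id)
    if kind in ("EachOf", "OneOf"):
        return [tc for sub in expr["expressions"] for tc in flatten_tcs(sub, by_id)]
    if kind == "TripleConstraint":
        return [expr]
    return []  # NodeConstraint (assumed): contributes no TripleConstraints


def shape_tcs(shape, by_id):
    """Compose tcs from the shape's extends first, then its own expression."""
    tcs = []
    for parent in shape.get("extends", []):
        tcs.extend(flatten_tcs(parent, by_id))
    if "expression" in shape:
        tcs.extend(flatten_tcs(shape["expression"], by_id))
    return tcs


# Hypothetical two-shape schema used only for illustration.
person = {"id": "ex:PersonShape", "type": "Shape",
          "expression": {"type": "TripleConstraint", "predicate": "foaf:name"}}
employee = {"id": "ex:EmployeeShape", "type": "Shape", "extends": ["ex:PersonShape"],
            "expression": {"type": "EachOf", "expressions": [
                {"type": "TripleConstraint", "predicate": "ex:empID"},
                {"type": "TripleConstraint", "predicate": "ex:dept"}]}}
by_id = {"ex:PersonShape": person, "ex:EmployeeShape": employee}
predicates = [tc["predicate"] for tc in shape_tcs(employee, by_id)]
```

The sketch assumes the schema already meets the reference requirements, so it does not guard against reference cycles.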
For a node |n|, shape |S|, graph |G|, a ShExSchema |Sch|, typing |m| and optional neighborhood |ext|, let |nei| be the neighborhood |ext| if it is not empty, otherwise evaluated as neigh(|G|, |n|).
matchesShape(|n|, |S|, |G|, |Sch|, |m|, |ext|) if and only if:
satisfies(|n|, |S|.extends|iExt|, |G|, |Sch|, |m|, p|iExt|), and
matches(pi+j, |S|.expression, |m|).
If |tcs| is empty, |r| = |nei|.
|outs| = |r| ∩ arcsOut(|G|, |n|).
|matchables| = Ø (the empty set).
|matchables| ∪ |unmatchables| = |outs|.
matches: asserts that a triple expression is matched by a set of triples that come from the neighbourhood of a node in an RDF graph.
The expression matches(|T|, |expr|, |m|) indicates that a set of triples |T| can satisfy these rules:
|expr| has semActs and matches(|T|, |expr|, |m|) by the remaining rules in this list and the evaluation of semActs succeeds according to the section below on Semantic Actions.
{ "type": "OneOf", "expressions": [te1, te2, …], "min": 2, "max": 3, "semActs": [SemAct1, SemAct2, …] }
(te1 | te2 | …) {2,3}
%<SemAct1>% %<SemAct2>% …
{ "type": "OneOf", "expressions": [te1, te2, …], "min": 2, "max": 3 }
(te1 | te2 | …) {2,3}
|expr| has a cardinality of min and/or max not equal to 1, where a max of -1 is treated as unbounded, and |T| can be partitioned into |k| subsets T1, T2,…Tk such that min ≤ |k| ≤ max and for each Tn, matches(Tn, |expr|, |m|) by the remaining rules in this list.
{ "type": "OneOf", "expressions": [te1, te2, …], "min": 2, "max": 3 }
(te1 | te2 | …) {2,3}
{ "type": "OneOf", "expressions": [te1, te2, …] }
(te1 | te2 | …)
|expr| is a OneOf and there is some triple expression |se2| in expressions such that matches(|T|, |se2|, |m|).
{ "type": "OneOf", "expressions": [
{ "type": "EachOf", "expressions": [te3, te4, …] },
{ "type": "TripleConstraint", "min": 1, "max": -1,
"predicate": "http://xmlns.com/foaf/0.1/name" }
] }
(te3 ; te4 ; …)
| <http://xmlns.com/foaf/0.1/name> . +
{ "type": "EachOf", "expressions": [te3, te4, …] }
te3 ; te4 ; …
{ "type": "TripleConstraint", "min": 1, "max": -1, "predicate": "http://xmlns.com/foaf/0.1/name" }
<http://xmlns.com/foaf/0.1/name> . +
|expr| is an EachOf and there is some partition of |T| into T1, T2,… such that for every expression expr1, expr2,… in expressions, matches(Tn, exprn, |m|).
{ "type": "EachOf", "expressions": [ { "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/givenName" }, { "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/familyName" } ] }
<http://xmlns.com/foaf/0.1/givenName> . ; <http://xmlns.com/foaf/0.1/familyName> .
{ "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/givenName" }
<http://xmlns.com/foaf/0.1/givenName> .
{ "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/familyName" }
<http://xmlns.com/foaf/0.1/familyName> .
|expr| is a TripleConstraint and:
|expr| has no valueExpr
{ "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/givenName" }
<http://xmlns.com/foaf/0.1/givenName> . ;
http://xmlns.com/foaf/0.1/givenName
or satisfies(|value|, valueExpr, |G|, |Sch|, |m|, |ext|).
{ "type": "TripleConstraint", "inverse": true, "predicate": "http://purl.org/dc/elements/1.1/author", "valueExpr": "http://schema.example/#IssueShape" }
^<http://purl.org/dc/elements/1.1/author> @<http://schema.example/#IssueShape>
|expr| is a tripleExprRef and satisfies(|value|, tripleExprWithId(tripleExprRef), |G|, |Sch|, |m|, |ext|).
The tripleExprWithId function is defined in Triple Expression Reference Requirement below.
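The OneOf, EachOf and TripleConstraint rules above can be illustrated with a brute-force Python sketch. It is deliberately incomplete and rests on several assumptions: triples are plain `(subject, predicate, object)` tuples, valueExpr, inverse, semActs and tripleExprRef are ignored, cardinality is only supported on TripleConstraints, and the EachOf partition is found by trying every assignment of triples to sub-expressions (exponential, so only suitable for tiny inputs).

```python
from itertools import product

FOAF = "http://xmlns.com/foaf/0.1/"


def matches(triples, expr):
    """Return True if the triples match the ShExJ triple expression `expr`."""
    triples = list(triples)
    lo, hi = expr.get("min", 1), expr.get("max", 1)
    if expr["type"] == "TripleConstraint":
        # Every triple must carry the predicate, and the count must fit min/max.
        if any(p != expr["predicate"] for _, p, _ in triples):
            return False
        return lo <= len(triples) and (hi == -1 or len(triples) <= hi)
    if (lo, hi) != (1, 1):
        raise NotImplementedError("cardinality on groups is left out of this sketch")
    subs = expr["expressions"]
    if expr["type"] == "OneOf":
        # Some single alternative must match the whole set.
        return any(matches(triples, sub) for sub in subs)
    if expr["type"] == "EachOf":
        # Try every way of assigning each triple to exactly one sub-expression.
        for assignment in product(range(len(subs)), repeat=len(triples)):
            parts = [[t for t, i in zip(triples, assignment) if i == n]
                     for n in range(len(subs))]
            if all(matches(part, sub) for part, sub in zip(parts, subs)):
                return True
        return False
    raise ValueError("unsupported triple expression type: " + expr["type"])


# foaf:name | ( foaf:givenName+ ; foaf:familyName )
name_or_parts = {"type": "OneOf", "expressions": [
    {"type": "TripleConstraint", "predicate": FOAF + "name"},
    {"type": "EachOf", "expressions": [
        {"type": "TripleConstraint", "predicate": FOAF + "givenName", "min": 1, "max": -1},
        {"type": "TripleConstraint", "predicate": FOAF + "familyName"}]}]}

given1 = ("Alice", FOAF + "givenName", "Alice")
given2 = ("Alice", FOAF + "givenName", "Malsenior")
family = ("Alice", FOAF + "familyName", "Walker")
name = ("Alice", FOAF + "name", "Alice Malsenior Walker")
```

With these definitions, `matches([given1, given2, family], name_or_parts)` and `matches([name], name_or_parts)` both hold, while a mixture such as `[name, family]` does not, because no single alternative accounts for every triple.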
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#EmployeeShape", "type": "Shape", "expression": { "type": "EachOf", "expressions": [ "http://schema.example/#nameExpr", { "type": "TripleConstraint", "predicate": "http://schema.example/#empID", "valueExpr": { "type": "NodeConstraint", "datatype": "http://www.w3.org/2001/XMLSchema#integer" } } ] } }, { "id": "http://schema.example/#PersonShape", "type": "Shape", "expression": { "id": "http://schema.example/#nameExpr", "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/name" } } ] }
<http://schema.example/#EmployeeShape> { &<http://schema.example/#nameExpr> ; <http://schema.example/#empID> <http://www.w3.org/2001/XMLSchema#integer> } <http://schema.example/#PersonShape> { $<http://schema.example/#nameExpr> <http://xmlns.com/foaf/0.1/name> . ; }
"http://schema.example/#PersonShape"
http://schema.example/#PersonShape
The presence of imports requires that:
If any imported schema imports other schemas, shape and triple expression labels from those schemas are also in scope.
schema1: { "type": "Schema", "imports": ["http://schema.example/schema2"], "shapes": [ { "id": "http://schema.example/#EmployeeShape", "type": "Shape", "expression": { "type": "EachOf", "expressions": [ "http://schema.example/#nameExpr", { "type": "TripleConstraint", "predicate": "http://schema.example/#empID", "valueExpr": { "type": "NodeConstraint", "datatype": "http://www.w3.org/2001/XMLSchema#integer" } } ] } } ] } schema2: { "type": "Schema", "shapes": [ { "id": "http://schema.example/#PersonShape", "type": "Shape", "expression": { "id": "http://schema.example/#nameExpr", "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/name" } } ] }
schema1: IMPORT <http://schema.example/schema2> <http://schema.example/#EmployeeShape> { &<http://schema.example/#nameExpr> ; <http://schema.example/#empID> <http://www.w3.org/2001/XMLSchema#integer> } schema2: <http://schema.example/#PersonShape> { $<http://schema.example/#nameExpr> <http://xmlns.com/foaf/0.1/name> . ; }
Both the shape expression <PersonShape> and the triple expression <nameExpr> are in scope.
schema2's <nameExpr> is referenced in schema1's <EmployeeShape>.
Redundant imports are treated as a single import. This includes circular imports:
schema1: { "type": "Schema", "imports": ["http://schema.example/schema2", "http://schema.example/schema3"], "shapes": [ { "id": "http://schema.example/schema1#S1", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p1", "valueExpr": "http://schema.example/schema1#S2" } } ] } schema2: { "type": "Schema", "imports": ["http://schema.example/schema3"], "shapes": [ { "id": "http://schema.example/schema1#S2", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p2", "valueExpr": "http://schema.example/schema1#S3" } } ] } schema3: { "type": "Schema", "imports": ["http://schema.example/schema1"], "shapes": [ { "id": "http://schema.example/schema1#S3", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p3", "valueExpr": "http://schema.example/schema1#S1", "min": 0 } } ] }
schema1: IMPORT <http://schema.example/schema2> IMPORT <http://schema.example/schema3> <http://schema.example/schema1#S1> { <http://schema.example/#p1> @<http://schema.example/schema1#S2> } schema2: IMPORT <http://schema.example/schema3> <http://schema.example/schema1#S2> { <http://schema.example/#p2> @<http://schema.example/schema1#S3> } schema3: IMPORT <http://schema.example/schema1> <http://schema.example/schema1#S3> { <http://schema.example/#p3> @<http://schema.example/schema1#S1>? }
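A possible way to gather the shapes in scope while treating redundant and circular imports as a single import is a worklist with a visited set. The sketch below assumes the schemas are already loaded into a dictionary keyed by schema IRI; that `schemas` dictionary and the function name `shapes_in_scope` are hypothetical stand-ins for whatever retrieval mechanism an implementation uses.

```python
def shapes_in_scope(root_iri, schemas):
    """Collect the shape declarations of a schema and everything it imports.

    Each schema IRI is visited at most once, so redundant and circular
    imports terminate and contribute their shapes only once.
    """
    seen = set()
    pending = [root_iri]
    shapes = []
    while pending:
        iri = pending.pop()
        if iri in seen:
            continue  # already imported: treat as a single import
        seen.add(iri)
        schema = schemas[iri]
        shapes.extend(schema.get("shapes", []))
        pending.extend(schema.get("imports", []))
    return shapes


BASE = "http://schema.example/"
# The circular import structure from the example above, reduced to ids.
schemas = {
    BASE + "schema1": {"type": "Schema", "imports": [BASE + "schema2", BASE + "schema3"],
                       "shapes": [{"id": BASE + "schema1#S1", "type": "Shape"}]},
    BASE + "schema2": {"type": "Schema", "imports": [BASE + "schema3"],
                       "shapes": [{"id": BASE + "schema1#S2", "type": "Shape"}]},
    BASE + "schema3": {"type": "Schema", "imports": [BASE + "schema1"],
                       "shapes": [{"id": BASE + "schema1#S3", "type": "Shape"}]},
}
ids = sorted(s["id"] for s in shapes_in_scope(BASE + "schema1", schemas))
```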
When some schema |A| imports schema |B|, |B|'s start member is ignored.
schema1: { "type": "Schema", "imports": ["http://schema.example/schema2"], "shapes": [ { "id": "http://schema.example/schema1#S1", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p1", "valueExpr": "http://schema.example/schema1#S2" } } ] } schema2: { "type": "Schema", "start": "http://schema.example/schema1#S2", "shapes": [ { "id": "http://schema.example/schema1#S2", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p2" } } ] }
schema1: IMPORT <http://schema.example/schema2> <http://schema.example/schema1#S1> { <http://schema.example/#p1> @<http://schema.example/schema1#S2> } schema2: start=@<http://schema.example/schema1#S2> <http://schema.example/schema1#S2> { <http://schema.example/#p2> . }
schema1 has no start even though it imports a schema with a start.
It is an error if A and B share any labels for shape expressions or triple expressions or if schema B has a startActs member.
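The two error conditions in this rule can be checked mechanically. The Python sketch below is an assumption-laden illustration: it treats every `id` member found anywhere in a ShExJ schema as a shape expression or triple expression label, and the function names are hypothetical. It does not check whether references remain resolvable after the import.

```python
def labels(node):
    """Return every `id` value found in a (nested) ShExJ structure."""
    found = set()
    if isinstance(node, dict):
        if "id" in node:
            found.add(node["id"])
        for value in node.values():
            found |= labels(value)
    elif isinstance(node, list):
        for value in node:
            found |= labels(value)
    return found


def import_errors(importer, imported):
    """List the reasons why `importer` may not import `imported`."""
    errors = []
    shared = labels(importer) & labels(imported)
    if shared:
        errors.append("shared labels: " + ", ".join(sorted(shared)))
    if "startActs" in imported:
        errors.append("imported schema has startActs")
    return errors


# Reduced, hypothetical versions of two schemas that clash.
schema_a = {"type": "Schema", "shapes": [{"id": "http://schema.example/schema1#S1", "type": "Shape"}]}
schema_b = {"type": "Schema",
            "startActs": [{"type": "SemAct", "name": "http://schema.example/schema1#A1"}],
            "shapes": [{"id": "http://schema.example/schema1#S1", "type": "Shape"},
                       {"id": "http://schema.example/schema1#S2", "type": "Shape"}]}
problems = import_errors(schema_a, schema_b)
```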
schema1: { "type": "Schema", "imports": ["http://schema.example/schema2"], "shapes": [ { "id": "http://schema.example/schema1#S1", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p1", "valueExpr": "http://schema.example/schema1#S2" } } ] } schema2: { "type": "Schema", "startActs": [ { "type": "SemAct", "name": "http://schema.example/schema1#A1" } ], "shapes": [ { "id": "http://schema.example/schema1#S1", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p1", "valueExpr": "http://schema.example/schema1#S2" } }, { "id": "http://schema.example/schema1#S2", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p2", "valueExpr": "http://schema.example/schema1#S3" } } ] }
schema1: IMPORT <http://schema.example/schema2> <http://schema.example/schema1#S1> { <http://schema.example/#p1> @<http://schema.example/schema1#S2> } schema2: %<http://schema.example/schema1#A1>% <http://schema.example/schema1#S1> { <http://schema.example/#p1> @<http://schema.example/schema1#S2> } <http://schema.example/schema1#S2> { <http://schema.example/#p2> @<http://schema.example/schema1#S3> }
This import fails because:
<http://schema.example/schema1#S1> has conflicting definitions and
the imported schema2 has a startActs directive and
<http://schema.example/schema1#S3> is not resolvable after imports.
The semantics defined above assume three structural requirements beyond those imposed by the grammar of the abstract syntax. These ensure referential integrity and eliminate logical paradoxes such as those that arise through the use of negation. These are not constraints expressed by the schema but instead those imposed on the schema.
A graph |G| is said to conform with a schema |S| with a ShapeMap |m| when:
A shapeExprRef MUST appear in the schema's shapes map (or an imported schema's map) and the corresponding shape expression MUST be a Shape with a shapeExpr.
The function shapeExprWithId(|shapeExprRef|) returns the shape expression with an id of |shapeExprRef|.
Additionally, a shapeExprLabel cannot refer to itself through a shape reference either directly or recursively.
The shapeExprRef closure of a shape expression |se| is the set of shape expression labels used as references in |se|.
The shapeExprLabel |sl| belongs to shapeExprRefClosure(|se|) if and only if:
shapeExprRefClosure(shapeExprWithId(|sl2|)) for some shapeExprLabel |sl2| that belongs to shapeExprRefClosure(|se|).
A shapes schema MUST NOT define a shape label |sl| that belongs to the shapeExprRef closure of its definition shapeExprWithId(|sl|).
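The closure and the self-reference check can be sketched in Python as follows. This is an illustration only. It assumes that the closure follows references through ShapeAnd, ShapeOr and ShapeNot but does not descend into the triple expression of a Shape (so a reference used as a valueExpr does not count), and it assumes every referenced label is declared in the schema. The function names are hypothetical.

```python
def shape_ref_closure(se, decls):
    """Return the set of shape labels reachable from `se` by shape references."""
    closure = set()

    def visit(expr):
        if isinstance(expr, str):
            # A shapeExprRef: record it and continue into its definition.
            if expr not in closure:
                closure.add(expr)
                visit(decls[expr])
        elif expr["type"] in ("ShapeAnd", "ShapeOr"):
            for sub in expr["shapeExprs"]:
                visit(sub)
        elif expr["type"] == "ShapeNot":
            visit(expr["shapeExpr"])
        # Shape and NodeConstraint: not descended into (assumption of this sketch).

    visit(se)
    return closure


def self_referencing_labels(schema):
    """Return the labels that belong to the closure of their own definition."""
    decls = {decl["id"]: decl for decl in schema["shapes"]}
    return sorted(sl for sl, se in decls.items() if sl in shape_ref_closure(se, decls))


def tc(predicate):
    return {"type": "Shape", "expression": {"type": "TripleConstraint", "predicate": predicate}}


EX = "http://schema.example/#"
# PersonShape and EmployeeShape refer to each other through ShapeAnd.
recursive = {"type": "Schema", "shapes": [
    {"id": EX + "PersonShape", "type": "ShapeAnd",
     "shapeExprs": [EX + "EmployeeShape", tc("http://xmlns.com/foaf/0.1/name")]},
    {"id": EX + "EmployeeShape", "type": "ShapeAnd",
     "shapeExprs": [EX + "PersonShape", tc(EX + "employeeNumber")]}]}
# Only EmployeeShape refers to PersonShape.
acyclic = {"type": "Schema", "shapes": [
    dict(tc("http://xmlns.com/foaf/0.1/name"), id=EX + "PersonShape"),
    {"id": EX + "EmployeeShape", "type": "ShapeAnd",
     "shapeExprs": [EX + "PersonShape", tc(EX + "employeeNumber")]}]}
```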
Following are two valid shapeExprRefs:
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#PersonShape", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://xmlns.com/foaf/0.1/name" } }, { "id" : "http://schema.example/#EmployeeShape", "type" : "ShapeAnd", "shapeExprs" : [ "http://schema.example/#PersonShape", { "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#employeeNumber" } } ] } ] }
ex:PersonShape { foaf:name . } ex:EmployeeShape @ex:PersonShape AND { ex:employeeNumber . }
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#PersonShape", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://xmlns.com/foaf/0.1/name" } }, { "id" : "http://schema.example/#EmployeeShape", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#dependent", "valueExpr" : "http://schema.example/#PersonShape", "min" : 0, "max" : -1 } } ] }
ex:PersonShape { foaf:name . } ex:EmployeeShape { ex:dependent @ex:PersonShape * }
This shapeExprRef is invalid because there is no corresponding shape expression:
{ "type":"Schema", "shapes": [ { "id": "http://schema.example/#S1", "type":"Shape", "expression": "http://schema.example/#MissingShapeExpr" } ] }
ex:S1 { &ex:MissingShapeExpr }
This shapeExprRef is invalid because the referenced object is a triple expression instead of a shape expression:
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#CustomerShape", "type" : "Shape", "expression" : { "id" : "http://schema.example/#discountExpr", "type" : "TripleConstraint", "predicate" : "http://schema.example/#discount" } }, { "id" : "http://schema.example/#EmployeeShape", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#contactFor", "valueExpr" : "http://schema.example/#discountExpr" } } ] }
ex:CustomerShape { $ex:discountExpr ex:discount . } ex:EmployeeShape { ex:contactFor @ex:discountExpr }
These shapeExprRefs are invalid because they recursively refer to each other.
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#PersonShape", "type" : "ShapeAnd", "shapeExprs" : [ "http://schema.example/#EmployeeShape", { "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://xmlns.com/foaf/0.1/name" } } ] }, { "id" : "http://schema.example/#EmployeeShape", "type" : "ShapeAnd", "shapeExprs" : [ "http://schema.example/#PersonShape", { "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#employeeNumber" } } ] } ] }
ex:PersonShape @ex:EmployeeShape AND { foaf:name . } ex:EmployeeShape @ex:PersonShape AND { ex:employeeNumber . }
A tripleExprRef MUST identify a triple expression in the schema.
The function tripleExprWithId(|tripleExprRef|) returns the triple expression with the id |tripleExprRef|.
Additionally, a tripleExprLabel cannot refer to itself through a triple expression reference either directly or recursively.
The tripleExprRef closure of a triple expression |te| is the set of triple expression labels used as references in |te|.
The tripleExprLabel |tl| belongs to tripleExprRefClosure(|te|) if and only if:
tripleExprRefClosure(tripleExprWithId(|tl2|)) for some tripleExprLabel |tl2| that belongs to tripleExprRefClosure(|te|).
A shapes schema MUST NOT define a triple expression label |tl| that belongs to the tripleExprRef closure of its definition tripleExprWithId(|tl|).
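One way to realise tripleExprWithId is to index every labelled triple expression once and then look references up in that index. The Python sketch below makes assumptions that the text does not state: the walk covers shape declarations, nested shape expressions and the valueExpr of TripleConstraints, and an unresolved reference is reported by raising `KeyError`. The function names are hypothetical.

```python
def index_triple_exprs(schema):
    """Map each triple expression id in a ShExJ schema to its definition."""
    index = {}

    def walk_te(te):
        if isinstance(te, str):
            return  # a tripleExprRef carries no definition of its own
        if "id" in te:
            index[te["id"]] = te
        for sub in te.get("expressions", []):
            walk_te(sub)
        if "valueExpr" in te:
            walk_se(te["valueExpr"])

    def walk_se(se):
        if isinstance(se, str):
            return  # a shapeExprRef: its target is walked as a declaration
        if se["type"] == "Shape" and "expression" in se:
            walk_te(se["expression"])
        for sub in se.get("shapeExprs", []):
            walk_se(sub)
        if se["type"] == "ShapeNot":
            walk_se(se["shapeExpr"])

    for decl in schema["shapes"]:
        walk_se(decl)
    return index


def triple_expr_with_id(ref, index):
    """Return the triple expression labelled `ref`, or raise KeyError."""
    if ref not in index:
        raise KeyError("no triple expression with id " + ref)
    return index[ref]


EX = "http://schema.example/#"
schema = {"type": "Schema", "shapes": [
    {"id": EX + "PersonShape", "type": "Shape",
     "expression": {"id": EX + "nameExpr", "type": "TripleConstraint",
                    "predicate": "http://xmlns.com/foaf/0.1/name"}},
    {"id": EX + "EmployeeShape", "type": "Shape",
     "expression": {"type": "EachOf", "expressions": [
         EX + "nameExpr",
         {"type": "TripleConstraint", "predicate": EX + "employeeNumber"}]}}]}
index = index_triple_exprs(schema)
```

Because shape expression labels never enter the index, a reference such as `&ex:PersonShape` fails the lookup in the same way as a reference to a label that does not exist.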
Following is a valid triple expression reference:
{ "type":"Schema", "shapes": [ { "id": "http://schema.example/#PersonShape", "type":"Shape", "expression": { "id": "http://schema.example/#nameExpr", "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/name" } }, { "id": "http://schema.example/#EmployeeShape", "type":"Shape", "expression": { "type":"EachOf", "expressions": [ "http://schema.example/#nameExpr", { "type": "TripleConstraint", "predicate": "http://schema.example/#employeeNumber" } ] } } ] }
ex:PersonShape { $ex:nameExpr foaf:name . } ex:EmployeeShape { &ex:nameExpr ; ex:employeeNumber . }
This triple expression reference is invalid because there is no corresponding triple expression:
{ "type":"Schema", "shapes": [ { "id": "http://schema.example/#S1", "type":"Shape", "expression": "http://schema.example/#missingTripleExpr" } ] }
ex:S1 { &ex:missingTripleExpr }
This triple expression reference is invalid because the referenced object is a shape expression instead of a triple expression:
{ "type":"Schema", "shapes": [
{ "id": "http://schema.example/#CustomerShape",
"type":"ShapeAnd", "shapeExprs": [ … ]
},
{ "id": "http://schema.example/#PreferredCustomerShape",
"type":"Shape", "expression": { "type":"EachOf", "expressions": [
"http://schema.example/#CustomerShape",
{ "type": "TripleConstraint",
"predicate": "http://schema.example/#discount" }
] } } ] }
ex:CustomerShape { …; … } ex:PreferredCustomerShape { &ex:CustomerShape ; ex:discount . }
If any Shape |parent| appears in some Shape.extends or some shape traversed by flattenTCs, |parent| MUST NOT be an ExternalShape.
It is likely that this restriction will be relaxed in the future. If you have use cases, please add them to the GitHub issue #117.
Following is a valid example with a shape with an expression and two extends:
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#EntityShape", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#entityId" } }, { "id" : "http://schema.example/#PersonShape", "type" : "ExternalShape" }, { "id" : "http://schema.example/#EmployeeShape", "type" : "Shape", "extends" : [ "http://schema.example/#PersonShape", "http://schema.example/#EntityShape" ], "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#employeeNumber" } } ] }
ex:EntityShape { ex:entityId . } EXTERNAL ex:PersonShape ex:EmployeeShape EXTENDS @ex:PersonShape EXTENDS @ex:EntityShape { ex:employeeNumber . }
Every shapeExprRef |referer| MUST identify at least one non-abstract shape.
Following is a valid example with a shape with a shapeExprRef that references an abstract shape with two non-abstract descendants:
{"type" : "Schema", "shapes" : [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#approvedBy", "valueExpr": "http://schema.example/#EntityShape" } }, { "id" : "http://schema.example/#EntityShape", "type" : "Shape", "abstract": true, "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#entityId" } }, { "id" : "http://schema.example/#PersonShape", "type" : "Shape", "extends" : [ "http://schema.example/#EntityShape" ], "expression" : { "type" : "TripleConstraint", "predicate" : "http://xmlns.com/foaf/0.1/name" } }, { "id" : "http://schema.example/#EmployeeShape", "type" : "Shape", "extends" : [ "http://schema.example/#PersonShape" ], "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#employeeNumber" } } ] }
ex:IssueShape { ex:approvedBy @ex:EntityShape } ABSTRACT ex:EntityShape { ex:entityId . } ex:PersonShape EXTENDS @ex:EntityShape { foaf:name . } ex:EmployeeShape EXTENDS @ex:PersonShape { ex:employeeNumber . }
This shapeExprRef is invalid because it references only abstract descendants:
{"type" : "Schema", "shapes" : [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#approvedBy", "valueExpr": "http://schema.example/#EntityShape" } }, { "id" : "http://schema.example/#EntityShape", "type" : "Shape", "abstract": true, "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#entityId" } }, { "id" : "http://schema.example/#PersonShape", "type" : "Shape", "abstract": true, "extends" : [ "http://schema.example/#EntityShape" ], "expression" : { "type" : "TripleConstraint", "predicate" : "http://xmlns.com/foaf/0.1/name" } }, { "id" : "http://schema.example/#EmployeeShape", "type" : "Shape", "abstract": true, "extends" : [ "http://schema.example/#PersonShape" ], "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#employeeNumber" } } ] }
ex:IssueShape { ex:approvedBy @ex:EntityShape } ABSTRACT ex:EntityShape { ex:entityId . } ABSTRACT ex:PersonShape EXTENDS @ex:EntityShape { foaf:name . } ABSTRACT ex:EmployeeShape EXTENDS @ex:PersonShape { ex:employeeNumber . }
A schema MUST NOT contain any Shape that has a negated reference to itself, either directly or transitively. This is formalized by the requirement that the dependency graph of a schema MUST NOT have a cycle that traverses some negated reference.
The set of atomic shapes of a shapeExpr |se| contains a Shape |s| if |s| or its id appears either directly or by shapeExprRef in |se|.
That is, |s| belongs to atomicShapes(|se|) if and only if
atomicShapes(|se2|) for some shape expression |se2| such that the id of |se2| belongs to the shapeExprRefClosure of |se|.
The set of atomic TripleConstraints of a tripleExpr |te| includes every TripleConstraint |tc| that appears directly or by tripleExprRef in |te|.
That is, |tc| belongs to atomicTripleConstraints(|te|) if and only if:
tripleExprWithId(|tl|) for some tripleExprLabel |tl| that belongs to tripleExprRefClosure(|te|).
The Shape |s1| has a reference to the Shape |s2| if
atomicTripleConstraints(|s1|.expression), and
atomicShapes(|tc|.valueExpr).
The reference from |s1| to |s2| is a negated reference if
The dependency graph of the schema |Sch| is the graph whose vertices are all the Shapes that appear in some shape expression in the shapes of |Sch|, and that has two kinds of edges: negative and positive. There is a negative edge from |s1| to |s2| if |s1| has a negated reference to |s2|. There is a positive edge from |s1| to |s2| if |s1| has a reference but not a negated reference to |s2|.
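The dependency graph and the cycle test can be sketched in Python. The sketch is a simplification resting on several assumptions: only labelled Shapes are treated as vertices, anonymous Shapes nested in a valueExpr and tripleExprRefs are ignored, and a reference is taken to be negated when it sits under an odd number of ShapeNot or when the TripleConstraint's predicate is listed in the referring Shape's extra. A negative edge lies on a cycle exactly when its source is reachable from its target, which is what the final check tests.

```python
def negation_requirement_holds(schema):
    """Return False if some cycle of the dependency graph crosses a negative edge."""
    decls = {decl["id"]: decl for decl in schema["shapes"]}

    def triple_constraints(te):
        # Yield the inline TripleConstraints of a triple expression.
        if isinstance(te, dict):
            if te["type"] == "TripleConstraint":
                yield te
            for sub in te.get("expressions", []):
                yield from triple_constraints(sub)

    def targets(se, negated, seen):
        # Yield (shape id, negated?) for each labelled Shape reachable from se.
        if isinstance(se, str):
            if (se, negated) in seen:
                return
            seen.add((se, negated))
            if decls[se]["type"] == "Shape":
                yield se, negated
            else:
                yield from targets(decls[se], negated, seen)
        elif se["type"] == "ShapeNot":
            yield from targets(se["shapeExpr"], not negated, seen)
        elif se["type"] in ("ShapeAnd", "ShapeOr"):
            for sub in se["shapeExprs"]:
                yield from targets(sub, negated, seen)

    edges = set()  # (source id, target id, is_negative)
    for sid, decl in decls.items():
        if decl["type"] != "Shape":
            continue
        extra = set(decl.get("extra", []))
        for tc in triple_constraints(decl.get("expression")):
            if "valueExpr" in tc:
                for dst, neg in targets(tc["valueExpr"], False, set()):
                    edges.add((sid, dst, neg or tc["predicate"] in extra))

    successors = {}
    for src, dst, _ in edges:
        successors.setdefault(src, set()).add(dst)

    def reaches(start, goal):
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(successors.get(node, ()))
        return False

    return not any(neg and reaches(dst, src) for src, dst, neg in edges)


def shape(sid, predicate, value_expr=None):
    # Hypothetical helper: a Shape with a single TripleConstraint.
    tc = {"type": "TripleConstraint", "predicate": predicate}
    if value_expr is not None:
        tc["valueExpr"] = value_expr
    return {"id": sid, "type": "Shape", "expression": tc}


def NOT(se):
    return {"type": "ShapeNot", "shapeExpr": se}


S_AND = {"id": "ex:S", "type": "ShapeAnd", "shapeExprs": [NOT("ex:T"), "ex:U"]}
cyclic = {"shapes": [shape("ex:T", "ex:p", NOT("ex:S")), shape("ex:U", "ex:q", "ex:S"), S_AND]}
acyclic = {"shapes": [shape("ex:T", "ex:p", NOT("ex:S")), shape("ex:U", "ex:q"), S_AND]}
```

In `cyclic`, ex:T has a negative edge to ex:U and ex:U has a negative edge back to ex:T, so the check fails; in `acyclic`, ex:U has no outgoing edges and the check succeeds.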
This negated self-reference violates the negation requirement.
{ "type": "Schema", "shapes": [
{ "id": "http://schema.example/#S",
"type": "Shape",
"expression": { "type": "TripleConstraint",
"predicate": "http://schema.example/#p",
"valueExpr": { "type": "ShapeNot",
"shapeExpr": "http://schema.example/#S" } } }
] }
ex:S {ex:p NOT @ex:S}
This indirect self-reference passes through a negated reference (ex:US negates ex:UT, which refers back to ex:US), so it violates the negation requirement.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#US", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#Up", "valueExpr": { "type": "ShapeNot", "shapeExpr": "http://schema.example/#UT" } } }, { "id": "http://schema.example/#UT", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#Uq", "valueExpr": "http://schema.example/#US" } } ] }
ex:US {ex:Up NOT @ex:UT} ex:UT { ex:Uq @ex:US}
This negated, indirect self-reference violates the negation requirement.
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#S", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#p", "valueExpr" : { "type" : "ShapeNot", "shapeExpr" : "http://schema.example/#T" } } } , { "id" : "http://schema.example/#T", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#q", "valueExpr" : "http://schema.example/#S" } } ] }
ex:S { ex:p NOT @ex:T } ex:T { ex:q @ex:S }
This is a direct, negated self-reference of the shape with id ex:T and violates the negation requirement.
{"type" : "Schema", "shapes" : [ { "id" : "http://schema.example/#T", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#p", "valueExpr" : "http://schema.example/#S" } } , { "id" : "http://schema.example/#S", "type" : "ShapeAnd", "shapeExprs" : [ { "type" : "ShapeNot", "shapeExpr" : "http://schema.example/#T" }, "http://schema.example/#U" ] }, { "id" : "http://schema.example/#U", "type" : "Shape" } ] }
ex:T { ex:p @ex:S } ex:S (NOT @ex:T) AND @ex:U ex:U .
This doubly-negated self-reference of ex:T does not violate the negation requirement.
{"type" : "Schema", "shapes" : [{ "id" : "http://schema.example/#T", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#p", "valueExpr" : "http://schema.example/#S" } } , { "id" : "http://schema.example/#S", "type" : "ShapeNot", "shapeExpr" : { "type" : "ShapeAnd", "shapeExprs" : [ { "type" : "ShapeNot", "shapeExpr" : "http://schema.example/#T" }, "http://schema.example/#U" ] } }, { "id" : "http://schema.example/#U", "type" : "Shape" } ] }
ex:T { ex:p @ex:S } ex:S NOT ( (NOT @ex:T) AND @ex:U ) ex:U .
There is a cycle of negated references between the shape that defines ex:T and the shape that defines ex:U, so the negation requirement is violated.
{"type" : "Schema", "shapes" : [{ "id" : "http://schema.example/#T", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#p", "valueExpr" : { "type" : "ShapeNot", "shapeExpr" : "http://schema.example/#S" } } } , { "id" : "http://schema.example/#U", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#q", "valueExpr" : "http://schema.example/#S" } }, { "id" : "http://schema.example/#S", "type" : "ShapeAnd", "shapeExprs" : [ { "type" : "ShapeNot", "shapeExpr" : "http://schema.example/#T" }, "http://schema.example/#U" ] } ] }
ex:T { ex:p NOT @ex:S } ex:U { ex:q @ex:S } ex:S (NOT @ex:T) AND @ex:U
This satisfies the negation requirement, as ex:U does not refer to ex:T (compared to the previous example).
{"type" : "Schema", "shapes" : [{ "id" : "http://schema.example/#T", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#p", "valueExpr" : { "type" : "ShapeNot", "shapeExpr" : "http://schema.example/#S" } } } , { "id" : "http://schema.example/#U", "type" : "Shape", "expression" : { "type" : "TripleConstraint", "predicate" : "http://schema.example/#q" } }, { "id" : "http://schema.example/#S", "type" : "ShapeAnd", "shapeExprs" : [ { "type" : "ShapeNot", "shapeExpr" : "http://schema.example/#T" }, "http://schema.example/#U" ] } ] }
ex:T { ex:p NOT @ex:S } ex:U { ex:q . } ex:S (NOT @ex:T) AND @ex:U
This self-reference on a predicate designated as extra violates the negation requirement:
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#S", "type": "Shape", "extra": [ "http://schema.example/#p" ], "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p", "valueExpr": "http://schema.example/#S" } } ] }
ex:S EXTRA ex:p { ex:p @ex:S }
The same shape with a negated self-reference still violates the negation requirement because the reference occurs with a ShapeNot:
{ "type": "Schema", "shapes": [
{ "id": "http://schema.example/#S",
"type": "Shape",
"extra": [ "http://schema.example/#p" ],
"expression": {
"type": "TripleConstraint",
"predicate": "http://schema.example/#p",
"valueExpr": {
"type": "ShapeNot", "shapeExpr": "http://schema.example/#S"
} } } ] }
ex:S EXTRA ex:p { ex:p NOT @ex:S }
Semantic actions serve as an extension point for Shape Expressions. They appear in lists in Schema's startActs and Shape, OneOf, EachOf and TripleConstraint's semActs.
A semantic action is a tuple of an identifier and some optional code:
The evaluation semActsSatisfied on a list of SemActs returns success or failure. The evaluation of an individual SemAct is implementation-dependent.
A practical evaluation of a SemAct will provide access to some context.
For instance, the http://shex.io/extensions/Test/ extension requires access to the subject, predicate and object of a triple matching a TripleConstraint.
These are used in a print function.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#S1", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#p1", "min": 1, "max": -1, "semActs": [ { "type": "SemAct", "code": " print(s) ", "name": "http://shex.io/extensions/Test/" }, { "type": "SemAct", "code": " print(o) ", "name": "http://shex.io/extensions/Test/" } ] } } ] }
ex:S1 { ex:p1 .+ %Test:{ print(s) %} %Test:{ print(o) %} }
<http://a.example/n1> <http://a.example/p1> <http://a.example/o1> . <http://a.example/n2> <http://a.example/p1> "a", "b" . <http://a.example/n3> <http://a.example/p2> <http://a.example/o2> .
node | shape | result | print arguments |
---|---|---|---|
<n1> | <S1> | pass | http://a.example/n1 http://a.example/o1 |
<n2> | <S1> | pass | http://a.example/n2 "a" http://a.example/n2 "b" |
<n3> | <S1> | fail | |
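Since the evaluation of a SemAct is implementation-dependent, the following Python sketch is only a toy interpreter and not the behaviour of any real extension. It assumes that the context is the subject, predicate and object of the matched triple, that the only supported code is `print(s)`, `print(p)` or `print(o)`, that printed values are collected in a list, and that SemActs with other names are skipped.

```python
import re

TEST_EXT = "http://shex.io/extensions/Test/"
PRINT_CALL = re.compile(r"^\s*print\((s|p|o)\)\s*$")


def sem_acts_satisfied(sem_acts, triple, printed):
    """Evaluate a list of SemActs against one matched triple.

    Returns True on success and False on failure; printed values are
    appended to the `printed` list in evaluation order.
    """
    subject, predicate, obj = triple
    context = {"s": subject, "p": predicate, "o": obj}
    for act in sem_acts:
        if act["name"] != TEST_EXT:
            continue  # unknown extension: skipped in this sketch
        call = PRINT_CALL.match(act.get("code", ""))
        if call is None:
            return False  # code this toy interpreter cannot evaluate
        printed.append(context[call.group(1)])
    return True


sem_acts = [
    {"type": "SemAct", "code": " print(s) ", "name": TEST_EXT},
    {"type": "SemAct", "code": " print(o) ", "name": TEST_EXT},
]
printed = []
ok = sem_acts_satisfied(
    sem_acts,
    ("http://a.example/n1", "http://a.example/p1", "http://a.example/o1"),
    printed,
)
```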
Annotations provide a format-independent way to provide additional information about elements in a schema. They appear in lists in Shape, OneOf, EachOf and TripleConstraint's annotations.
Annotation | { | predicate:IRIREF object:objectValue } |
---|---|---|
Annotations do not affect whether a node conforms to some shape. Because they are part of the structure of the schema, they can be parsed in one ShEx format and emitted in that format or another.
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#status", "annotations": [ { "type": "Annotation", "predicate": "http://www.w3.org/2000/01/rdf-schema#comment", "object": {"value": "Represents reported software issues."} }, { "type": "Annotation", "predicate": "http://www.w3.org/2000/01/rdf-schema#label", "object": {"value": "software issue"} } ] } } ] }
ex:IssueShape { ex:status . // rdfs:comment "Represents reported software issues." // rdfs:label "software issue" }
The following examples demonstrate proofs for validations in the form of a nested list of invocations of the evaluation functions defined above.
S1 nc1
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IntConstraint", "type": "NodeConstraint", "datatype": "http://www.w3.org/2001/XMLSchema#integer" } ] }
ex:IntConstraint xsd:integer
Here the shape identified by http://schema.example/#IntConstraint is a shape expression consisting of a single NodeConstraint.
Per Shape Expression Semantics, "30"^^<http://www.w3.org/2001/XMLSchema#integer> satisfies IntConstraint.
This document uses this nested tree convention to indicate the dependency of an evaluation on those nested inside it. Nesting is expressed as indentation. Here, the evaluation of satisfies NodeConstraint ("30"^^xsd:integer, S1, G, m) depends on satisfies2 NodeConstraint ("30"^^xsd:integer, S1).
Validating a shape requires evaluating its triple expression as well as the variables and functions neigh(|G|, |n|), |matched|, |remainder|, |outs|, |matchables| and |unmatchables|:
S1 tc1
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#UserShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#shoeSize" } } ] }
ex:UserShape { ex:shoeSize . }
t1
BASE <http://a.example/> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> <Alice> ex:shoeSize "30"^^xsd:integer .
It is quite common that Shapes will constrain their nested TripleConstraints with NodeConstraints. Here is an example including that, extra triples and a closed shape:
S1 tc1 nc1
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#UserShape", "type": "Shape", "extra": ["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"], "expression": { "type": "TripleConstraint", "predicate": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "valueExpr": { "type": "NodeConstraint", "values": ["http://schema.example/#Teacher"] } } } ] }
ex:UserShape EXTRA a { a [ex:Teacher] }
t1 t2 t3 t4 t5
BASE <http://a.example/> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> <Alice> ex:shoeSize "30"^^xsd:integer . <Alice> a ex:Teacher . <Alice> a ex:Person . <SomeHat> ex:owner <Alice> . <TheMoon> ex:madeOf <GreenCheese> .
The non-empty matchables is permitted because the triple t3 has a predicate which appears in the "extra" list: ["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"].
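The partition of a node's neighbourhood in this example can be sketched as follows. This is a simplified reading that assumes a shape with exactly one TripleConstraint of default cardinality; the function `check_shape`, the tuple representation of triples, and the string used for the literal object are hypothetical choices made for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://schema.example/#"
A = "http://a.example/"

# Triples t1..t5 from the example, as (subject, predicate, object) tuples.
GRAPH = [
    (A + "Alice", EX + "shoeSize", '"30"^^xsd:integer'),  # t1
    (A + "Alice", RDF_TYPE, EX + "Teacher"),              # t2
    (A + "Alice", RDF_TYPE, EX + "Person"),               # t3
    (A + "SomeHat", EX + "owner", A + "Alice"),           # t4
    (A + "TheMoon", EX + "madeOf", A + "GreenCheese"),    # t5
]

def check_shape(graph, node, predicate, values, extra=(), closed=False):
    """Partition the neighbourhood of node for a shape with one TripleConstraint."""
    # neigh: every triple with node as subject or object.
    neigh = [t for t in graph if node in (t[0], t[2])]
    # matched: outgoing triples that satisfy the TripleConstraint.
    matched = [t for t in neigh
               if t[0] == node and t[1] == predicate and t[2] in values]
    remainder = [t for t in neigh if t not in matched]
    # outs: the outgoing arcs left in the remainder.
    outs = [t for t in remainder if t[0] == node]
    # matchables share a predicate with the triple expression; the rest do not.
    matchables = [t for t in outs if t[1] == predicate]
    unmatchables = [t for t in outs if t[1] != predicate]
    ok = (len(matched) == 1  # default cardinality: exactly one
          and all(t[1] in extra for t in matchables)
          and not (closed and unmatchables))
    return {"ok": ok, "matched": matched,
            "matchables": matchables, "unmatchables": unmatchables}

result = check_shape(GRAPH, A + "Alice", RDF_TYPE, [EX + "Teacher"],
                     extra=[RDF_TYPE])
```

In this sketch t2 is matched, t3 is the single matchable and t1 the single unmatchable. Dropping rdf:type from `extra`, or passing `closed=True`, makes the check fail.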
S1 te1 tc1 nc1 te2 tc2 nc2 tc3 nc3
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#UserShape", "type": "Shape", "expression": {"type": "OneOf", "expressions": [ { "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/name", "valueExpr": { "type": "NodeConstraint", "nodeKind": "literal" } }, { "type": "EachOf", "expressions": [ { "type": "TripleConstraint", "min": 1, "max": -1 , "predicate": "http://xmlns.com/foaf/0.1/givenName", "valueExpr": { "type": "NodeConstraint", "nodeKind": "literal" } }, { "type": "TripleConstraint", "predicate": "http://xmlns.com/foaf/0.1/familyName", "valueExpr": { "type": "NodeConstraint", "nodeKind": "literal" } } ] } ] } } ] }
ex:UserShape { ( # extra ()s to clarify alignment with ShExJ foaf:name LITERAL | ( # extra ()s to clarify alignment with ShExJ foaf:givenName LITERAL+ ; foaf:familyName LITERAL ) ) }
t1 t2 t3 t4 t5 t6
BASE <http://a.example/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> <Alice> foaf:givenName "Alice" . <Alice> foaf:givenName "Malsenior" . <Alice> foaf:familyName "Walker" . <Alice> foaf:mbox <mailto:alice@example.com> . <Bob> foaf:knows <Alice> . <Bob> foaf:mbox <mailto:bob@example.com> .
Per Shape Expression Semantics, <Alice> satisfies S1 with the simple ShapeMap
m:
{ "http://a.example/Alice": "http://a.example/UserShape" }
as seen in this validation.
Replacing triples 1-3 with a single foaf:name property will also satisfy the schema.
t4 t5 t6 t7
BASE <http://a.example/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> <Alice> foaf:mbox <mailto:alice@example.com> . <Bob> foaf:knows <Alice> . <Bob> foaf:mbox <mailto:bob@example.com> . <Alice> foaf:name "Alice Malsenior Walker" .
Any mixture of foaf:name with foaf:givenName or foaf:familyName will fail to satisfy the schema as there will be a matchable triple t3 that is not used in the triple expression te1.
t3 t4 t5 t6 t7
BASE <http://a.example/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
<Alice> foaf:familyName "Walker" .
<Alice> foaf:mbox <mailto:alice@example.com> .
<Bob> foaf:knows <Alice> .
<Bob> foaf:mbox <mailto:bob@example.com> .
<Alice> foaf:name "Alice Malsenior Walker" .
Adding foaf:familyName to S1's extra would allow this graph to satisfy the schema.
S1
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#UserShape", "type": "Shape", "extra": ["http://xmlns.com/foaf/0.1/familyName"] … } ] }
Closing S1 would also cause a validation failure if |unmatchables| were not empty:
S1
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#UserShape", "type": "Shape", "closed": true … } ] }
S1 tc1 nc1 S2 tc2 nc2
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#reproducedBy", "valueExpr": "http://schema.example/#TesterShape" } }, { "id": "http://schema.example/#TesterShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#role", "valueExpr": { "type": "NodeConstraint", "values": [ "http://schema.example/#testingRole" ] } } } ] }
ex:IssueShape { ex:reproducedBy @ex:TesterShape } ex:TesterShape { ex:role [ex:testingRole] }
t1 t2
PREFIX ex: <http://schema.example/#> PREFIX inst: <http://inst.example/> inst:Issue1 ex:reproducedBy inst:Tester2 . inst:Tester2 ex:role ex:testingRole .
inst:Issue1 satisfies S1 with the ShapeMap
m:
{ "http://inst.example/Issue1": "http://schema.example/#IssueShape", "http://inst.example/Tester2": "http://schema.example/#TesterShape" }
as seen in this evaluation:
S1 tc1 nc1
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "TripleConstraint", "min": 0, "max": -1, "predicate": "http://schema.example/#related", "valueExpr": "http://schema.example/#IssueShape" } } ] }
ex:IssueShape { ex:related @ex:IssueShape* }
t1 t2 t3
PREFIX ex: <http://schema.example/#> PREFIX inst: <http://inst.example/> inst:Issue1 ex:related inst:Issue2 . inst:Issue2 ex:related inst:Issue3 . inst:Issue3 ex:related inst:Issue1 .
inst:Issue1 satisfies S1 with the ShapeMap
m:
{ "http://inst.example/Issue1": "http://schema.example/#IssueShape", "http://inst.example/Issue2": "http://schema.example/#IssueShape", "http://inst.example/Issue3": "http://schema.example/#IssueShape" }
as seen in this evaluation:
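The circular references in this example are the case that a recursion check has to handle. The sketch below is one possible strategy, not the algorithm defined by this specification: it assumes that a node already under evaluation further up the call stack can be treated as provisionally conformant. The function name and the `in_progress` parameter are hypothetical.

```python
EX = "http://schema.example/#"
INST = "http://inst.example/"

# Triples t1..t3 from the example.
GRAPH = [
    (INST + "Issue1", EX + "related", INST + "Issue2"),
    (INST + "Issue2", EX + "related", INST + "Issue3"),
    (INST + "Issue3", EX + "related", INST + "Issue1"),
]

def satisfies_issue_shape(node, graph, in_progress=frozenset()):
    """Check node against  ex:IssueShape { ex:related @ex:IssueShape* }."""
    if node in in_progress:
        # Recursion check: this node is already being evaluated by a caller,
        # so do not descend into it again.
        return True
    in_progress = in_progress | {node}
    for subject, predicate, obj in graph:
        if subject == node and predicate == EX + "related":
            # Each ex:related value must itself conform to ex:IssueShape.
            if not satisfies_issue_shape(obj, graph, in_progress):
                return False
    return True

results = {n: satisfies_issue_shape(INST + n, GRAPH)
           for n in ("Issue1", "Issue2", "Issue3")}
```

Without the `in_progress` test the call for inst:Issue1 would revisit itself through inst:Issue3 and never terminate.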
S1 te1 tc1 nc1 tc2 nc2
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#TestResultsShape", "type": "Shape", "expression": { "type": "EachOf", "expressions": [ { "type": "TripleConstraint", "min": 1, "max": -1, "predicate": "http://schema.example/#val", "valueExpr": { "type": "NodeConstraint", "values": [ {"value": "a"}, {"value": "b"}, {"value": "c"} ] } }, { "type": "TripleConstraint", "min": 1, "max": -1, "predicate": "http://schema.example/#val", "valueExpr": { "type": "NodeConstraint", "values": [ {"value": "b"}, {"value": "c"}, {"value": "d"} ] } } ] } } ] }
<http://schema.example/#TestResultsShape> { <http://schema.example/#val> ["a" "b" "c"]+ ; <http://schema.example/#val> ["b" "c" "d"]+ }
t1 t2 t3 t4
BASE <http://a.example/> PREFIX ex: <http://schema.example/#> <s> ex:val "a" . <s> ex:val "b" . <s> ex:val "c" . <s> ex:val "d" .
<s> satisfies S1 with:
m:
{ "http://a.example/s": "http://a.example/S1" }
If tc1 consumes as many triples as it can, it consumes three and tc2 consumes one:
If we eliminate t4, either t2 or t3 must be allocated to tc2:
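The allocations discussed above can be enumerated by brute force. This sketch assumes the two value sets from the schema and represents each triple only by its object value; the name `allocations` is hypothetical and the exhaustive search is chosen for clarity, not efficiency.

```python
from itertools import product

TC1_VALUES = {"a", "b", "c"}  # value set of tc1
TC2_VALUES = {"b", "c", "d"}  # value set of tc2

def allocations(values):
    """Return every split of values between tc1 and tc2 that satisfies both.

    Both constraints have cardinality "+", so each side needs at least one
    triple and every triple must fit the value set of the side it is given to.
    """
    found = []
    for choice in product((1, 2), repeat=len(values)):
        to_tc1 = [v for v, c in zip(values, choice) if c == 1]
        to_tc2 = [v for v, c in zip(values, choice) if c == 2]
        if (to_tc1 and to_tc2
                and all(v in TC1_VALUES for v in to_tc1)
                and all(v in TC2_VALUES for v in to_tc2)):
            found.append((to_tc1, to_tc2))
    return found

with_t4 = allocations(["a", "b", "c", "d"])
without_t4 = allocations(["a", "b", "c"])
```

With all four triples, one of the valid splits gives tc1 the values a, b and c and tc2 the value d. Without t4, every valid split hands at least one of b or c to tc2.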
S1 te1 tc1 nc1 tc2 nc2 S2 tc3 nc3 S3 tc4 nc4
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#IssueShape", "type": "Shape", "expression": { "type": "EachOf", "expressions": [ { "type": "TripleConstraint", "predicate": "http://schema.example/#reproducedBy", "valueExpr": "http://schema.example/#TesterShape" }, { "type": "TripleConstraint", "predicate": "http://schema.example/#reproducedBy", "valueExpr": "http://schema.example/#ProgrammerShape" } ] } }, { "id": "http://schema.example/#TesterShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#role", "valueExpr": { "type": "NodeConstraint", "values": [ "http://schema.example/#testingRole" ] } } }, { "id": "http://schema.example/#ProgrammerShape", "type": "Shape", "expression": { "type": "TripleConstraint", "predicate": "http://schema.example/#department", "valueExpr": { "type": "NodeConstraint", "values": [ "http://schema.example/#ProgrammingDepartment" ] } } } ] }
ex:IssueShape { ex:reproducedBy @ex:TesterShape; ex:reproducedBy @ex:ProgrammerShape } ex:TesterShape { ex:role [ex:testingRole] } ex:ProgrammerShape { ex:department [ex:ProgrammingDepartment] }
t1 t2 t3 t4 t5
PREFIX ex: <http://schema.example/#> PREFIX inst: <http://inst.example/> inst:Issue1 ex:reproducedBy inst:Tester2 ; ex:reproducedBy inst:Testgrammer23 . inst:Tester2 ex:role ex:testingRole . inst:Testgrammer23 ex:role ex:testingRole ; ex:department ex:ProgrammingDepartment .
inst:Issue1 satisfies S1 with the ShapeMap
m:
{ "http://inst.example/Issue1": "http://schema.example/#IssueShape", "http://inst.example/Tester2": "http://schema.example/#TesterShape", "http://inst.example/Testgrammer23": "http://schema.example/#ProgrammerShape" }
as seen in this evaluation:
Setting the maximum cardinality of a TripleConstraint with predicate |p| to zero (i.e. "max": 0 in ShExJ or {0} or {0, 0} in ShExC) asserts that matching nodes must have no triples with predicate |p|.
S1 te1 tc1 nc1 tc2
{ "type": "Schema", "shapes": [ { "id": "http://schema.example/#TestResultsShape", "type": "Shape", "expression": { "type": "EachOf", "expressions": [ { "type": "TripleConstraint", "min": 1, "max": -1, "predicate": "http://schema.example/#p1", "valueExpr": { "type": "NodeConstraint", "values": [ {"value": "a"}, {"value": "b"} ] } }, { "type": "TripleConstraint", "predicate": "http://schema.example/#p2", "min": 0, "max": 0 } ] } } ] }
<http://schema.example/#TestResultsShape> { <http://schema.example/#p1> ["a" "b"] + ; <http://schema.example/#p2> . {0} }
t1
BASE <http://a.example/> PREFIX ex: <http://schema.example/#> <s> ex:p1 "a" .
<s> satisfies S1 with:
m:
{ "http://a.example/s": "http://a.example/S1" }
This is trivially satisfied by tc1 consuming one triple and tc2 consuming none:
If we add a t2 which matches tc2:
t1 t2
BASE <http://a.example/> PREFIX ex: <http://schema.example/#> <s> ex:p1 "a" . <s> ex:p2 5 .
every partition fails, either because matchables is non-empty or because the maximum cardinality on tc2 is exceeded:
The ShEx Compact Syntax expresses ShEx schemas in a compact, human-friendly form.
Parsing ShExC transforms a ShExC document into an equivalent ShExJ structure.
This is defined as a BNF which accepts ShExC followed by instructions for translating the rules in the BNF production into their corresponding ShExJ objects.
For example, "shapeExprDecl returns shapeExpression" indicates that the result of matching the shapeExprDecl production is the object produced by parsing the shapeExpression production.
Semantic actions before the first shape expression declaration are startActs. After the first shape expression declaration, semantic actions are associated with the previous declaration.
Below is the ShExC grammar following the notation in the XML specification[[!XML]]: | |||
[1] | shexDoc |
::= | directive* ((notStartAction | startActions) statement*)? |
followed by the associated ShExJ object(s): | |||
Schema | { | startActs:[SemAct+]? start:shapeExpr? imports:[IRIREF+]? shapes:[shapeExpr+]? } | |
and a description of the mapping of rules in the production to elements of the ShExJ object: | |||
| |||
[2] | directive |
::= | baseDecl | prefixDecl | importDecl |
[3] | baseDecl |
::= | "BASE" IRIREF |
[4] | prefixDecl |
::= | "PREFIX" PNAME_NS IRIREF |
[4½] | importDecl |
::= | "IMPORT" IRIREF |
"IMPORT " is described in ShEx Import. | |||
[5] | notStartAction |
::= | start | shapeExprDecl |
[6] | start |
::= | "start" '=' inlineShapeExpression |
[7] | startActions |
::= | codeDecl+ |
[8] | statement |
::= | directive | notStartAction |
[9] | shapeExprDecl |
::= | shapeExprLabel (shapeExpression | "EXTERNAL") |
If the "EXTERNAL " keyword is present, shapeExprDecl returns a ShapeExternal object: | |||
ShapeExternal | { | id:shapeExprLabel? } | |
otherwise shapeExprDecl returns shapeExpression. | |||
Shape expressions are logical combinations of shape atoms. Inline variants of shape expressions are used in tripleConstraints and are not permitted to have annotations or semantic actions. | |||
[10] | shapeExpression |
::= | shapeOr |
[11] | inlineShapeExpression |
::= | inlineShapeOr |
[12] | shapeOr |
::= | shapeAnd ("OR" shapeAnd)* |
[13] | inlineShapeOr |
::= | inlineShapeAnd ("OR" inlineShapeAnd)* |
If the right shapeAnd matches one or more times, the result is a ShapeOr object with shapeExprs containing the first shapeAnd followed by the ordered list from the second shapeAnd: | |||
ShapeOr | { | id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] } | |
otherwise the result is the left shapeAnd. | |||
[14] | shapeAnd |
::= | shapeNot ("AND" shapeNot)* |
[15] | inlineShapeAnd |
::= | inlineShapeNot ("AND" inlineShapeNot)* |
If the right shapeNot matches one or more times, the result is a ShapeAnd object with shapeExprs containing the first shapeNot followed by the ordered list from the second shapeNot: | |||
ShapeAnd | { | id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] } | |
otherwise the result is the left shapeNot. | |||
[16] | shapeNot |
::= | "NOT"? shapeAtom |
[17] | inlineShapeNot |
::= | "NOT"? inlineShapeAtom |
If the left "NOT " matches, the result is a ShapeNot object with shapeExpr containing the shapeAtom: | |||
ShapeNot | { | id:shapeExprLabel? shapeExpr:shapeExpr } | |
otherwise the result is the shapeAtom. | |||
Shape atoms are shape references (indicated by " | |||
[18] | shapeAtom |
::= | nonLitNodeConstraint shapeOrRef? |
[19] | shapeAtomNoRef |
::= | nonLitNodeConstraint shapeOrRef? |
[20] | inlineShapeAtom |
::= | nonLitNodeConstraint inlineShapeOrRef? |
| |||
[21] | shapeOrRef |
::= | shapeDefinition | shapeRef
|
[22] | inlineShapeOrRef |
::= | inlineShapeDefinition | shapeRef
|
[23] | shapeRef |
::= | ATPNAME_LN | ATPNAME_NS | '@' shapeExprLabel |
| |||
shapeExprRef | = | shapeExprLabel ; | |
Node constraints identify a (possibly infinite) set of matching RDF nodes. | |||
[24] | litNodeConstraint |
::= | "LITERAL" xsFacet* |
[25] | nonLitNodeConstraint |
::= | nonLiteralKind stringFacet* |
NodeConstraint | { | id:shapeExprLabel? nodeKind:("iri" | "bnode" | "nonliteral" | "literal")? datatype:IRIREF? xsFacet* values:[valueSetValue+]? } | |
[26] | nonLiteralKind |
::= | "IRI" | "BNODE" | "NONLITERAL" |
[27] | xsFacet |
::= | stringFacet | numericFacet |
xsFacet | = | stringFacet | numericFacet ; | |
[28] | stringFacet |
::= | stringLength INTEGER |
[29] | stringLength |
::= | "LENGTH" | "MINLENGTH" | "MAXLENGTH" |
stringFacet | = | (length|minlength|maxlength):INTEGER | pattern:STRING flags:STRING? ; | |
[30] | numericFacet |
::= | numericRange numericLiteral |
[31] | numericRange |
::= | "MININCLUSIVE" | "MINEXCLUSIVE" | "MAXINCLUSIVE" | "MAXEXCLUSIVE" |
[32] | numericLength |
::= | "TOTALDIGITS" | "FRACTIONDIGITS" |
numericFacet | = | (mininclusive|minexclusive|maxinclusive|maxexclusive):numericLiteral | (totaldigits|fractiondigits):INTEGER ; | |
Shape definitions associate a triple expression with a closed flag and a list of partially constrained (extra) predicates. Any predicate appearing in a triple expression is fully constrained unless it appears in the list of extras. | |||
[33] | shapeDefinition |
::= | (extraPropertySet | "CLOSED")* '{' tripleExpression? '}' annotation* semanticActions |
[34] | inlineShapeDefinition |
::= | (extraPropertySet | "CLOSED")* '{' tripleExpression? '}' |
Shape | { | id:shapeExprLabel? closed:BOOL? extra:[IRIREF+]? expression:tripleExpr? semActs:[SemAct+]? annotations:[Annotation+]? } | |
| |||
[35] | extraPropertySet |
::= | "EXTRA" predicate+ |
Triple expressions are arrangements of triple constraints. | |||
[36] | tripleExpression |
::= | oneOfTripleExpr |
[37] | oneOfTripleExpr |
::= | groupTripleExpr | multiElementOneOf |
[38] | multiElementOneOf |
::= | groupTripleExpr ('|' groupTripleExpr)+ |
If the right groupTripleExpr matches one or more times, the result is a OneOf object with expressions containing the first groupTripleExpr followed by the ordered list from the second groupTripleExpr: | |||
OneOf | { | id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } | |
otherwise the result is the left groupTripleExpr. | |||
[40] | groupTripleExpr |
::= | singleElementGroup | multiElementGroup |
[41] | singleElementGroup |
::= | unaryTripleExpr ';'? |
[42] | multiElementGroup |
::= | unaryTripleExpr (';' unaryTripleExpr)+ ';'? |
If the right unaryTripleExpr matches one or more times, the result is an EachOf object with expressions containing the first unaryTripleExpr followed by the ordered list from the second unaryTripleExpr: | |||
EachOf | { | id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } | |
otherwise the result is the left unaryTripleExpr. | |||
[43] | unaryTripleExpr |
::= | ('$' tripleExprLabel)? (tripleConstraint | bracketedTripleExpr) |
[44] | bracketedTripleExpr |
::= | '(' tripleExpression ')' cardinality? annotation* semanticActions |
Triple constraints are matched against RDF triples. | |||
[45] | tripleConstraint |
::= | senseFlags? predicate inlineShapeExpression cardinality? annotation* semanticActions |
TripleConstraint | { | id:tripleExprLabel? inverse:BOOL? predicate:IRIREF valueExpr:shapeExpr? min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } | |
| |||
[46] | cardinality |
::= | '*' | '+' | '?' | REPEAT_RANGE |
In ShExJ, "*" is represented as -1, standing for the unbounded cardinality.
| |||
[47] | senseFlags |
::= | '^' |
Value sets identify ranges of RDF nodes by explicit inclusion or by range (indicated by " | |||
[48] | valueSet |
::= | '[' valueSetValue* ']' |
[49] | valueSetValue |
::= | iriRange | literalRange | languageRange |
If ". " matches and exclusion matches one or more times, all matched items must be consistently iri, literal, or language. valueSetValue returns either a IriStemRange, LiteralStemRange, or LanguageStemRange object with exclusions equal to the set of results of exclusion: | |||
IriStemRange | { | stem:(IRIREF | Wildcard) exclusions:[IRIREF|IriStem +] } | |
LiteralStemRange | { | stem:(STRING | Wildcard) exclusions:[STRING|LiteralStem +] } | |
LanguageStemRange | { | stem:(LANGTAG | Wildcard) exclusions:[LANGTAG|LanguageStem +] } | |
If "~ " matches with no exclusion, valueSetValue returns a Wildcard object: | |||
Wildcard | { | /* empty */ } | |
[50] | exclusion |
::= | '.' '-' (iri | literal | LANGTAG) '~'? |
[51] | iriRange |
::= | iri ('~' iriExclusion*)? |
If iri matches with no "~ ", iriRange returns iri. | |||
If iri and "~ " match with no iriExclusion, iriRange returns an IriStem object: | |||
IriStem | { | stem:IRIREF } | |
If iri and "~ " match and iriExclusion matches one or more times, iriRange returns an IriStemRange object with exclusions equal to the set of results of iriExclusion: | |||
IriStemRange | { | stem:(IRIREF | Wildcard) exclusions:[IRIREF|IriStem +] } | |
[52] | iriExclusion |
::= | '-' iri '~'? |
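The three outcomes of iriRange can be sketched as a small constructor. The function names and their `tilde` and `exclusions` parameters are hypothetical stand-ins for what a parser would already have matched; the "type" keys follow the ShExJ convention visible in the schema examples earlier in this document.

```python
def iri_exclusion(iri, tilde=False):
    """Result of iriExclusion: a plain IRI, or an IriStem when "~" follows it."""
    return {"type": "IriStem", "stem": iri} if tilde else iri

def iri_range(iri, tilde=False, exclusions=()):
    """Result of iriRange for an already-resolved iri.

    tilde      -- whether "~" matched after the iri
    exclusions -- results of any iriExclusion matches, in order
    """
    if not tilde:
        return iri  # no "~": the IRI itself
    if not exclusions:
        return {"type": "IriStem", "stem": iri}  # "~" with no exclusions
    return {"type": "IriStemRange", "stem": iri, "exclusions": list(exclusions)}

stem_range = iri_range(
    "http://schema.example/#",
    tilde=True,
    exclusions=[iri_exclusion("http://schema.example/#internal", tilde=True)],
)
```

The literalRange and languageRange productions follow the same three-way pattern with their own stem and exclusion types.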
[53] | literalRange |
::= | literal ('~' literalExclusion*)? |
If literal matches with no "~ ", literalRange returns literal. | |||
If literal and "~ " match with no literalExclusion, literalRange returns a LiteralStem object: | |||
LiteralStem | { | stem:STRING } | |
If literal and "~ " match and literalExclusion matches one or more times, literalRange returns a LiteralStemRange object with exclusions equal to the set of results of literalExclusion: | |||
LiteralStemRange | { | stem:(STRING | Wildcard) exclusions:[STRING|LiteralStem +] } | |
[54] | literalExclusion |
::= | '-' literal '~'? |
[55] | languageRange |
::= | LANGTAG ('~' languageExclusion*)? |
If LANGTAG matches with no "~ ", languageRange returns a Language object with languageTag equal to LANGTAG: | |||
Language | { | languageTag:LANGTAG } | |
If LANGTAG and "~ " match with no languageExclusion, languageRange returns a LanguageStem object: | |||
LanguageStem | { | stem:LANGTAG } | |
If LANGTAG and "~ " match and languageExclusion matches one or more times, languageRange returns a LanguageStemRange object with exclusions equal to the set of results of languageExclusion: | |||
LanguageStemRange | { | stem:(LANGTAG | Wildcard) exclusions:[LANGTAG|LanguageStem +] } | |
If '@' '~' matched with no languageExclusion, languageRange returns a LanguageStemRange object with an empty stem: | |||
LanguageStemRange | { | stem: "" } | |
If '@' '~' matched and languageExclusion matches one or more times, languageRange returns a LanguageStemRange object with an empty stem and exclusions equal to the set of results of languageExclusion: | |||
LanguageStemRange | { | stem: "" exclusions:[LANGTAG|LanguageStem +] } | |
[56] | languageExclusion |
::= | '-' LANGTAG '~'? |
Triple expressions can include the shapeExpression in a shapeExprDecl. | |||
[57] | include |
::= | '&' tripleExprLabel |
Per the triple expression reference requirement, tripleExprLabel property MUST appear in the schema's shapes map and the corresponding triple expression MUST be a Shape with a tripleExpr. | |||
tripleExprRef | = | tripleExprLabel ; | |
Triple expressions can include annotations in the form of a tuple of a predicate and an iri or literal. | |||
[58] | annotation |
::= | "//" predicate (iri | literal) |
Annotation | { | predicate:IRIREF object:objectValue } | |
Triple expressions can include semantic actions consisting of an iri and an optional code string. | |||
[59] | semanticActions |
::= | codeDecl* |
[60] | codeDecl |
::= | '%' iri (CODE | '%') |
SemAct | { | name:IRIREF code:STRING? } | |
The remaining productions come from the specifications for SPARQL and Turtle. | |||
[13t] | literal |
::= | rdfLiteral | numericLiteral | booleanLiteral |
[61] | predicate |
::= | iri | RDF_TYPE |
[62] | datatype |
::= | iri |
[63] | shapeExprLabel |
::= | iri | blankNode |
[64] | tripleExprLabel |
::= | iri | blankNode |
[16t] | numericLiteral |
::= | INTEGER | DECIMAL | DOUBLE |
[65] | rdfLiteral |
::= | langString | string ("^^" datatype)? |
returns: literal | The literal has a lexical form of the first rule argument, String . If the '^^' iri rule matched, the datatype is iri and the literal has no language tag. If the langString rule matched, the datatype is rdf:langString and the language tag is extracted from langTag . If neither matched, the datatype is xsd:string and the literal has no language tag. | ||
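The datatype and language tag rules for rdfLiteral can be sketched as follows. The function `rdf_literal`, its parameters and the dictionary it returns are hypothetical; they stand in for whatever literal representation a parser uses.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

def rdf_literal(lexical, datatype=None, lang=None):
    """Build a literal from the pieces matched by the rdfLiteral production.

    datatype -- the IRI after "^^", if that branch matched
    lang     -- the language tag, if the langString branch matched
    """
    if datatype is not None and lang is not None:
        raise ValueError("a literal cannot have both '^^' datatype and a language tag")
    if datatype is not None:
        # '^^' iri matched: explicit datatype, no language tag.
        return {"lexical": lexical, "datatype": datatype, "language": None}
    if lang is not None:
        # langString matched: rdf:langString plus the extracted tag.
        return {"lexical": lexical, "datatype": RDF_LANGSTRING, "language": lang}
    # Neither matched: a plain xsd:string.
    return {"lexical": lexical, "datatype": XSD_STRING, "language": None}
```

For example, `rdf_literal("chat", lang="fr")` yields an rdf:langString literal, while `rdf_literal("chat")` yields an xsd:string literal.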
[134s] | booleanLiteral |
::= | "true" | "false" |
returns: literal | The literal has a lexical form of true or false , depending on which matched the input, and a datatype of xsd:boolean . | ||
[135s] | string |
::= | STRING_LITERAL1 | STRING_LITERAL_LONG1 |
[66] | langString |
::= | LANG_STRING_LITERAL1 | LANG_STRING_LITERAL_LONG1 |
[136s] | iri |
::= | IRIREF | prefixedName |
[137s] | prefixedName |
::= | PNAME_LN | PNAME_NS |
[138s] | blankNode |
::= | BLANK_NODE_LABEL |
Terminals
Terminals return:
| |||
[67] | <CODE > |
::= | "{" ([^%\\] | "\\" [%\\] | UCHAR)* "%" "}" |
returns: a string of unicode codepoints | The characters between "{" and "%}" are taken, with the numeric escape sequences unescaped, to form the unicode string of the code. | ||
[68] | <REPEAT_RANGE > |
::= | "{" INTEGER ( "," (INTEGER | "*")? )? "}" |
returns: repeat range | The base-10 numeric values of the INTEGERs are taken, or a non-negative integer and a * token if "* " was matched. | ||
[69] | <RDF_TYPE > |
::= | "a" |
returns: IRI | The iri http://www.w3.org/1999/02/22-rdf-syntax-ns#type is returned. | ||
[18t] | <IRIREF > |
::= | "<" ([^#0000- <>\"{}|^`\\] | UCHAR)* ">" |
returns: IRI | The characters between "<" and ">" are taken, with the numeric escape sequences unescaped, to form the unicode string of the IRI. Relative IRI resolution is performed per Turtle Section 6.3. | ||
[140s] | <PNAME_NS > |
::= | PN_PREFIX? ":" |
returns: PREFIX | When used in a prefixDecl production, the prefix is a potentially empty unicode string matching the first argument of the rule and serves as a key into the prefixes map. | ||
returns: IRI | When used elsewhere, the iri is the value in the prefixes map corresponding to the first argument of the rule. | ||
[141s] | <PNAME_LN > |
::= | PNAME_NS PN_LOCAL |
returns: IRI | A potentially empty prefix is identified by the first token, PNAME_NS . The prefixes map MUST have a corresponding namespace . The unicode string of the IRI is formed by unescaping the reserved characters [[!rfc7159]] in the second argument, PN_LOCAL , and concatenating this onto the namespace . | ||
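The resolution of a PNAME_LN token can be sketched as follows. The function `resolve_pname_ln` and the plain dictionary used as the prefixes map are hypothetical; the set of escapable characters is taken from the PN_LOCAL_ESC production, and percent-encoded sequences are left untouched.

```python
import re

# Backslash followed by one of the characters listed in PN_LOCAL_ESC.
_PN_LOCAL_ESC = re.compile(r"\\([_~.\-!$&'()*+,;=/?#@%])")

def resolve_pname_ln(token, prefixes):
    """Turn a prefixed name such as "ex:shoeSize" into a full IRI string."""
    # The prefix cannot contain ":", so the first colon ends PNAME_NS.
    prefix, local = token.split(":", 1)
    if prefix not in prefixes:
        raise KeyError("no namespace declared for prefix %r" % prefix)
    # Unescape reserved characters by dropping the leading backslash.
    local = _PN_LOCAL_ESC.sub(r"\1", local)
    return prefixes[prefix] + local

PREFIXES = {"ex": "http://schema.example/#", "": "http://a.example/"}
```

Here `resolve_pname_ln("ex:shoeSize", PREFIXES)` concatenates the namespace and the local part, and an empty prefix such as in `:Alice` is looked up under the empty string.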
[70] | <ATPNAME_NS > |
::= | "@" PNAME_NS |
returns: IRI | The iri is the value in the prefixes map corresponding to the second token of the rule. | ||
[71] | <ATPNAME_LN > |
::= | "@" PNAME_LN |
returns: IRI | A potentially empty prefix is identified by the second token, PNAME_NS . The prefixes map MUST have a corresponding namespace . The unicode string of the IRI is formed by unescaping the reserved characters [[!rfc7159]] in the third token, PN_LOCAL , and concatenating this onto the namespace . | ||
[72] | <REGEXP > |
::= | '/' ([^/\\\n\r] |
{ | pattern:STRING flags:STRING? } | ||
returns: JSON object | pattern is a unicode string formed from the characters between the outermost '/'s by unescaping matches of '\\' '/' in the terminal pattern as well as the numeric escape sequences matched by UCHAR.
The remaining escape sequences are included verbatim in pattern, e.g. ^\/\t\\\U0001D4B8$ ^/\t\\\U0001D4B8$ flags is a sequence of the characters [smix] if any were matched. Otherwise no flags attribute is returned. | ||
[142s] | <BLANK_NODE_LABEL > |
::= | "_:" (PN_CHARS_U | [0-9]) ((PN_CHARS | ".")* PN_CHARS)? |
returns: blank node | The characters following the "_: " form a blank node identifier. This corresponds to any blank node in the input dataset that had the same label. | ||
[145s] | <LANGTAG > |
::= | "@" ([a-zA-Z])+ ("-" ([a-zA-Z0-9])+)* |
returns: language tag | The characters following the @ form the unicode string of the language tag. | ||
[19t] | <INTEGER > |
::= | [+-]? [0-9]+ |
returns: literal | The literal has a lexical form of the input string, and a datatype of xsd:integer . | ||
[20t] | <DECIMAL > |
::= | [+-]? [0-9]* "." [0-9]+ |
returns: literal | The literal has a lexical form of the input string, and a datatype of xsd:decimal . | ||
[21t] | <DOUBLE > |
::= | [+-]? ([0-9]+ "." [0-9]* EXPONENT | "."? [0-9]+ EXPONENT) |
returns: literal | The literal has a lexical form of the input string, and a datatype of xsd:double . | ||
[155s] | <EXPONENT > |
::= | [eE] [+-]? [0-9]+ |
[156s] | <STRING_LITERAL1 > |
::= | "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'" |
returns: lexical form | The characters between the outermost "'"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. | ||
[157s] | <STRING_LITERAL2 > |
::= | '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"' |
returns: lexical form | The characters between the outermost '"'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. | ||
[158s] | <STRING_LITERAL_LONG1 > |
::= | "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''" |
returns: lexical form | The characters between the outermost "'''"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. | ||
[159s] | <STRING_LITERAL_LONG2 > |
::= | '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""' |
returns: lexical form | The characters between the outermost '"""'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. | ||
[73] | <LANG_STRING_LITERAL1 > |
::= | "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'" LANGTAG |
returns: lexical form | The characters between the outermost "'"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string. | ||
[74] | <LANG_STRING_LITERAL2 > |
::= | '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"' LANGTAG |
returns: lexical form | The characters between the outermost '"'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string. | ||
[75] | <LANG_STRING_LITERAL_LONG1 > |
::= | "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''" LANGTAG |
returns: lexical form | The characters between the outermost "'''"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string. | ||
[76] | <LANG_STRING_LITERAL_LONG2 > |
::= | '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""' LANGTAG |
returns: lexical form | The characters between the outermost '"""'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string. | ||
[26t] | <UCHAR > |
::= | "\\u" HEX HEX HEX HEX |
[160s] | <ECHAR > |
::= | "\\" [tbnrf\\\"\\'] |
[164s] | <PN_CHARS_BASE > |
::= | [A-Z] | [a-z] |
[165s] | <PN_CHARS_U > |
::= | PN_CHARS_BASE | "_" |
[167s] | <PN_CHARS > |
::= | PN_CHARS_U | "-" | [0-9] |
[168s] | <PN_PREFIX > |
::= | PN_CHARS_BASE ( (PN_CHARS | ".")* PN_CHARS )? |
[77] | <PN_LOCAL > |
::= | (PN_CHARS_U | ":" | [0-9] | PLX) ((PN_CHARS | "." | ":" | PLX)* (PN_CHARS | ":" | PLX))? |
[170s] | <PLX > |
::= | PERCENT | PN_LOCAL_ESC |
[171s] | <PERCENT > |
::= | "%" HEX HEX |
[172s] | <HEX > |
::= | [0-9] | [A-F] | [a-f] |
[173s] | <PN_LOCAL_ESC > |
::= | "\\" ( "_" | "~" | "." | "-" | "!" | "$" | "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | "/" | "?" | "#" | "@" | "%" ) |
[98] | PASSED TOKENS |
::= | [ \t\r\n]+ |
This section aggregates the JSON grammar rules defined above and includes terminals referenced above.
A ShExJ document is a JSON-LD [[!JSON-LD]] document which uses a prescribed structure to define a schema containing shape expressions and triple expressions.
A ShExJ document MAY include an @context property referencing http://www.w3.org/ns/shex.jsonld.
In the absence of a top-level @context, ShEx Processors MUST act as if an @context property is present with the value http://www.w3.org/ns/shex.jsonld.
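A processor could apply the default context while loading a schema. The following is a minimal sketch, assuming the schema arrives as JSON text; the function name `load_shexj` is hypothetical.

```python
import json

SHEX_CONTEXT = "http://www.w3.org/ns/shex.jsonld"

def load_shexj(text):
    """Parse ShExJ text and behave as if @context were present when it is not."""
    schema = json.loads(text)
    # setdefault leaves an explicitly supplied @context untouched.
    schema.setdefault("@context", SHEX_CONTEXT)
    return schema

schema = load_shexj('{ "type": "Schema", "shapes": [] }')
```

After loading, `schema["@context"]` holds the default value even though the input omitted it.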
A ShExJ document can also be thought of as the serialization of an RDF Graph using the Shape Expression Vocabulary [[shex-vocab]] which conforms to the shape defined in . Processors MAY interpret a ShExJ document as an RDF Graph. Processors may also transform arbitrary RDF Graphs conforming to into ShExJ using a mechanism not described within this specification.
In ShExJ, the unbounded cardinality constraint is -1, rather than "*".
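The correspondence between ShExC cardinality tokens and ShExJ min and max values can be sketched as follows. The function `cardinality` is hypothetical. Two assumptions are made beyond the text: a range with an omitted upper bound such as {2,} is treated as unbounded, and whitespace inside the braces is tolerated.

```python
import re

UNBOUNDED = -1  # ShExJ value standing for "*"

_RANGE = re.compile(r"\{\s*([0-9]+)\s*(,\s*([0-9]+|\*)?\s*)?\}")

def cardinality(token):
    """Map a ShExC cardinality token to a (min, max) pair for ShExJ."""
    simple = {"*": (0, UNBOUNDED), "+": (1, UNBOUNDED), "?": (0, 1)}
    if token in simple:
        return simple[token]
    m = _RANGE.fullmatch(token)
    if m is None:
        raise ValueError("not a cardinality: %r" % token)
    lower = int(m.group(1))
    if m.group(2) is None:
        return (lower, lower)      # {m}: exactly m
    upper = m.group(3)
    if upper is None or upper == "*":
        return (lower, UNBOUNDED)  # {m,} or {m,*}: assumed unbounded
    return (lower, int(upper))     # {m,n}
```

For instance, `cardinality("+")` gives the `"min": 1, "max": -1` pair used in the ShExJ examples earlier in this document, and `cardinality("{0}")` gives the zero maximum discussed above.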
This is the complete grammar for ShExJ.
Schema | { | "@context":"http://www.w3.org/ns/shex.jsonld"? imports:[IRIREF+]? startActs:[SemAct+]? start:shapeExpr? shapes:[shapeExpr+]? } |
---|---|---|
shapeExpr | = | ShapeOr | ShapeAnd | ShapeNot | NodeConstraint | Shape | ShapeExternal | shapeExprRef ; |
ShapeOr | { | id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] } |
ShapeAnd | { | id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] } |
ShapeNot | { | id:shapeExprLabel? shapeExpr:shapeExpr } |
ShapeExternal | { | id:shapeExprLabel? } |
shapeExprRef | = | shapeExprLabel ; |
shapeExprLabel | = | IRIREF | BNODE ; |
NodeConstraint | { | id:shapeExprLabel? nodeKind:("iri" | "bnode" | "nonliteral" | "literal")? datatype:IRIREF? xsFacet* values:[valueSetValue+]? } |
xsFacet | = | stringFacet | numericFacet ; |
stringFacet | = | (length|minlength|maxlength):INTEGER | pattern:STRING flags:STRING? ; |
numericFacet | = | (mininclusive|minexclusive|maxinclusive|maxexclusive):numericLiteral |
| | (totaldigits|fractiondigits):INTEGER ; | |
numericLiteral | = | INTEGER | DECIMAL | DOUBLE ; |
valueSetValue | = | objectValue | IriStem | IriStemRange | LiteralStem | LiteralStemRange | Language | LanguageStem | LanguageStemRange ; |
objectValue | = | IRIREF | ObjectLiteral ; |
ObjectLiteral | { | value:STRING language:STRING? type:STRING? } |
IriStem | { | stem:IRIREF } |
IriStemRange | { | stem:(IRIREF | Wildcard) exclusions:[IRIREF|IriStem+]? } |
LiteralStem | { | stem:STRING } |
LiteralStemRange | { | stem:(STRING | Wildcard) exclusions:[STRING|LiteralStem+]? } |
Language | { | languageTag:LANGTAG } |
LanguageStem | { | stem:(LANGTAG | EMPTY) } |
LanguageStemRange | { | stem:(LANGTAG | EMPTY) exclusions:[LANGTAG|LanguageStem+]? } |
Wildcard | { | /* empty */ } |
Shape | { | id:shapeExprLabel? closed:BOOL? extra:[IRIREF+]? expression:tripleExpr? semActs:[SemAct+]? annotations:[Annotation+]? } |
tripleExpr | = | EachOf | OneOf | TripleConstraint | tripleExprRef ; |
EachOf | { | id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } |
OneOf | { | id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } |
TripleConstraint | { | id:tripleExprLabel? inverse:BOOL? predicate:IRIREF valueExpr:shapeExpr? min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? } |
tripleExprRef | = | tripleExprLabel ; |
tripleExprLabel | = | IRIREF | BNODE ; |
SemAct | { | name:IRIREF code:STRING? } |
Annotation | { | predicate:IRIREF object:objectValue } |
# Terminals | These follow the rules for terminals in the XML 1.0 5th Edition | |
# | Turtle IRIREF without enclosing "<>"s | |
IRIREF | : | (PN_CHARS | '.' | ':' | '/' | '\\' | '#' | '@' | '%' | '&' | UCHAR)* ; |
# | Turtle BLANK_NODE_LABEL | |
BNODE | : | '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)? ; |
# | JSON boolean values | |
BOOL | : | "true" | "false" ; |
# | Turtle INTEGER | |
INTEGER | : | [+-]? [0-9] + ; |
# | Turtle DECIMAL | |
DECIMAL | : | [+-]? [0-9]* '.' [0-9] + ; |
# | Turtle DOUBLE | |
DOUBLE | : | [+-]? ([0-9] + '.' [0-9]* EXPONENT | '.' [0-9]+ EXPONENT | [0-9]+ EXPONENT) ; |
# | BCP47 Language-Tag | |
LANGTAG | : | [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ; |
# | any JSON string | |
STRING | : | .* ; |
# | empty string | |
EMPTY | : | ^$ ; |
# Components | These terminals are referenced by other terminals but not by external productions. | |
PN_CHARS_BASE | : | [A-Z] | [a-z] | [\u00C0-\u00D6] | [\u00D8-\u00F6] | [\u00F8-\u02FF] | [\u0370-\u037D] | [\u037F-\u1FFF] | [\u200C-\u200D] | [\u2070-\u218F] | [\u2C00-\u2FEF] | [\u3001-\uD7FF] | [\uF900-\uFDCF] | [\uFDF0-\uFFFD] | [\u10000-\uEFFFF] ; |
PN_CHARS | : | PN_CHARS_U | '-' | [0-9] | '\u00B7' | [\u0300-\u036F] | [\u203F-\u2040] ; |
PN_CHARS_U | : | PN_CHARS_BASE | '_' ; |
UCHAR | : | '\\u' HEX HEX HEX HEX | '\\U' HEX HEX HEX HEX HEX HEX HEX HEX ; |
HEX | : | [0-9] | [A-F] | [a-f] ; |
EXPONENT | : | [eE] [+-]? [0-9]+ ; |
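The numeric and language-tag terminals above translate almost directly into regular expressions. The following is a non-normative sketch in Python, not part of the grammar. The function name `classify_numeric` and the choice to test INTEGER before DECIMAL before DOUBLE are assumptions made for illustration.

```python
import re
from typing import Optional

# Transcriptions of the EXPONENT, INTEGER, DECIMAL, DOUBLE and LANGTAG terminals.
EXPONENT = r"[eE][+-]?[0-9]+"
INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?[0-9]*\.[0-9]+")
DOUBLE = re.compile(
    r"[+-]?(?:[0-9]+\.[0-9]*" + EXPONENT
    + r"|\.[0-9]+" + EXPONENT
    + r"|[0-9]+" + EXPONENT + r")"
)
LANGTAG = re.compile(r"[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")


def classify_numeric(lexical: str) -> Optional[str]:
    """Return the numeric terminal matching the whole string, or None.

    Hypothetical helper: the three terminals are mutually exclusive,
    so the order in which they are tried does not affect the result.
    """
    candidates = (("INTEGER", INTEGER), ("DECIMAL", DECIMAL), ("DOUBLE", DOUBLE))
    for name, pattern in candidates:
        if pattern.fullmatch(lexical):
            return name
    return None
```

Under this reading, a string such as `5.` matches none of the numeric terminals, because DECIMAL requires at least one digit after the point and DOUBLE requires an exponent.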
A ShExR graph is any RDF Graph which conforms to this Shape Expressions schema and meets the Schema Requirements. Every ShExR document is graph isomorphic [[!rdf11-concepts]] to the RDF interpretation of some ShExJ document.
This section has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.
Revealing the structure of an RDF graph can reveal information about the content of conformant data. For instance, a schema with a predicate to describe cancer stage indicates that conforming graphs describe patients with cancer.
Testing a graph's conformance to a schema may involve many detailed queries, which can consume significant resources on the services that answer the corresponding API calls or SPARQL queries.
ShEx has an extension mechanism which can, in principle, evaluate arbitrary code, possibly as some trusted agent. Such extensions should not be executed unless they come from a trusted source.
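One way an implementation might act on this advice is to gate semantic actions behind an explicit allow-list. The sketch below is non-normative and assumes SemAct objects have already been parsed into Python dictionaries with the `name` and `code` properties from the grammar. The function `runnable_sem_acts`, the `schema_is_trusted` flag and the extension IRI are all hypothetical.

```python
from typing import Any, Dict, FrozenSet, List

# Hypothetical allow-list; this IRI is a placeholder, not a registered extension.
TRUSTED_EXTENSIONS: FrozenSet[str] = frozenset({"http://example.org/ext/Log"})


def runnable_sem_acts(
    sem_acts: List[Dict[str, Any]],
    schema_is_trusted: bool,
    trusted: FrozenSet[str] = TRUSTED_EXTENSIONS,
) -> List[Dict[str, Any]]:
    """Return only the semantic actions that may be dispatched.

    Nothing is returned for a schema from an untrusted source, and
    actions naming an unknown extension are dropped even when the
    schema itself is trusted.
    """
    if not schema_is_trusted:
        return []
    return [act for act in sem_acts if act.get("name") in trusted]
```

Filtering before dispatch keeps the decision about trust in one place, so the code that actually runs an extension never sees an action that was not approved.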
Since ShEx is intended to be a pure data exchange format for validating RDF graphs, the ShExJ serialization SHOULD NOT be passed through a code execution mechanism such as JavaScript's eval() function to be parsed.
An (invalid) document may contain code that, when executed, could lead to unexpected side effects compromising the security of a system.
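The same concern applies in any language that offers a code-evaluating shortcut for reading JSON. As a non-normative sketch, the Python function below reads ShExJ with the standard `json` module, which only builds data structures and never executes its input. The function name `load_shexj` is hypothetical, and the check that the top-level value is an object reflects the assumption that a ShExJ schema is serialized as a single JSON object.

```python
import json
from typing import Any, Dict


def load_shexj(text: str) -> Dict[str, Any]:
    """Parse ShExJ text as data only, rejecting anything that is not a JSON object."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        # Input that is really code, such as a function call, fails here
        # instead of being executed.
        raise ValueError(f"not well-formed JSON: {err}") from err
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object at the top level")
    return doc
```

With this approach, input containing executable code is rejected as malformed rather than run.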