SPSH Documentation

Abstract

This document presents a linked data ontology that formalizes SPARQL 1.1 functions as instances of sh:SPARQLFunction, based on the SHACL Advanced Features. Each SPARQL function, mathematical operation, and symbol is uniquely instantiated and described using labels, definitions, and references from the SPARQL 1.1 Query Language.

The ontology is intended for users who aim to standardize and structure SPARQL queries within sh:TripleRule instances. By providing predefined, standardized function instances, the need for hard-coded SPARQL queries is minimized, enhancing interoperability, reusability, machine readability, and the universality of self-descriptive SPARQL rules in SHACL.

Status of the document

This document is not a normative specification. It was created as part of exploratory work and is published for documentation and general reference purposes.

The content may evolve over time and may be updated, modified, or reorganised without notice. The examples and explanations provided here illustrate possible approaches and should not be interpreted as defining a standard or mandatory implementation.

This document is publicly available through GitHub for convenience and transparency.

Copyright © 2026 the document editors/authors. This document is licensed under the Creative Commons Attribution–NonCommercial 4.0 International License (CC BY-NC 4.0). You may copy, distribute, and adapt this work for non-commercial purposes provided that appropriate attribution is given. Commercial use requires prior permission from the authors.

Introduction

This document defines a linked data ontology that formalizes SPARQL 1.1 functions as instances of sh:SPARQLFunction within the SHACL Advanced Features framework.

The ontology ensures interoperability and reusability by providing a consistent and machine-readable way to represent SPARQL functions as part of SHACL rule-based triple generation.

This document is intended for users working with SHACL, SPARQL, or linked data systems who seek to standardize SPARQL query usage within SHACL rules.

Scope

This ontology formalizes existing SPARQL 1.1 functions, operators, and symbols and does not introduce new SPARQL constructs. The scope of the SPSH ontology is limited to the SPARQL functions, operators, and symbols listed in Section 4.5 – SPARQL Functions. Not all of these are illustrated through use cases. The examples in Section 5 – Examples focus on a representative subset.

Conventions

The table below indicates the full list of namespaces and prefixes used in this document.

Prefix	Namespace
ex	http://example.com/
owl	http://www.w3.org/2002/07/owl#
rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs	http://www.w3.org/2000/01/rdf-schema#
sh	http://www.w3.org/ns/shacl#
spsh	https://w3id.org/spsh/
xsd	http://www.w3.org/2001/XMLSchema#

For readability, the following color-coded boxes are used in the examples throughout this document:

# This block contains definitions from the SPSH ontology, typically written in Turtle format.

# This box represents the input shapes graph, typically written in Turtle format.
# The shapes graph may contain, among others, instances of sh:Shape, sh:Rule, and the instances of sh:SPARQLFunction that are defined in the SPSH ontology.

# This box represents the input data graph, typically written in Turtle format.
# Elements highlighted in green are focus nodes.
# Elements highlighted in red are focus nodes that fail inference for any reason.

# This box represents the output data graph (inferred data), typically written in Turtle format.

# This block represents a potential SPARQL representation of the shapes graph, typically written in the SPARQL language.

Conformance

This document is entirely non-normative. It does not define a specification, standard, or mandatory implementation.

The concepts, examples, and modelling patterns described here are provided for explanatory and illustrative purposes only.

SPSH ontology

This ontology is instance-centric and intentionally does not define a domain class model or class hierarchy. Each resource described in this ontology is a concrete representation of a SPARQL 1.1 function, operator or symbol and it is modeled as an instance of sh:SPARQLFunction.

classDiagram
    direction LR
    class NodeExpression {
        -SPARQLFunction : args
    }
    class TripleRule {
        -subject : NodeExpression
        -predicate : NodeExpression
        -object : NodeExpression
    }
    class NodeShape {
        -rule : TripleRule
    }
    TripleRule "0" --> "1" NodeExpression : subject
    TripleRule "0" --> "1" NodeExpression : predicate
    TripleRule "0" --> "1" NodeExpression : object
    NodeShape "0" --> "1" TripleRule : rule

The diagram above shows how Node Shapes, Triple Rules, and Node Expressions are used to generate new data. It omits other elements for clarity and focuses specifically on the part relevant to the SPSH ontology. The values of sh:subject, sh:predicate, and sh:object are Node Expressions. As the diagram highlights, node expressions may be instances of sh:SPARQLFunction, such as the ones found in the SPSH ontology.

Modeling approach

Each instance in the SPSH ontology uniquely represents a SPARQL function, operator, or symbol found in the SPARQL 1.1 Query Language. A consistent set of metadata properties describes:

basic identity (type, label, comment, seeAlso),
parameter metadata (order, name, path, datatype, optional),
execution-related information (prefixes, return type, and a SELECT query).

All instance names are written in capital letters, consistent with the SPARQL documentation.

SPARQL Function structure

Each instance of sh:SPARQLFunction follows this structure:

Property	Cardinality	Description
`rdf:type`	1..1	`sh:SPARQLFunction`
`rdfs:label`	1..1	Human-readable name, e.g. "SPARQL function STRSTARTS".
`rdfs:comment`	1..1	Description taken from the W3C SPARQL 1.1 specification.
`rdfs:seeAlso`	1..1	URL pointing to the W3C SPARQL 1.1 specification.
`sh:parameter`	0..*	List of parameter descriptions (detailed below). The quantity of the parameters depends on each SPARQL function.
`sh:returnType`	0..1	Expected type of the result (e.g. `xsd:boolean`), when applicable.
`sh:select`	1..1	A SPARQL query that represents the SPARQL function's behavior.

SPARQL Function Parameter structure

Each parameter of a SPARQL Function is represented as a SHACL parameter node and uses the following properties:

Property	Cardinality	Description
`sh:path`	1..1	The argument name in the function's query, using the convention `spsh:arg_N`, which corresponds to the query variable `?arg_N` (e.g. `spsh:arg_1` for `?arg_1`).
`sh:datatype`	0..1	The expected RDF datatype for the parameter (e.g. `xsd:string`).
`sh:name`	0..1	The parameter name from the SPARQL spec, when provided.
`sh:optional`	0..1	`true` if the parameter is optional according to SPARQL semantics.
`sh:order`	1..1	An integer indicating the parameter position (1-based).
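
The parameter properties above suggest how positional arguments could be matched to query variables. The following Python sketch is an illustration only: it assumes that parameters are sorted by their sh:order value, that the variable name is the local name of the sh:path IRI (as in SHACL, where parameter names derive from the path's local name), and that a missing value is acceptable only when sh:optional is true. The function name bind_arguments and the dictionary layout are hypothetical and not part of the SPSH ontology.

```python
from typing import Dict, List, Optional, Sequence


def bind_arguments(parameters: List[dict], values: Sequence[Optional[str]]) -> Dict[str, str]:
    """Map positional values to ?arg_N variables (hypothetical helper)."""
    bindings: Dict[str, str] = {}
    # sh:order decides the position of each parameter (1-based in SPSH).
    ordered = sorted(parameters, key=lambda parameter: parameter["order"])
    for index, parameter in enumerate(ordered):
        value = values[index] if index < len(values) else None
        # "spsh:arg_1" -> "?arg_1": the local name becomes the variable name.
        variable = "?" + parameter["path"].split(":", 1)[1]
        if value is None:
            if not parameter.get("optional", False):
                raise ValueError(f"missing required argument for {variable}")
            continue  # optional parameters may stay unbound
        bindings[variable] = value
    return bindings


if __name__ == "__main__":
    params = [
        {"path": "spsh:arg_2", "order": 2},
        {"path": "spsh:arg_1", "order": 1},
    ]
    print(bind_arguments(params, ['"abcdef"', '"abc"']))
```

The parameters are deliberately listed out of order in the usage example to show that only sh:order, not the listing order, determines the position.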

Example

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix spsh: <https://w3id.org/spsh/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
spsh:STRSTARTS
    rdf:type sh:SPARQLFunction ;
    rdfs:comment "The STRSTARTS function corresponds to the XPath fn:starts-with function. The arguments must be argument compatible otherwise an error is raised. For such input pairs, the function returns true if the lexical form of arg1 starts with the lexical form of arg2, otherwise it returns false." ;
    rdfs:label "SPARQL function STRSTARTS" ;
    rdfs:seeAlso "https://www.w3.org/TR/sparql11-query/#func-strstarts"^^xsd:anyURI ;
    sh:parameter [
        sh:path spsh:arg_1 ;
        sh:name "arg1" ;
        sh:optional "true"^^xsd:boolean ;
        sh:order 1 ;
    ] ;
    sh:parameter [
        sh:path spsh:arg_2 ;
        sh:name "arg2" ;
        sh:optional "true"^^xsd:boolean ;
        sh:order 2 ;
    ] ;
    sh:returnType xsd:boolean ;
    sh:select """
        SELECT DISTINCT ?result
        WHERE {
            BIND ( STRSTARTS ( ?arg_1 , ?arg_2 ) AS ?result )
        }
        """ ;
.
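
The sh:select value in the STRSTARTS example has a regular shape: the function is applied to the positional variables ?arg_1 to ?arg_N and the outcome is bound to ?result. As a minimal sketch, and assuming that other SPSH function instances follow the same template (an inference from this single example, not something stated by the ontology), the query text could be generated as follows. The helper name build_select is hypothetical.

```python
def build_select(function_name: str, arity: int) -> str:
    """Render a SELECT query in the shape used by the STRSTARTS example.

    Assumption: every function instance uses the positional variables
    ?arg_1 .. ?arg_N and binds the outcome to ?result.
    """
    if arity < 0:
        raise ValueError("arity must be non-negative")
    # Build "?arg_1 , ?arg_2 , ..." following the spsh:arg_N convention.
    args = " , ".join(f"?arg_{position}" for position in range(1, arity + 1))
    call = f"{function_name} ( {args} )" if args else f"{function_name} ( )"
    return f"SELECT DISTINCT ?result WHERE {{ BIND ( {call} AS ?result ) }}"


if __name__ == "__main__":
    print(build_select("STRSTARTS", 2))
    print(build_select("NOW", 0))
```

Operators and symbols such as A + B would need an infix template instead, which this sketch does not cover.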

SPARQL Functions

The following SPARQL 1.1 functions, operators, and symbols are represented as instances of sh:SPARQLFunction in this ontology. Their behaviour and semantics follow the SPARQL 1.1 specification and are not redefined here. The SPARQL functions are split into three different tables: functions, operators, and symbols. This split is only for readability purposes. All instances of the SPSH ontology use the same overall instance pattern and properties.

SPARQL Functions

	SPSH	SPARQL Function
	`spsh:ABS` ( :arg_1 )	`ABS` ( numeric term )
	`spsh:BNODE` ( ) `spsh:BNODE` ( :arg_1 ) `spsh:BNODE` ( :arg_1 )	`BNODE` ( ) `BNODE` ( simple literal ) `BNODE` ( xsd:string )
	`spsh:BOUND` ( :arg_1 )	`BOUND` ( variable var )
	`spsh:CEIL` ( :arg_1 )	`CEIL` ( numeric term )
*	`spsh:COALESCE` ( :arg_1 :arg_2 )	`COALESCE` ( expression, ... )
*	`spsh:CONCAT` ( :arg_1 :arg_2 )	`CONCAT` ( string literal ltrl1 ... string literal ltrln )
	`spsh:CONTAINS` ( :arg_1 :arg_2 )	`CONTAINS` ( string literal arg1, string literal arg2 )
	`spsh:DATATYPE` ( :arg_1 )	`DATATYPE` ( literal literal )
	`spsh:DAY` ( :arg_1 )	`DAY` ( xsd:dateTime arg )
	`spsh:ENCODE_FOR_URI` ( :arg_1 )	`ENCODE_FOR_URI` ( string literal ltrl )
***	`spsh:EXISTS` ( :arg_1 :arg_2 :arg_3 )	`EXISTS` { pattern }
	`spsh:FLOOR` ( :arg_1 )	`FLOOR` ( numeric term )
	`spsh:HOURS` ( :arg_1 )	`HOURS` ( xsd:dateTime arg )
	`spsh:IF` ( :arg_1 :arg_2 :arg_3 )	`IF` ( expression1, expression2, expression3 )
**	`spsh:IN` ( :arg_1 :arg_2 )	`IN` ( expression, ... )
	`spsh:IRI` ( :arg_1 ) `spsh:IRI` ( :arg_1 ) `spsh:IRI` ( :arg_1 ) `spsh:URI` ( :arg_1 ) `spsh:URI` ( :arg_1 ) `spsh:URI` ( :arg_1 )	`IRI` ( simple literal ) `IRI` ( xsd:string ) `IRI` ( iri ) `URI` ( simple literal ) `URI` ( xsd:string ) `URI` ( iri )
	`spsh:ISBLANK` ( :arg_1 )	`ISBLANK` ( RDF term term )
	`spsh:ISIRI` ( :arg_1 ) `spsh:ISURI` ( :arg_1 )	`ISIRI` ( RDF term term ) `ISURI` ( RDF term term )
	`spsh:ISLITERAL` ( :arg_1 )	`ISLITERAL` ( RDF term term )
	`spsh:ISNUMERIC` ( :arg_1 )	`ISNUMERIC` ( RDF term term )
	`spsh:LANG` ( :arg_1 )	`LANG` ( literal ltrl )
	`spsh:LANGMATCHES` ( :arg_1 :arg_2 )	`LANGMATCHES` ( simple literal language-tag, simple literal language-range )
	`spsh:LCASE` ( :arg_1 )	`LCASE` ( string literal str )
	`spsh:MD5` ( :arg_1 ) `spsh:MD5` ( :arg_1 )	`MD5` ( simple literal arg ) `MD5` ( xsd:string arg )
	`spsh:MINUTES` ( :arg_1 )	`MINUTES` ( xsd:dateTime arg )
	`spsh:MONTH` ( :arg_1 )	`MONTH` ( xsd:dateTime arg )
***	`spsh:NOT_EXISTS` ( :arg_1 :arg_2 :arg_3 )	`NOT EXISTS` { pattern }
**	`spsh:NOT_IN` ( :arg_1 :arg_2 )	`NOT IN` ( expression, ... )
	`spsh:NOW` ( )	`NOW` ( )
	`spsh:RAND` ( )	`RAND` ( )
	`spsh:REGEX` ( :arg_1 :arg_2 ) `spsh:REGEX` ( :arg_1 :arg_2 :arg_3 )	`REGEX` ( string literal text, simple literal pattern ) `REGEX` ( string literal text, simple literal pattern, simple literal flags )
	`spsh:REPLACE` ( :arg_1 :arg_2 :arg_3) `spsh:REPLACE` ( :arg_1 :arg_2 :arg_3 :arg_4 )	`REPLACE` ( string literal arg, simple literal pattern, simple literal replacement ) `REPLACE` ( string literal arg, simple literal pattern, simple literal replacement, simple literal flags )
	`spsh:ROUND` ( :arg_1 )	`ROUND` ( numeric term )
	`spsh:SAMETERM` ( :arg_1 :arg_2 )	`SAMETERM` ( RDF term term1, RDF term term2 )
	`spsh:SECONDS` ( :arg_1 )	`SECONDS` ( xsd:dateTime arg )
	`spsh:SHA1` ( :arg_1 ) `spsh:SHA1` ( :arg_1 )	`SHA1` ( simple literal arg ) `SHA1` ( xsd:string arg )
	`spsh:SHA256` ( :arg_1 ) `spsh:SHA256` ( :arg_1 )	`SHA256` ( simple literal arg ) `SHA256` ( xsd:string arg )
	`spsh:SHA384` ( :arg_1 ) `spsh:SHA384` ( :arg_1 )	`SHA384` ( simple literal arg ) `SHA384` ( xsd:string arg )
	`spsh:SHA512` ( :arg_1 ) `spsh:SHA512` ( :arg_1 )	`SHA512` ( simple literal arg ) `SHA512` ( xsd:string arg )
	`spsh:STR` ( :arg_1 ) `spsh:STR` ( :arg_1 )	`STR` ( literal ltrl ) `STR` ( IRI rsrc )
	`spsh:STRAFTER` ( :arg_1 :arg_2 )	`STRAFTER` ( string literal arg1, string literal arg2 )
	`spsh:STRBEFORE` ( :arg_1 :arg_2 )	`STRBEFORE` ( string literal arg1, string literal arg2 )
	`spsh:STRDT` ( :arg_1 :arg_2 )	`STRDT` ( simple literal lexicalForm, IRI datatypeIRI )
	`spsh:STRENDS` ( :arg_1 :arg_2 )	`STRENDS` ( string literal arg1, string literal arg2 )
	`spsh:STRLANG` ( :arg_1 :arg_2 )	`STRLANG` ( simple literal lexicalForm, simple literal langTag )
	`spsh:STRLEN` ( :arg_1 )	`STRLEN` ( string literal str )
	`spsh:STRSTARTS` ( :arg_1 :arg_2 )	`STRSTARTS` ( string literal arg1, string literal arg2 )
	`spsh:STRUUID` ( )	`STRUUID` ( )
	`spsh:SUBSTR` ( :arg_1 :arg_2 ) `spsh:SUBSTR` ( :arg_1 :arg_2 :arg_3 )	`SUBSTR` ( string literal source, xsd:integer startingLoc ) `SUBSTR` ( string literal source, xsd:integer startingLoc, xsd:integer length )
	`spsh:TIMEZONE` ( :arg_1 )	`TIMEZONE` ( xsd:dateTime arg )
	`spsh:TZ` ( :arg_1 )	`TZ` ( xsd:dateTime arg )
	`spsh:UCASE` ( :arg_1 )	`UCASE` ( string literal str )
****	`spsh:UNDEF` ( )	`UNDEF` ( )
	`spsh:UUID` ( )	`UUID` ( )
	`spsh:YEAR` ( :arg_1 )	`YEAR` ( xsd:dateTime arg )

* The SPARQL functions CONCAT and COALESCE accept a variable number of arguments, which cannot be modeled with the fixed parameter list of a sh:SPARQLFunction. Consequently, exactly two values are expected. For three or more values, nested function calls must be used, e.g. [ spsh:COALESCE ( :arg_1 [ spsh:COALESCE ( :arg_2 :arg_3 ) ] ) ].
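
The nesting described in this note can be produced mechanically. The following Python sketch right-nests a two-parameter SPSH function over any number of arguments and returns the node expression as text. It is an illustration under stated assumptions: the helper name nest_binary is hypothetical, and the arguments are treated as opaque strings rather than parsed RDF terms.

```python
from typing import Sequence


def nest_binary(function: str, args: Sequence[str]) -> str:
    """Right-nest a two-parameter SPSH function over two or more arguments."""
    if len(args) < 2:
        raise ValueError("at least two arguments are required")
    if len(args) == 2:
        return f"[ {function} ( {args[0]} {args[1]} ) ]"
    # Keep the first argument and fold the remaining ones into a nested call.
    inner = nest_binary(function, args[1:])
    return f"[ {function} ( {args[0]} {inner} ) ]"


if __name__ == "__main__":
    print(nest_binary("spsh:COALESCE", [":arg_1", ":arg_2", ":arg_3"]))
```

Right-nesting preserves the left-to-right priority of COALESCE and the left-to-right order of CONCAT, because the first argument is always evaluated before the nested remainder.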

** Like CONCAT and COALESCE, the SPARQL functions IN and NOT IN accept a variable number of arguments and therefore cannot be modeled with the fixed parameter list of a sh:SPARQLFunction. Consequently, two values are expected, where the first is the value to be tested and the second is a single list member. To test against more than one list member with spsh:IN, combine the calls with spsh:OR: [ spsh:OR ( [ spsh:IN ( :arg_1 :arg_2 ) ] [ spsh:IN ( :arg_1 :arg_3 ) ] ) ]. To exclude more than one list member with spsh:NOT_IN, combine the calls with spsh:AND: [ spsh:AND ( [ spsh:NOT_IN ( :arg_1 :arg_2 ) ] [ spsh:NOT_IN ( :arg_1 :arg_3 ) ] ) ].
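
The expansion of a membership test into pairwise checks can be sketched in Python as follows. The helper name expand_membership is hypothetical, the arguments are handled as plain strings, and the left-nesting of spsh:OR and spsh:AND for three or more members is an assumption that mirrors the nesting pattern used for the logical symbols in this document.

```python
from typing import Sequence


def expand_membership(value: str, members: Sequence[str], negate: bool = False) -> str:
    """Expand IN / NOT IN over several members into pairwise SPSH calls."""
    if not members:
        raise ValueError("at least one list member is required")
    test = "spsh:NOT_IN" if negate else "spsh:IN"
    # IN is a disjunction of equalities; NOT IN is a conjunction of inequalities.
    combiner = "spsh:AND" if negate else "spsh:OR"
    expression = f"[ {test} ( {value} {members[0]} ) ]"
    for member in members[1:]:
        pair = f"[ {test} ( {value} {member} ) ]"
        expression = f"[ {combiner} ( {expression} {pair} ) ]"
    return expression


if __name__ == "__main__":
    print(expand_membership(":arg_1", [":arg_2", ":arg_3"]))
    print(expand_membership(":arg_1", [":arg_2", ":arg_3"], negate=True))
```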

*** The SPARQL functions EXISTS and NOT EXISTS expect a pattern. Only simple patterns can be expressed using three parameters corresponding to a triple's subject, predicate, and object.

**** UNDEF is used in SPARQL in VALUES to represent an undefined value. Here it can be used as a placeholder for the “undefined” value in SPARQL functions.

SPARQL Operators

	SPSH	SPARQL Operator
*	`spsh:ADD` ( :arg_1 :arg_2 )	`A + B`
*	`spsh:DIVIDE` ( :arg_1 :arg_2 )	`A / B`
	`spsh:EQUALS` ( :arg_1 :arg_2 )	`A = B`
	`spsh:GREATER_THAN` ( :arg_1 :arg_2 )	`A > B`
	`spsh:GREATER_THAN_OR_EQUALS` ( :arg_1 :arg_2 )	`A >= B`
	`spsh:LESS_THAN` ( :arg_1 :arg_2 )	`A < B`
	`spsh:LESS_THAN_OR_EQUALS` ( :arg_1 :arg_2 )	`A <= B`
*	`spsh:MULTIPLY` ( :arg_1 :arg_2 )	`A * B`
	`spsh:NOT_EQUALS` ( :arg_1 :arg_2 )	`A != B`
*	`spsh:SUBTRACT` ( :arg_1 :arg_2 )	`A - B`

* SPARQL allows arithmetic operators to be chained. Because the number of operands is not fixed, a chain must be expressed as a series of nested two-parameter functions. For example: [ spsh:ADD ( [ spsh:ADD ( :arg_1 :arg_2 ) ] :arg_3 ) ].
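
The nesting direction matters for operators that are not associative, such as subtraction and division. The following Python sketch builds the left-nested form used in the example above and checks, with plain integers, that a left fold matches left-to-right evaluation. The helper name chain_left is hypothetical, and the operands are treated as opaque strings.

```python
from functools import reduce
import operator
from typing import Sequence


def chain_left(function: str, operands: Sequence[str]) -> str:
    """Left-nest a two-parameter SPSH operator over a chain of operands."""
    if len(operands) < 2:
        raise ValueError("at least two operands are required")
    # Each step wraps the expression built so far as the first argument.
    return reduce(
        lambda nested, operand: f"[ {function} ( {nested} {operand} ) ]",
        operands[1:],
        operands[0],
    )


if __name__ == "__main__":
    print(chain_left("spsh:ADD", [":arg_1", ":arg_2", ":arg_3"]))
    # A left fold reproduces left-to-right evaluation: (10 - 3) - 2.
    print(reduce(operator.sub, [10, 3, 2]) == 10 - 3 - 2)
```

Nesting on the right instead would compute 10 - (3 - 2), which gives a different result for subtraction.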

SPARQL Symbols

	SPSH	SPARQL Symbol
*	`spsh:AND` ( :arg_1 :arg_2 )	`left && right`
	`spsh:EXCLAMATION_MARK` ( :arg_1 )	`!`
*	`spsh:OR` ( :arg_1 :arg_2 )	`left || right`

* SPARQL allows the logical operators ( || and && ) to be chained. Because the number of operands is not fixed, a chain must be expressed as a series of nested two-parameter functions. For example: [ spsh:OR ( [ spsh:OR ( :arg_1 :arg_2 ) ] :arg_3 ) ].

Limitations

The following limitations apply:

OPTIONAL (see SPARQL 1.1 Query Language – Section 6) is not supported.
Matching alternative patterns (UNION) or negating patterns (MINUS) (see SPARQL 1.1 Query Language – Section 7 and SPARQL 1.1 Query Language – Section 8) are not supported.
Property paths (see SPARQL 1.1 Query Language – Section 9) are not supported. When needed, SHACL property paths can be used instead.
Assignment functions (BIND and VALUES) (see SPARQL 1.1 Query Language – Section 10) are not supported. However, all functions behave as implicit BIND operations.
FILTER is not supported.
Aggregation functions (see SPARQL 1.1 Query Language – Section 11) and the associated aggregate algebra functions (see SPARQL 1.1 Query Language – Section 18.5) are not supported.
Solution sequences and modifiers (see SPARQL 1.1 Query Language – Section 15) are not supported.
SPARQL functions with a variable number of parameters (e.g., COALESCE) are modeled differently from the standard SPARQL structure. For limitations and implementation details, see the functions marked with * in Section 4.5 – SPARQL Functions.

To use the SPARQL functionalities listed above within sh:rule, it is recommended to use sh:SPARQLRule, which allows the definition of a full SPARQL query, rather than sh:TripleRule.

Examples

This section demonstrates how the SPSH ontology can be used inside sh:TripleRule constructs.

Instances of sh:TripleRule allow new triples to be generated from existing data in a dataset. By defining rule-based logic within an instance of sh:NodeShape, it is possible to derive additional information, compute new values, or materialise relationships that are implicitly present in the data.

In this approach, a sh:NodeShape targets a set of focus nodes of interest. The associated rules (sh:rule) are then evaluated for each focus node, and when the rule conditions are satisfied, new triples can be generated.

SHACL supports different types of rules for this purpose. The sh:TripleRule provides a structured way of creating triples by explicitly defining the subject, predicate, and object of the generated statement. In contrast, sh:SPARQLRule offers more flexibility, allowing arbitrary SPARQL CONSTRUCT queries to generate new triples.

In the examples below, the focus is on the structured approach using sh:TripleRule. These examples demonstrate how new data can be derived using predefined rule components together with standardised SPARQL functions provided by the SPSH ontology.

Derived/Computed property

Computing a property value from existing data

In many datasets, certain values can be derived from other properties rather than being stored directly. Instead of requiring these values to be explicitly provided in the data, they can be computed automatically using SHACL rule-based constructs. This approach can improve data consistency and reduce redundancy by ensuring that derived values are calculated in a uniform way.

In the example below, a rule computes a person’s age from their recorded birth year. The rule applies to all instances of ex:Person. For each focus node, the value of ex:age is calculated by subtracting the value of ex:birthYear from the current year. This is expressed using a sh:TripleRule, where the subject of the generated triple is the focus node (sh:this), the predicate is ex:age, and the object is computed using a chain of functions.

Once these triples are generated, they may also influence validation or querying. For example, shapes or queries that rely on the presence of ex:age can operate on the computed values even if the age was not explicitly stored in the original dataset. Whether this works depends on the execution model of the SHACL engine; for example, the rules must be executed before validation is performed.

ex:PersonShape
    rdf:type sh:NodeShape ;
    sh:rule ex:AgeComputation ;
    sh:targetClass ex:Person .
ex:AgeComputation
    rdf:type sh:TripleRule ;
    sh:subject sh:this ;
    sh:predicate ex:age ;
    sh:object [
        spsh:SUBTRACT (
            [ spsh:YEAR ( [ spsh:NOW ( ) ] ) ]
            [ sh:path ex:birthYear ]
        )
    ] .

ex:Bob
    rdf:type ex:Person ;
    ex:birthYear "1990"^^xsd:integer .
ex:Alice
    rdf:type ex:Person ;
    ex:birthYear "1990-01-01"^^xsd:date .

ex:Bob ex:age "36"^^xsd:integer .

PREFIX ex: <http://example.com/>
INSERT {
    ?this ex:age ?result . # ?this is the focus node
}
WHERE {
    BIND ( NOW ( ) AS ?result_1 ) .
    BIND ( YEAR ( ?result_1 ) AS ?result_2 ) .
    BIND ( $targetNode AS ?this ) . # $targetNode is pre-bound to ex:Bob
    ?this ex:birthYear ?result_3 .
    BIND ( ?result_2 - ?result_3 AS ?result ) .
}
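
The effect of the age rule can be emulated outside any SHACL engine. The Python sketch below is not a SHACL implementation: it models the data graph as a dictionary, takes the current year as an explicit parameter so the result is reproducible, and treats any birth year that is not an integer as a failed inference, which is presumably why ex:Alice (whose value is an xsd:date) receives no ex:age in the output above. The names compute_age and infer_ages are hypothetical.

```python
from typing import Dict, Optional


def compute_age(birth_year: object, current_year: int) -> Optional[int]:
    """Return current_year - birth_year, or None when the value is not an integer."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(birth_year, bool) or not isinstance(birth_year, int):
        return None
    return current_year - birth_year


def infer_ages(birth_years: Dict[str, object], current_year: int) -> Dict[str, int]:
    """Apply the rule to every focus node and keep only successful inferences."""
    ages: Dict[str, int] = {}
    for person, birth_year in birth_years.items():
        age = compute_age(birth_year, current_year)
        if age is not None:
            ages[person] = age
    return ages


if __name__ == "__main__":
    data = {"ex:Bob": 1990, "ex:Alice": "1990-01-01"}
    print(infer_ages(data, current_year=2026))
```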

Default value computation

Inferring a default value based on a property value

In many datasets, some properties are frequently left empty because their values are either obvious from context or follow a domain-wide convention. Instead of requiring users to fill in these values manually, SHACL rule-based constructs can automatically populate them with default values. These defaults can be static or derived dynamically from other properties. This approach ensures data completeness, improves consistency, and reduces errors in validation or querying.

In the example below, a shipping company applies the following rule to its orders: the shipping method defaults to standard unless the customer explicitly requests express shipping. However, for heavy parcels weighing 10 kilograms or more, the default becomes express. This scenario can be implemented as a sh:TripleRule, where the shipping method is computed dynamically using an IF function.

ex:OrderShape
    rdf:type sh:NodeShape ;
    sh:rule ex:ShippingMethodDefault ;
    sh:targetClass ex:Order .
ex:ShippingMethodDefault
    rdf:type sh:TripleRule ;
    sh:condition [
        sh:property [
            sh:path ex:shippingMethod ;
            sh:maxCount 0 ;
        ]
    ] ;
    sh:subject sh:this ;
    sh:predicate ex:shippingMethod ;
    sh:object [
        spsh:IF (
            [ spsh:GREATER_THAN_OR_EQUALS ( [ sh:path ex:weight ] 10 ) ]
            "express"
            "standard"
        )
    ] .

ex:Order_01
    rdf:type ex:Order ;
    ex:weight "5"^^xsd:integer .
ex:Order_02
    rdf:type ex:Order ;
    ex:weight "12"^^xsd:integer .
ex:Order_03
    rdf:type ex:Order ;
    ex:weight "3"^^xsd:integer ;
    ex:shippingMethod "express" .

ex:Order_01 ex:shippingMethod "standard" .
ex:Order_02 ex:shippingMethod "express" .

PREFIX ex: <http://example.com/>
INSERT {
    ?this ex:shippingMethod ?result .
}
WHERE {
    BIND ( $targetNode AS ?this ) .
    FILTER NOT EXISTS { ?this ex:shippingMethod ?shipping_method . } # Apply sh:condition.
    ?this ex:weight ?weight .
    BIND ( IF ( ?weight >= 10 , "express" , "standard" ) AS ?result ) .
}
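
The interplay between sh:condition and the IF function can be traced with a small Python emulation. This is only a sketch over dictionaries, not a SHACL engine: the function name default_shipping_method is hypothetical, and orders without a numeric weight are assumed to yield no inference.

```python
from typing import Dict, Optional


def default_shipping_method(order: Dict[str, object]) -> Optional[str]:
    """Return the inferred shipping method, or None when the rule does not fire."""
    # sh:condition with sh:maxCount 0: skip orders that already state a method.
    if "shippingMethod" in order:
        return None
    weight = order.get("weight")
    if isinstance(weight, bool) or not isinstance(weight, (int, float)):
        return None
    # IF ( weight >= 10 , "express" , "standard" )
    return "express" if weight >= 10 else "standard"


ORDERS = {
    "ex:Order_01": {"weight": 5},
    "ex:Order_02": {"weight": 12},
    "ex:Order_03": {"weight": 3, "shippingMethod": "express"},
}

if __name__ == "__main__":
    for name, order in ORDERS.items():
        method = default_shipping_method(order)
        if method is not None:
            print(name, "ex:shippingMethod", method)
```

ex:Order_03 produces no new triple because it already carries an explicit shipping method, which matches the output data graph above.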

Normalization

Unifying overlapping properties into a canonical property

In many datasets, the same conceptual value may be recorded in different properties, or in slightly different formats. This can happen when data comes from multiple sources, systems, or historical records. A normalization activity is the process of unifying these multiple representations into a single canonical property or format, ensuring consistency and simplifying querying, reporting, and downstream processing.

For example, a person’s last name might be recorded under different properties such as lastName, surname, and/or familyName. Rather than leaving these values scattered across the dataset, a normalization activity can select a single value according to a preferred order and store it in a canonical property, such as normalisedLastName. This ensures that queries and applications always see the same, consistent value for the last name, regardless of which original property it came from.

The example below shows how this can be implemented using the SPARQL COALESCE function, which automatically picks the first available value from the prioritized list of properties.

Note on SPARQL Function with multiple parameters

In SPSH, functions such as COALESCE, CONCAT, and IN accept only two parameters at a time. To handle more values with COALESCE, nest multiple COALESCE functions, as in the example below.

ex:PersonShape
    rdf:type sh:NodeShape ;
    sh:rule ex:FamilyNameComputation ;
    sh:targetClass ex:Person .
ex:FamilyNameComputation
    rdf:type sh:TripleRule ;
    sh:subject sh:this ;
    sh:predicate ex:normalisedLastName ;
    sh:object [
        spsh:COALESCE (
            [ sh:path ex:lastName ]
            [ spsh:COALESCE ( [ sh:path ex:surname ] [ sh:path ex:familyName ] ) ]
        )
    ] .

ex:Bob
    rdf:type ex:Person ;
    ex:surname "Doe" .
ex:Alice
    rdf:type ex:Person ;
    ex:fatherName "Doe" .

ex:Bob ex:normalisedLastName "Doe" .

PREFIX ex: <http://example.com/>
INSERT {
    ?this ex:normalisedLastName ?result .
}
WHERE {
    BIND ( $targetNode AS ?this ) .
    OPTIONAL { ?this ex:lastName ?lastName . }
    OPTIONAL { ?this ex:surname ?surname . }
    OPTIONAL { ?this ex:familyName ?familyName . }
    BIND ( COALESCE ( ?surname , ?familyName ) AS ?result_1 ) .
    BIND ( COALESCE ( ?lastName , ?result_1 ) AS ?result ) .
}
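
The nested two-parameter COALESCE used in this rule behaves like a priority list. The Python sketch below emulates it over dictionaries, with None standing in for an unbound value. It is an illustration only, and the names coalesce and normalised_last_name are hypothetical.

```python
from typing import Dict, Optional


def coalesce(first: Optional[str], second: Optional[str]) -> Optional[str]:
    """Two-parameter COALESCE: return the first value that is bound."""
    return first if first is not None else second


def normalised_last_name(person: Dict[str, str]) -> Optional[str]:
    """Pick lastName, then surname, then familyName, as in the rule above."""
    inner = coalesce(person.get("surname"), person.get("familyName"))
    return coalesce(person.get("lastName"), inner)


if __name__ == "__main__":
    bob = {"surname": "Doe"}
    alice = {"fatherName": "Doe"}
    print(normalised_last_name(bob))
    print(normalised_last_name(alice))
```

ex:Alice yields no value because ex:fatherName is not among the three properties considered by the rule, so no ex:normalisedLastName triple is generated for her.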

Denormalization

Rolling up ordered values using sequence-based predicates

In some datasets, related values are represented as separate nodes that contain an explicit order or sequence number. While this structure provides flexibility for modelling ordered lists, it may require additional joins or path expressions when querying the data.

A denormalisation activity can simplify access to such information by rolling up these values directly onto the main resource. Using SHACL rule-based constructs, new predicates can be constructed dynamically based on the sequence number of each related value.

In the example below, an order contains several items, each associated with a sequence number. A SHACL rule generates predicates such as ex:hasItem_1, ex:hasItem_2, and ex:hasItem_3, allowing the ordered items to be accessed directly from the order resource without traversing the intermediate nodes.

ex:ItemShape
    rdf:type sh:NodeShape ;
    sh:rule ex:ItemComputation ;
    sh:targetObjectsOf ex:hasItem .
ex:ItemComputation
    rdf:type sh:TripleRule ;
    sh:subject [ sh:path [ sh:inversePath ex:hasItem ] ] ;
    sh:predicate [
        spsh:IRI ( [
            spsh:CONCAT (
                [ spsh:STR ( ex:hasItem ) ]
                [ spsh:CONCAT ( "_" [ spsh:STR ( [ sh:path ex:sequence ] ) ] ) ]
            )
        ] )
    ] ;
    sh:object sh:this ;
.

ex:Order_1 ex:hasItem ex:Item_1 , ex:Item_2 , ex:Item_3 .
ex:Item_1 ex:sequence "1"^^xsd:integer .
ex:Item_2 ex:sequence "2"^^xsd:integer .
ex:Item_3 ex:sequence "3"^^xsd:integer .

ex:Order_1 ex:hasItem_1 ex:Item_1 .
ex:Order_1 ex:hasItem_2 ex:Item_2 .
ex:Order_1 ex:hasItem_3 ex:Item_3 .

PREFIX ex: <http://example.com/>
INSERT {
    ?subject ?predicate ?this .
}
WHERE {
    BIND ( $targetNode AS ?this ) .
    ?this ^ex:hasItem ?subject .
    BIND ( STR ( ex:hasItem ) AS ?result_1 ) .
    ?this ex:sequence ?sequence .
    BIND ( STR ( ?sequence ) AS ?result_2 ) .
    BIND ( CONCAT ( "_" , ?result_2 ) AS ?result_3 ) .
    BIND ( CONCAT ( ?result_1 , ?result_3 ) AS ?result_4 ) .
    BIND ( IRI ( ?result_4 ) AS ?predicate ) .
}
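
The construction of the sequence-based predicates can be followed step by step in a short Python emulation. This sketch works on plain dictionaries instead of an RDF graph, and the function name roll_up_items is hypothetical. Items without a sequence number are assumed to produce no triple.

```python
from typing import Dict, List, Tuple

EX = "http://example.com/"  # the ex: namespace used in this document


def roll_up_items(
    has_item: Dict[str, List[str]], sequence: Dict[str, int]
) -> List[Tuple[str, str, str]]:
    """Generate (order, predicate IRI, item) triples from sequence numbers."""
    triples: List[Tuple[str, str, str]] = []
    for order, items in has_item.items():
        for item in items:
            if item not in sequence:
                continue  # no ex:sequence value, so the predicate cannot be built
            # IRI ( CONCAT ( STR ( ex:hasItem ) , CONCAT ( "_" , STR ( sequence ) ) ) )
            predicate = EX + "hasItem" + "_" + str(sequence[item])
            triples.append((order, predicate, item))
    return triples


if __name__ == "__main__":
    orders = {"ex:Order_1": ["ex:Item_1", "ex:Item_2", "ex:Item_3"]}
    numbers = {"ex:Item_1": 1, "ex:Item_2": 2, "ex:Item_3": 3}
    for triple in roll_up_items(orders, numbers):
        print(triple)
```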