[Figure: sample instance data. Source record :p1234567 (Patient Аника, PatientID 1234567) with two BloodPressureReading entries, each with systolic 110 mmHg and diastolic 70 mmHg. Target record :Patient-1234567 (personName Аника, personID 1234567) with observations :Patient-1234567.Obs1 and :Patient-1234567.Obs2, each a BloodPressure observation for subjectID 1234567 with related SystolicBloodPressure (110 mmHg) and DiastolicBloodPressure (70 mmHg) quantities.]

This document motivates intuitive bridges between conventional clinical data representations and domain-specific ontologies that will be useful for knowledge/rule capture.

Generic Observations

Most clinical data exchange denormalizes structured observations like blood pressure, APGAR, full blood count panel, etc. into a constellation of observations. This encodes the semantics of the observation structure in conventions of terminology codes, e.g. 75367002 | Blood pressure |, 271649006 | Systolic blood pressure | and 271650006 | Diastolic blood pressure |. Linkage between these is captured by over-general predicates in the information model, e.g. fhir:related or rim:COMP - has component.

RIM reuses Act and ActRelationship to capture these structures, providing a richer vocabulary of relationships in the ActRelationship type codes. The body site or device for a blood pressure measurement is attached to the blood pressure observation by such type codes. Diagnostic processes like evidence and causality are captured in the type codes EVID - provides evidence for and CAUS - is etiology for. FHIR has specialized relationships to capture some structural relationships like body site or device.

Even if we imposed a more complex information model for structured observations or treatment processes, there would always be some stuff for which there was no defined model. The semweb story "just invent some stuff and maybe it will get popular" isn't well suited to either the skills of the information wranglers or conventional paper-oriented clinical and legal processes.

Domain Models

Why do we need domain models?

Domain models will likely be more principled in their design and will definitely be more intuitive than the physical models involving constellations of observations. A DAM (Domain Analysis Model) is an example of a domain model tailored towards capturing the aspects required for analysis, e.g., aspects pertinent to workflows or business processes. A class and relationship hierarchy such as that implemented in ActRelationship type codes will be more simply expressed in an RDF ontology and will leverage existing RDF tooling for model and example verification.

Available DAMs

Projects like CIMI engage clinicians in the development of intuitive domain models, e.g. a blood pressure as a structure with measurements of systolic and diastolic pressure, and maybe some other stuff like posture or device. Here are a few CIMI models:

DAM as Ontology

Ideally, development of little DAMs would include identifying all of the intersections between them, creating a more comprehensive DAM and an ontology of clinical artifacts. For example, the body mass used in an estimated glomerular filtration rate is the same as the body mass used in a body mass index. A useful clinical ontology in RDF would capture that by reusing the same identifier wherever the same concept is reused.

Mappings

Why do we need mappings?

Mapping mechanism?

Because the models are described as ShEx schemas, the mappings between them are captured as shared "variables" in a %map:{ %} extension. These variables are given full URLs, which enables trivial disambiguation and leverages standard prefix conventions for easier lexical categorization.
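A minimal sketch, in Python, of how shared variables could tie two schemas together. The two dictionaries are hand-written summaries of which property path carries which %map variable in the BP units DAM and BP FHIR shapes of the blood pressure example; the dictionary layout, the function name and the URL chosen for the bp: prefix are all assumptions made for illustration, not part of ShEx.

```python
# Hypothetical namespace for the bp: prefix; the real URL is not given here.
BP = "http://example.org/bp#"

# Property path -> mapping variable, transcribed by hand from the
# %map:{ %} annotations in the blood pressure example.
DAM_VARS = {
    (":systolic", ":value"): BP + "sysVal",
    (":systolic", ":units"): BP + "sysUnits",
    (":diastolic", ":value"): BP + "diaVal",
    (":diastolic", ":units"): BP + "diaUnits",
}
FHIR_VARS = {
    ("<sysBP>", "fhir:valueQuantity", "fhir:value"): BP + "sysVal",
    ("<sysBP>", "fhir:valueQuantity", "fhir:units"): BP + "sysUnits",
    ("<diaBP>", "fhir:valueQuantity", "fhir:value"): BP + "diaVal",
    ("<diaBP>", "fhir:valueQuantity", "fhir:units"): BP + "diaUnits",
}


def correspondences(source_vars, target_vars):
    """Pair each source path with the target path carrying the same variable URL."""
    target_by_var = {var: path for path, var in target_vars.items()}
    return {
        path: target_by_var[var]
        for path, var in source_vars.items()
        if var in target_by_var
    }


fhir_to_dam = correspondences(FHIR_VARS, DAM_VARS)
```

Because the variables are full URLs, two schemas written independently only line up where their authors deliberately chose the same identifier.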

Blood pressure example

BP DAM

BP units DAM
<BPunitsDAM> {
    :systolic {
        :value xsd:float %map:{ bp:sysVal %},
        :units xsd:string %map:{ bp:sysUnits %}
    },
    :diastolic {
        :value xsd:float %map:{ bp:diaVal %},
        :units xsd:string %map:{ bp:diaUnits %}
    }
}
            
BP normalized DAM
<BPnormalizeDAM> {
    a (:CanonicalBloodPressure),
    :systolicBPmmHg xsd:float %map:{ cast(bp:sysVal, bp:sysUnits, "mmHg") %},
    :diastolicBPmmHg xsd:float %map:{ cast(bp:diaVal, bp:diaUnits, "mmHg") %}
}
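The semantics of cast are not spelled out here. One plausible reading, sketched in Python below, is that it takes a value, the units the value is expressed in, and the desired units, and returns the converted value. The conversion table and the error handling are assumptions; only the three-argument call pattern comes from the schema above.

```python
# Conversion factors to mmHg. The kPa factor is the standard one
# (1 kPa is about 7.50062 mmHg); the table itself is illustrative.
TO_MMHG = {"mmHg": 1.0, "kPa": 7.50062}


def cast(value, from_units, to_units):
    """Convert a pressure value between the units listed in TO_MMHG."""
    if from_units not in TO_MMHG or to_units not in TO_MMHG:
        raise ValueError(f"unsupported units: {from_units!r} -> {to_units!r}")
    return value * TO_MMHG[from_units] / TO_MMHG[to_units]


# Variable bindings as they might come out of validating a source document.
bindings = {
    "bp:sysVal": 110.0, "bp:sysUnits": "mmHg",
    "bp:diaVal": 70.0, "bp:diaUnits": "mmHg",
}

normalized = {
    ":systolicBPmmHg": cast(bindings["bp:sysVal"], bindings["bp:sysUnits"], "mmHg"),
    ":diastolicBPmmHg": cast(bindings["bp:diaVal"], bindings["bp:diaUnits"], "mmHg"),
}
```

Rejecting unknown units rather than passing the value through unchanged keeps a normalized record from silently carrying a number in the wrong scale.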
            

BP FHIR

<BPfhir> {
    a (fhir:Observation)?,
    fhir:coding { fhir:code (sct:Blood_Pressure) },
    fhir:related { fhir:type ("has-component"), fhir:target @<sysBP> },
    fhir:related { fhir:type ("has-component"), fhir:target @<diaBP> }
}
<sysBP> {
    a (fhir:Observation)?,
    fhir:coding { fhir:code (sct:Systolic_Blood_Pressure) },
    fhir:valueQuantity {
        a (fhir:Quantity)?,
        fhir:value xsd:float %map:{ bp:sysVal %},
        fhir:units xsd:string %map:{ bp:sysUnits %}
    }
}
<diaBP> {
    a (fhir:Observation)?,
    fhir:coding { fhir:code (sct:Diastolic_Blood_Pressure) },
    fhir:valueQuantity {
        a (fhir:Quantity)?,
        fhir:value xsd:float %map:{ bp:diaVal %},
        fhir:units xsd:string %map:{ bp:diaUnits %}
    }
}

Mechanism

This is very rough; don't work hard to look for cleverness where it's probably just wrong.

  1. Validate the instance document against the source schema (say BP FHIR). This produces variable bindings.
  2. Invoke generate with the start shape of the target schema (say BP DAM) and a fresh bnode.
  3. Within generate, the shape is S and the current subject is B.
  4. For each property P of S:
    1. If P is a reference to shape R, create a fresh bnode Bchild, assert (B, P, Bchild), and invoke generate with the shape R and the subject Bchild.
    2. If P is a leaf node (scalar value) that has a variable, and the variable is bound to V, assert (B, P, V).
  5. The resulting graph of assertions should validate against the target schema.
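Steps 2 through 4 can be sketched in Python as follows. This is a rough illustration under several assumptions: the target schema is held as a plain dictionary rather than parsed ShEx, the nested shapes of the BP units DAM are given the hypothetical names "Systolic" and "Diastolic", the bindings from step 1 are supplied by hand, and the final validation of step 5 is left out.

```python
import itertools

_ids = itertools.count()


def fresh_bnode():
    """Return a new blank node label."""
    return f"_:b{next(_ids)}"


# Hypothetical in-memory form of the BP units DAM. Each shape maps to a list
# of (property, kind, payload); payload is a shape name for "ref" entries
# and a mapping variable for "leaf" entries.
BP_DAM = {
    "BPunitsDAM": [(":systolic", "ref", "Systolic"),
                   (":diastolic", "ref", "Diastolic")],
    "Systolic": [(":value", "leaf", "bp:sysVal"),
                 (":units", "leaf", "bp:sysUnits")],
    "Diastolic": [(":value", "leaf", "bp:diaVal"),
                  (":units", "leaf", "bp:diaUnits")],
}


def generate(schema, shape, subject, bindings, graph):
    """Walk shape S with subject B, asserting triples into graph."""
    for prop, kind, payload in schema[shape]:
        if kind == "ref":
            # Step 4.1: shape reference, so recurse on a fresh bnode.
            child = fresh_bnode()
            graph.add((subject, prop, child))
            generate(schema, payload, child, bindings, graph)
        elif payload in bindings:
            # Step 4.2: leaf whose variable is bound to a value.
            graph.add((subject, prop, bindings[payload]))
    return graph


# Bindings assumed to come from validating a BP FHIR instance (step 1).
bindings = {"bp:sysVal": 110.0, "bp:sysUnits": "mmHg",
            "bp:diaVal": 70.0, "bp:diaUnits": "mmHg"}

root = fresh_bnode()
graph = generate(BP_DAM, "BPunitsDAM", root, bindings, set())
```

A leaf whose variable is unbound is skipped silently in this sketch, which means a required property could go missing; catching that is what the validation in step 5 would be for.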

Multiplicity

Realistic mapping use cases involve a mixture of unary and n-ary properties. For instance, a patient record may contain a sequence of vitals like blood pressure. Taking FHIR again as the target schema, mapping a single source record to multiple target records requires repeating some properties in each instantiation of the repeated structure. The example below shows sample instance data in both textual and tree representations. The trees are essentially the validation result format defined in ShExJ Validation Results.

Legend: properties in schema matched or implied triples

This can be viewed as a mapping between two trees:

  • patient record: :p1234567 and :Patient-1234567
  • blood pressure reading 1: :Patient-1234567.Obs1
  • blood pressure reading 2: :Patient-1234567.Obs2

Materialization of :Patient-1234567 entailed mapping the unary properties from :p1234567. This could be accomplished by ignoring n-ary properties like BloodPressureReading or with some form of externally-supplied cut rule passed to the generate function.

Materialization of :Patient-1234567.Obs1 and :Patient-1234567.Obs2 entailed repeated instantiation of the PatientID property. Possible rules for this mapping:

  • Instantiate a shape n times where there are n unique trees providing bindings, i.e. reading 1 and reading 2.
  • Repeat variables bound exactly once (or n times?) in cousins.
  • If there's a mismatch in the cardinalities of required attributes, reject.
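The rules above can be tried out on the sample record for patient 1234567 with a small Python sketch. It hard-codes the source and target property names from the sample data instead of reading them from schemas, and it treats a reading that lacks either component as a cardinality mismatch; both choices, and the function names, are assumptions made for illustration.

```python
RECORD = {
    "Patient": "Аника",
    "PatientID": "1234567",
    "BloodPressureReading": [
        {"Systolic": {"SystolicUnits": "mmHg", "SystolicValue": 110},
         "Diastolic": {"DiastolicUnits": "mmHg", "DiastolicValue": 70}},
        {"Systolic": {"SystolicUnits": "mmHg", "SystolicValue": 110},
         "Diastolic": {"DiastolicUnits": "mmHg", "DiastolicValue": 70}},
    ],
}


def component(code, units, value):
    """Build one related observation carrying a quantity."""
    return {"code": code, "quantity": {"units": units, "value": value}}


def instantiate_per_reading(record):
    """Emit one target observation per source reading."""
    observations = []
    for reading in record["BloodPressureReading"]:
        # Rule 3: reject when a required component is missing.
        if "Systolic" not in reading or "Diastolic" not in reading:
            raise ValueError("reading lacks a required component")
        systolic, diastolic = reading["Systolic"], reading["Diastolic"]
        observations.append({
            "code": "BloodPressure",
            # Rule 2: PatientID is bound once and repeated in every instance.
            "subjectID": record["PatientID"],
            "related": [
                component("SystolicBloodPressure",
                          systolic["SystolicUnits"], systolic["SystolicValue"]),
                component("DiastolicBloodPressure",
                          diastolic["DiastolicUnits"], diastolic["DiastolicValue"]),
            ],
        })
    return observations


observations = instantiate_per_reading(RECORD)
```

Here the loop over BloodPressureReading plays the role of rule 1: the number of target observations follows the number of source subtrees that provide bindings.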

Issues

ideas from Claude Nanjo and Mohammad Hekmatnejad:
edited by: Eric Prud'hommeaux