Recently I was asked by a customer to explain how ODM “ReferenceData” works.

As I do not use it much myself, I went back to the ODM v.1.3.1 specification. It states:

“Reference data provides information on how to interpret clinical data. For example, reference data might include lab normal ranges”.

That's it…

ReferenceData's structure is very similar to ClinicalData's structure, except that ReferenceData has no “SubjectData”, no “StudyEventData”, and no “FormData”. It only has “ItemGroupData” and “ItemData”. So essentially it is transporting 2-dimensional data, i.e. tables. And I think that is also its primary usage: transporting tables between applications, containing data for which there are no subjects, no visits and no forms.

So I started creating a very simple example:

  <ItemGroupData ItemGroupOID="IG.VSNORMALVALUES">
      <ItemData ItemOID="IT.DIABP_NORM" Value="80">
          <MeasurementUnitRef MeasurementUnitOID="MU.MMHG"/>
      </ItemData>
      <ItemData ItemOID="IT.SYSBP_NORM" Value="120">
          <MeasurementUnitRef MeasurementUnitOID="MU.MMHG"/>
      </ItemData>
      <ItemData ItemOID="IT.HR_NORM" Value="70">
          <MeasurementUnitRef MeasurementUnitOID="MU.BPM"/>
      </ItemData>
  </ItemGroupData>
  

Of course a reference data point usually only makes sense when it has units attached, otherwise we do not know exactly what we are talking about.

So, what does this say? Not much, except that it represents SOME table data. There is little information in it, as the OIDs are arbitrary, so we cannot rely on them to have a meaning. For that we need metadata. Now the specification also states:

“Since reference data can be independent of any particular study, it may be desirable to keep the reference metadata separate from clinical metadata. This can be done by creating a Study element with no Protocol, StudyEventDef, or FormDef elements. All the ItemGroupDefs would have IsReferenceData=Yes. Such a study would have no clinical data.”

OK, let's do so. We create a study with OID “ReferenceDataStudy”, with a MetaDataVersion “MV.REFDATA” which ONLY contains ItemGroupDef elements and ItemDef elements, and maybe “CodeList” elements. For our example, such a file could look like (ODM element omitted here):

  <Study OID="ReferenceDataStudy">
      <GlobalVariables>...</GlobalVariables>
      <BasicDefinitions>
          <MeasurementUnit OID="MU.MMHG" Name="millimeter mercury">
              <Symbol>
                  <TranslatedText>mmHg</TranslatedText>
              </Symbol>
          </MeasurementUnit>
          <MeasurementUnit OID="MU.BPM" Name="beats per minute">
              <Symbol>
                  <TranslatedText>beats/min</TranslatedText>
              </Symbol>
          </MeasurementUnit>
      </BasicDefinitions>
      <MetaDataVersion OID="MV.REFDATA" Name="Study-metadata only containing reference data definitions">
          <!-- we only define ItemGroups here all with IsReferenceData="Yes" -->
          <ItemGroupDef OID="IG.VSNORMALVALUES" Name="Vital Signs Normal Values" Repeating="No" IsReferenceData="Yes">
              <ItemRef ItemOID="IT.SYSBP_NORM" Mandatory="No"/>
              <ItemRef ItemOID="IT.DIABP_NORM" Mandatory="No"/>
              <ItemRef ItemOID="IT.HR_NORM" Mandatory="No"/>
          </ItemGroupDef>
          <!-- corresponding ItemDefs -->
          <ItemDef OID="IT.SYSBP_NORM" Name="Normal Systolic Blood Pressure" DataType="integer" Length="3">               
              <MeasurementUnitRef MeasurementUnitOID="MU.MMHG"/>
          </ItemDef>
          <ItemDef OID="IT.DIABP_NORM" Name="Normal Diastolic Blood Pressure" DataType="integer" Length="3">
              <MeasurementUnitRef MeasurementUnitOID="MU.MMHG"/>
          </ItemDef>
          <ItemDef OID="IT.HR_NORM" Name="Normal Heart Rate" DataType="integer" Length="3">
              <MeasurementUnitRef MeasurementUnitOID="MU.BPM"/>
          </ItemDef>
      </MetaDataVersion>
  </Study>

So this “study” only has definitions of reference data, no actual clinical data. Of course we can add the reference data in the same file, resulting in something like (only the main things displayed):

  <Study OID="ReferenceDataStudy">
      <MetaDataVersion OID="MV.REFDATA" Name="Study-metadata only containing reference data definitions">
          <!-- here come the definitions as shown before -->
      </MetaDataVersion>
  </Study>
  <ReferenceData StudyOID="ReferenceDataStudy" MetaDataVersionOID="MV.REFDATA">
      <!-- here come the ItemGroupData elements, as shown before -->
  </ReferenceData>
  

If we also want to transport the reference data together with a real study, we can easily do so for both the definitions and the reference data themselves, with the definitions being incorporated through the “Include” element:

  <Study OID="MyRealStudy">
      <MetaDataVersion OID="MV.001" Name="first metadata version of study MyRealStudy">
          <Include StudyOID="ReferenceDataStudy" MetaDataVersionOID="MV.REFDATA" />
          <!-- here come all the normal Protocol, StudyEventDef, FormDef,
          ItemGroupDef, ItemDef, CodeList, ConditionDef, MethodDef, ... elements
          containing the design of our study -->
      </MetaDataVersion>
  </Study>
  <ReferenceData StudyOID="MyRealStudy" MetaDataVersionOID="MV.001">
      <!-- here come the ItemGroupData elements, as shown before -->
  </ReferenceData>
  <!-- possibly clinical data too ... -->
  <ClinicalData StudyOID="MyRealStudy" MetaDataVersionOID="MV.001">
      <!-- here come all subject related clinical data -->
  </ClinicalData>
  

Note that “ReferenceData” now references the Study with OID “MyRealStudy”, because the reference data definitions have also been imported into study “MyRealStudy”.

What can we do with this?

Not much, except filling a table in the receiving system's database.
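
Filling such a table could be sketched as follows in Python. This is only an illustration, not a reference implementation: it assumes a single ODM file in the ODM 1.3 namespace in which the ItemData OIDs match the ItemDef OIDs, and the function name `reference_table` and the shortened sample document are made up for this post.

```python
import xml.etree.ElementTree as ET

# Namespace of ODM 1.3.x documents.
NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Shortened sample document (assumption: one item only, for brevity).
ODM_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="ReferenceDataStudy">
    <BasicDefinitions>
      <MeasurementUnit OID="MU.MMHG" Name="millimeter mercury">
        <Symbol><TranslatedText>mmHg</TranslatedText></Symbol>
      </MeasurementUnit>
    </BasicDefinitions>
    <MetaDataVersion OID="MV.REFDATA" Name="Reference data definitions">
      <ItemGroupDef OID="IG.VSNORMALVALUES" Name="Vital Signs Normal Values"
                    Repeating="No" IsReferenceData="Yes">
        <ItemRef ItemOID="IT.SYSBP_NORM" Mandatory="No"/>
      </ItemGroupDef>
      <ItemDef OID="IT.SYSBP_NORM" Name="Normal Systolic Blood Pressure"
               DataType="integer" Length="3"/>
    </MetaDataVersion>
  </Study>
  <ReferenceData StudyOID="ReferenceDataStudy" MetaDataVersionOID="MV.REFDATA">
    <ItemGroupData ItemGroupOID="IG.VSNORMALVALUES">
      <ItemData ItemOID="IT.SYSBP_NORM" Value="120">
        <MeasurementUnitRef MeasurementUnitOID="MU.MMHG"/>
      </ItemData>
    </ItemGroupData>
  </ReferenceData>
</ODM>"""


def reference_table(odm_text):
    """Return (item name, value, unit symbol) rows for all ReferenceData."""
    root = ET.fromstring(odm_text)
    studies = {s.get("OID"): s for s in root.findall("odm:Study", NS)}
    rows = []
    for ref in root.findall("odm:ReferenceData", NS):
        study = studies[ref.get("StudyOID")]
        # Unit symbols live in the study's BasicDefinitions.
        symbols = {
            unit.get("OID"): unit.findtext(
                "odm:Symbol/odm:TranslatedText", namespaces=NS)
            for unit in study.findall(
                "odm:BasicDefinitions/odm:MeasurementUnit", NS)
        }
        # Item names live in the MetaDataVersion the ReferenceData points to.
        names = {}
        for mdv in study.findall("odm:MetaDataVersion", NS):
            if mdv.get("OID") == ref.get("MetaDataVersionOID"):
                names = {d.get("OID"): d.get("Name")
                         for d in mdv.findall("odm:ItemDef", NS)}
        for item in ref.findall("odm:ItemGroupData/odm:ItemData", NS):
            oid = item.get("ItemOID")
            unit_ref = item.find("odm:MeasurementUnitRef", NS)
            unit = None
            if unit_ref is not None:
                unit = symbols.get(unit_ref.get("MeasurementUnitOID"))
            # Fall back to the bare OID when no ItemDef is found.
            rows.append((names.get(oid, oid), item.get("Value"), unit))
    return rows


for row in reference_table(ODM_XML):
    print(row)
```

The sketch does not follow “Include” elements, so definitions imported from another study would show up with their bare OIDs only.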

What is missing is a formal relationship between a measured item and reference concepts such as its “normal value”, “upper expected value” and “lower expected value”. Of course we could rely on naming conventions like:

  <ItemDef OID="IT.SYSBP" Name="Really measured subject's systolic blood pressure" ...>...</ItemDef>
  

But I do not like that at all, as it is implementation-dependent.

Or for normal ranges, in the “ReferenceData” section:

  <ItemData ItemOID="IT.SYSBP_LOWER_LIMIT" Value="100" />
  <ItemData ItemOID="IT.SYSBP_HIGHER_LIMIT" Value="140" />
  

However, for a real implementation, we do have the “RangeCheck” element on “ItemDef”. For example:

  <ItemDef OID="IT.SYSBP" Name="Systolic blood pressure" ...>
      <RangeCheck Comparator="GE" SoftHard="Soft">
          <CheckValue>100</CheckValue>
          <MeasurementUnitRef MeasurementUnitOID="MU.MMHG"/>
          <ErrorMessage><TranslatedText>The value is below 100, are you sure?</TranslatedText></ErrorMessage>
      </RangeCheck>
      <RangeCheck Comparator="LE" SoftHard="Soft">
          <CheckValue>140</CheckValue>
          <MeasurementUnitRef MeasurementUnitOID="MU.MMHG"/>
          <ErrorMessage><TranslatedText>The value is above 140, are you sure?</TranslatedText></ErrorMessage>
      </RangeCheck>
  </ItemDef>

stating that the value for the blood pressure is expected to be between 100 and 140 mmHg, otherwise a warning is issued (“soft”).
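
How a data entry system might evaluate such checks can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the snippet is parsed without an XML namespace, only the simple comparators are handled (“IN” and “NOTIN” are left out), units are ignored, and the function name `failed_checks` is invented for this illustration.

```python
import operator
import xml.etree.ElementTree as ET

# ODM comparator codes mapped to Python operators (IN/NOTIN omitted here).
COMPARATORS = {
    "LT": operator.lt, "LE": operator.le,
    "GT": operator.gt, "GE": operator.ge,
    "EQ": operator.eq, "NE": operator.ne,
}

ITEMDEF_XML = """<ItemDef OID="IT.SYSBP" Name="Systolic blood pressure" DataType="integer">
  <RangeCheck Comparator="GE" SoftHard="Soft">
    <CheckValue>100</CheckValue>
    <ErrorMessage><TranslatedText>The value is below 100, are you sure?</TranslatedText></ErrorMessage>
  </RangeCheck>
  <RangeCheck Comparator="LE" SoftHard="Soft">
    <CheckValue>140</CheckValue>
    <ErrorMessage><TranslatedText>The value is above 140, are you sure?</TranslatedText></ErrorMessage>
  </RangeCheck>
</ItemDef>"""


def failed_checks(item_def, value):
    """Return (SoftHard, message) for every RangeCheck the value violates."""
    failures = []
    for check in item_def.findall("RangeCheck"):
        compare = COMPARATORS[check.get("Comparator")]
        limit = float(check.findtext("CheckValue"))
        # A RangeCheck states what a valid value must satisfy.
        if not compare(value, limit):
            message = check.findtext("ErrorMessage/TranslatedText")
            failures.append((check.get("SoftHard"), message))
    return failures


item_def = ET.fromstring(ITEMDEF_XML)
print(failed_checks(item_def, 150))
```

A value of 120 passes both checks and yields an empty list, whereas 150 violates only the “LE 140” check.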

The use case may however differ from that of “ReferenceData”: “RangeCheck” is probably mostly used for checks at the data entry level, whereas “ReferenceData” is meant to help interpret the data afterwards.

Maybe we should extend ItemDef with something as follows (and maybe EDC/CTMS vendors already have done so):

  <ItemDef OID="IT.SYSBP" Name="Systolic blood pressure" ...>
      <!-- the usual RangeCheck elements for data entry checking -->
      <RangeCheck ...>...</RangeCheck>
      <RangeCheck ...>...</RangeCheck>
      <!-- new stuff referencing "ReferenceData" -->
      <ReferenceDataRef ReferenceDataOID="IT.SYSBP_NORM" Comparator="EQ" Interpretation="Normal"/>
      <ReferenceDataRef ReferenceDataOID="IT.SYSBP_LOWER_LIMIT" Comparator="LE" Interpretation="Low"/>
      <ReferenceDataRef ReferenceDataOID="IT.SYSBP_HIGHER_LIMIT" Comparator="GE" Interpretation="High"/>
  </ItemDef>

The “ReferenceDataRef” would just state that there is some predefined reference data about how the data should be interpreted.
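
To make the idea a bit more concrete, here is a rough Python sketch of how a receiving system might use such references. Everything in it is hypothetical: “ReferenceDataRef” is not part of ODM, the reading “measured value, Comparator, reference value” is only my assumption about its semantics, and the function name `interpret` and the `REFERENCE_VALUES` dictionary (standing in for values loaded from “ReferenceData”) are invented for this illustration.

```python
import operator
import xml.etree.ElementTree as ET

COMPARATORS = {
    "LT": operator.lt, "LE": operator.le,
    "GT": operator.gt, "GE": operator.ge,
    "EQ": operator.eq, "NE": operator.ne,
}

# Hypothetical extension element, parsed here without a namespace.
ITEMDEF_XML = """<ItemDef OID="IT.SYSBP" Name="Systolic blood pressure">
  <ReferenceDataRef ReferenceDataOID="IT.SYSBP_NORM" Comparator="EQ" Interpretation="Normal"/>
  <ReferenceDataRef ReferenceDataOID="IT.SYSBP_LOWER_LIMIT" Comparator="LE" Interpretation="Low"/>
  <ReferenceDataRef ReferenceDataOID="IT.SYSBP_HIGHER_LIMIT" Comparator="GE" Interpretation="High"/>
</ItemDef>"""

# Stand-in for values that would be read from the ReferenceData section.
REFERENCE_VALUES = {
    "IT.SYSBP_NORM": 120,
    "IT.SYSBP_LOWER_LIMIT": 100,
    "IT.SYSBP_HIGHER_LIMIT": 140,
}


def interpret(item_def, value, reference_values):
    """Return the Interpretation of every reference the value matches."""
    matches = []
    for ref in item_def.findall("ReferenceDataRef"):
        compare = COMPARATORS[ref.get("Comparator")]
        reference = reference_values[ref.get("ReferenceDataOID")]
        # Assumed semantics: "value <Comparator> reference" holds.
        if compare(value, reference):
            matches.append(ref.get("Interpretation"))
    return matches


item_def = ET.fromstring(ITEMDEF_XML)
print(interpret(item_def, 150, REFERENCE_VALUES))
```

With these semantics a value of 130 matches none of the three references, which shows that an “EQ” comparison against a single normal value is probably too strict and that a range would be needed in practice.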

Now, these are just some first, unpolished ideas, so any comments are highly appreciated. Please send them to Jozef.Aerts-at-XML4Pharma.com.