=== The "Include" mechanism in CDISC ODM Study descriptions ===
Author: Jozef Aerts, XML4Pharma
Applicable to: ODM version 1.3, version 1.2
== Introduction ==
The first possible element within the “MetaDataVersion” element of an ODM file containing a study design (i.e. lists of possible visits, forms, subforms, questions etc.) is the “Include” element.
The CDISC ODM 1.3 specification says about this element: "**//A reference to a prior metadata version. This version must be present earlier in the same ODM file or in a previous file in the series//**".
Nice, but what does this mean? Is it just a reference to a previously defined “MetaDataVersion”, without any consequences, or does the name “Include” suggest that the previously defined “MetaDataVersion” should actually be included here?
This Use Case document explains how the “Include” mechanism works, and presents a number of use cases where the use of the “Include” element is recommended or can be useful.
== Previous MetaDataVersions ==
The “Include” element has two attributes. These are:
StudyOID: the OID of the current Study or of another Study
MetaDataVersionOID: the reference to the previously defined MetaDataVersion
Let us first concentrate on the second attribute.
In ODM, a “MetaDataVersion” essentially describes a version of the study design. As in any versioning system, versions are defined one after another. So one may have a version 1, e.g. with OID “MV.001”, and a second version, e.g. with OID “MV.002”.
So the first rule (“This version must be present earlier in the same ODM file or in a previous file in the series”) means that the second version (MV.002) can have an Include element pointing to the previous version (MV.001), but not vice versa.
The previously defined MetaDataVersion can be in the same file (**before** the current one), or in another file that was created before. So for example, one can have:
A first version of the metadata (MV.001) was created containing only “ItemDef” elements, and then the second version “MV.002” includes it. Note that the “ItemDef” elements do not have “Question” subelements yet, as the designer has not yet decided on the exact phrasing of the questions.
This means that there is now a second version which already includes the first version (Russian Doll system).
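As a rough sketch of the two versions just described, the fragment below shows a first version holding bare ItemDef elements and a second version referencing it. Only the OIDs “MV.001” and “MV.002” come from the text; the study OID “ST.001” and the ItemDef OIDs, names and datatypes are assumptions made for illustration.

```xml
<!-- Illustrative fragment; ST.001 and the IT.* definitions are assumed -->
<Study OID="ST.001">
  <!-- GlobalVariables and BasicDefinitions omitted for brevity -->
  <MetaDataVersion OID="MV.001" Name="Version 1">
    <!-- ItemDefs only, without Question subelements yet -->
    <ItemDef OID="IT.AGE" Name="Age" DataType="integer" Length="3"/>
    <ItemDef OID="IT.SEX" Name="Sex" DataType="text" Length="1"/>
  </MetaDataVersion>
  <MetaDataVersion OID="MV.002" Name="Version 2">
    <!-- Points back to a version defined earlier, never to a later one -->
    <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  </MetaDataVersion>
</Study>
```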
One can now also have a third version of the metadata, including the second one, e.g.:
[ODM snippet here; surviving text: "Common", "General", "Algemein"]
The second metadata version defines some subforms (ItemGroups) and references some questions (Items) that were already defined in the first metadata version.
Similarly, the third metadata version could define more ItemGroups, referencing ItemDef elements from the first metadata version (as the third includes the second which includes the first ...), or it could e.g. define forms (using the FormDef element), and reference ItemGroups (subforms) that have been defined in the second metadata version. So essentially, one could set up a chain of metadata versions, each one including a previous one.
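Such a chain could look like the following sketch, in which each version only adds its own definitions. The study OID “ST.001”, the ItemGroupDef and FormDef OIDs and names, and the referenced ItemDefs (assumed to be defined in MV.001) are hypothetical.

```xml
<!-- Illustrative fragment; all OIDs except MV.001 and MV.002 are assumed -->
<MetaDataVersion OID="MV.002" Name="Version 2">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  <ItemGroupDef OID="IG.COMMON" Name="Common" Repeating="No">
    <!-- IT.AGE and IT.SEX are assumed to be defined in MV.001 -->
    <ItemRef ItemOID="IT.AGE" OrderNumber="1" Mandatory="Yes"/>
    <ItemRef ItemOID="IT.SEX" OrderNumber="2" Mandatory="Yes"/>
  </ItemGroupDef>
</MetaDataVersion>
<MetaDataVersion OID="MV.003" Name="Version 3">
  <!-- Including MV.002 indirectly includes MV.001 as well -->
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.002"/>
  <FormDef OID="FM.001" Name="Demographics" Repeating="No">
    <ItemGroupRef ItemGroupOID="IG.COMMON" OrderNumber="1" Mandatory="Yes"/>
  </FormDef>
</MetaDataVersion>
```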
[Russian doll image here]
So, if we have software that **implements** the Include mechanism, the resulting **output file** would look like:
[ODM snippet here; surviving text: "Common", "General", "Algemein"]
So, the third metadata version has “included” the two previous ones and assimilated their contents.
Note that the elements have also been regrouped, as “FormDef” needs to come before “ItemGroupDef”, which needs to come before “ItemDef”.
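A resolved third version might look roughly like the sketch below. The definitions shown are hypothetical, and whether a given tool keeps or drops the “Include” element in its output is an assumption here (it is dropped in this sketch). What matters is the regrouped element order.

```xml
<!-- Illustrative resolved output; all definitions are assumed examples -->
<MetaDataVersion OID="MV.003" Name="Version 3">
  <!-- FormDef (here assumed to come from MV.003 itself) -->
  <FormDef OID="FM.001" Name="Demographics" Repeating="No">
    <ItemGroupRef ItemGroupOID="IG.COMMON" OrderNumber="1" Mandatory="Yes"/>
  </FormDef>
  <!-- ItemGroupDef (assumed to be inherited from MV.002) -->
  <ItemGroupDef OID="IG.COMMON" Name="Common" Repeating="No">
    <ItemRef ItemOID="IT.AGE" OrderNumber="1" Mandatory="Yes"/>
    <ItemRef ItemOID="IT.SEX" OrderNumber="2" Mandatory="Yes"/>
  </ItemGroupDef>
  <!-- ItemDefs (assumed to be inherited from MV.001) -->
  <ItemDef OID="IT.AGE" Name="Age" DataType="integer" Length="3"/>
  <ItemDef OID="IT.SEX" Name="Sex" DataType="text" Length="1"/>
</MetaDataVersion>
```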
This brings us to a first use case:
== Case: Using ODM libraries ==
Companies will not want to put up a list of questions (to appear on CRFs) each time they define a study. Instead, they will want to have a library of questions from which they can select. This library can be an ODM file (or an export from a database into an ODM file).
For example, a company may have a standard list of questions to appear in certain studies, let us say oncology studies. They can copy the ItemDefs for such studies into the ODM file, or alternatively, they can just add an “Include” element to the MetaDataVersion element of the specific study, e.g.:
...
...
The first version of the metadata for the specific oncology study “loads” all ItemDefs from another metadata version (the general library with all questions that can be used in oncology studies), and then references these in “ItemRef” elements.
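A minimal sketch of such a library reference is given below. The StudyOID “ONCOLOGY_GENERAL” and the MetaDataVersionOID “MV.ONCOLOGY_GENERAL_ITEMDEFS” are the identifiers used in this document; the study OID, the ItemGroupDef and the referenced ItemDef OIDs are hypothetical.

```xml
<!-- Illustrative fragment; ONCOLOGY_STUDY_001, IG.TUMOR and IT.* are assumed -->
<Study OID="ONCOLOGY_STUDY_001">
  <!-- GlobalVariables omitted for brevity -->
  <MetaDataVersion OID="MV.001" Name="Version 1">
    <!-- Pulls in every ItemDef of the general oncology library -->
    <Include StudyOID="ONCOLOGY_GENERAL"
             MetaDataVersionOID="MV.ONCOLOGY_GENERAL_ITEMDEFS"/>
    <ItemGroupDef OID="IG.TUMOR" Name="Tumor assessment" Repeating="No">
      <!-- These ItemDefs are assumed to exist in the library only -->
      <ItemRef ItemOID="IT.TUMOR_SIZE" OrderNumber="1" Mandatory="Yes"/>
      <ItemRef ItemOID="IT.TUMOR_SITE" OrderNumber="2" Mandatory="Yes"/>
    </ItemGroupDef>
  </MetaDataVersion>
</Study>
```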
An important first remark is that, in case the questions are not defined in the same file in a separate “MetaDataVersion” element (which needs to come before the current one), the Include element does not state where they are actually located (i.e. the file location), only that there is an ODM file somewhere for which the identifiers are the StudyOID (“ONCOLOGY_GENERAL”) and the MetaDataVersionOID (“MV.ONCOLOGY_GENERAL_ITEMDEFS”). So it is up to the implementing application to make sure that the ODM files to be included can be found[footnote].
Using ODM libraries and referencing them through an “Include” element can be especially useful for questions (ItemDefs), subforms (ItemGroups) and codelists.
== Case: replacing information ==
In the ODM, **//OIDs//** act as unique IDs, i.e. within the same study, OIDs should be unique[footnote]. If an OID is used in a definition (StudyEventDef, FormDef, etc.) and it already occurred in a previous version of the metadata, this means that it needs to replace the previously defined version, or extend it.
The ODM specification says:
"**An individual Study element may contain multiple MetaDataVersions, reflecting one or more mid-stream study design changes. The initial version contains a full set of metadata. Each subsequent version typically contains only modified or newly added metadata elements, inheriting the previous metadata by an explicit reference. The same metadata elements in different versions will have the same OID. This approach is used to allow the older versions of the metadata to remain intact, while simultaneously providing a concise way to represent changes.** "
and:
"**Any of the included definitions can be replaced (overridden) by explicitly giving a new version of the definition (with the same OID) in the new metadata version. New definitions (with new OIDs) can be added in the same way.** "
Suppose we have a question defined in a first version of the metadata like:
...
Essentially, this definition allows free text to be entered for the race.
Suppose now that a codelist was derived for race (i.e. an enumeration of possible values for the race). Each possible value has a code (an integer) and a decoded value (which can differ according to the language in which the form is deployed). So we need to change the datatype as well as add a reference to the codelist with coded and decoded values. The way this is done is:
...
As an ItemDef element can only have one “Question” and one “CodeListRef”, it is clear that the new value should replace the old one (for “Question”) and that the “CodeListRef” should be added. Also, it is clear that the “DataType” and the “Length” in the second version are replacements for those in the first version.
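The race example could be sketched as follows. All OIDs, names, lengths, question texts and codelist entries are assumptions made for illustration; only the overall pattern (same ItemDef OID, changed datatype and length, added CodeListRef) is taken from the text.

```xml
<!-- Illustrative fragment; every OID and value here is assumed -->
<MetaDataVersion OID="MV.001" Name="Version 1">
  <!-- First version: race is entered as free text -->
  <ItemDef OID="IT.RACE" Name="Race" DataType="text" Length="40">
    <Question><TranslatedText xml:lang="en">Race</TranslatedText></Question>
  </ItemDef>
</MetaDataVersion>
<MetaDataVersion OID="MV.002" Name="Version 2">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  <!-- Same OID: this definition overrides the one from MV.001 -->
  <ItemDef OID="IT.RACE" Name="Race" DataType="integer" Length="1">
    <Question><TranslatedText xml:lang="en">Race (select one)</TranslatedText></Question>
    <CodeListRef CodeListOID="CL.RACE"/>
  </ItemDef>
  <CodeList OID="CL.RACE" Name="Race" DataType="integer">
    <CodeListItem CodedValue="1">
      <Decode><TranslatedText xml:lang="en">Asian</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="2">
      <Decode><TranslatedText xml:lang="en">White</TranslatedText></Decode>
    </CodeListItem>
    <!-- further code list items omitted -->
  </CodeList>
</MetaDataVersion>
```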
As a rule we can state that: "//**If in a later MetaDataVersion of the same defined item (where item can be any element having an OID), a piece of information is added for which there is only one instance possible (this is true for attributes and for subelements where the maximum number of occurrences is 1), then the new information replaces the old information**//".
Later we will see how we can replace pieces of information for which several occurrences are possible, such as “ItemRef” elements within “ItemGroupDef”.
== Case: addition of new information ==
It may happen that after a clinical study has been designed (and described in an ODM file), it is decided to add extra information, such as extra forms to visits, extra subforms to forms, and/or extra questions to subforms.
We will show here how this can be done in the case where additional forms should be added to a visit (StudyEvent) that was already defined in a previous metadata version.
Suppose we have a StudyEvent definition stating that two forms should be used. In ODM, this is expressed like:
The second MetaDataVersion (which can be defined in the same file, or in a later file) contains:
It states that the first version should be included.
As a StudyEventDef with the OID “VISIT1” was already defined in the first version, the second version takes it, and adds the newly defined FormRefs to it, replacing the older information.
In first instance, one might think that the result would be:
However, this is NOT the way the Include mechanism works.
The reason is that there is a rule that the OIDs for StudyEventDef elements (defining the visits) should be unique within the study.
The way the Include mechanism works however leads to the result:
The new FormRef elements FM.003 and FM.004 (from the second MetaDataVersion) REPLACE the ones defined in the first MetaDataVersion (FM.001 and FM.002), and the result is stored in the second MetaDataVersion.
Remark: In the above snippet, I commented out (using <!-- ... -->) the FormRefs for FM.001 and FM.002 to highlight that these are no longer referenced in the new metadata version.
So the general rule here is:
"**//If in a later MetaDataVersion of the same defined item (where item can be any element having an OID), a piece of information is added for which there can be more than a single instance, then the new information REPLACES the already present information//**."
So in order to __add__ FM.003 and FM.004 to the StudyEvent VISIT1, one needs to COMPLETELY replace the definition of VISIT1:
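A sketch of such a complete replacement is shown below. The form OIDs and the StudyEvent OID “VISIT1” come from the text; the study OID, the MetaDataVersion OIDs, the names and the “Repeating”, “Type”, “OrderNumber” and “Mandatory” values are assumed.

```xml
<!-- Illustrative fragment; ST.001, MV.* and all attribute values are assumed -->
<MetaDataVersion OID="MV.002" Name="Version 2">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  <!-- All four FormRefs are listed, including the two already in MV.001 -->
  <StudyEventDef OID="VISIT1" Name="Visit 1" Repeating="No" Type="Scheduled">
    <FormRef FormOID="FM.001" OrderNumber="1" Mandatory="Yes"/>
    <FormRef FormOID="FM.002" OrderNumber="2" Mandatory="Yes"/>
    <FormRef FormOID="FM.003" OrderNumber="3" Mandatory="Yes"/>
    <FormRef FormOID="FM.004" OrderNumber="4" Mandatory="Yes"/>
  </StudyEventDef>
</MetaDataVersion>
```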
== Case: changing information ==
The complete re-definition of an item when updating information from a previous MetaDataVersion may seem cumbersome, but is necessary.
Consider the following example, where we define a question on a CRF:
[ODM snippet here: ItemDef with description "Systolic Blood Pressure" / "Systolischer Blutdruck" / "Tension artérielle systolique" and question text "Systolic blood pressure (mm Hg)" / "Systolischer Blutdruck (mm Hg)" / "Tension artérielle systolique (mm Hg)"]
Now, in the second version, we want to state that the blood pressure may be measured more precisely, and replace the datatype "integer" by "float". It may be tempting to do this using:
However, this is not completely unambiguous. It is clear that the data type was changed, but does it also mean that the question needs to be removed from the definition? That would be valid, as "Question" is optional. It may e.g. be absent when the data point is derived from other ones (as in the case of a BMI).
So, if one wants to replace one property, one also needs to repeat all the other properties of that item, as otherwise it is not 100% clear what is desired.
So, in our case, the second metadata version must be:
[ODM snippet here: ItemDef with description "Systolic Blood Pressure" / "Systolischer Blutdruck" / "Tension artérielle systolique" and question text "Systolic blood pressure (mm Hg)" / "Systolischer Blutdruck (mm Hg)" / "Tension artérielle systolique (mm Hg)"]
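A full redefinition of the blood pressure item in the second metadata version might be sketched as follows. The translated texts are the ones used in this example; the OIDs, the item name, the “Length” and “SignificantDigits” values and the use of a “Description” element for the shorter texts are assumptions.

```xml
<!-- Illustrative fragment; OIDs, Name, Length and SignificantDigits are assumed -->
<MetaDataVersion OID="MV.002" Name="Version 2">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  <!-- Only DataType changes, yet every other property is repeated -->
  <ItemDef OID="IT.SYSBP" Name="SYSBP" DataType="float" Length="5" SignificantDigits="1">
    <Description>
      <TranslatedText xml:lang="en">Systolic Blood Pressure</TranslatedText>
      <TranslatedText xml:lang="de">Systolischer Blutdruck</TranslatedText>
      <TranslatedText xml:lang="fr">Tension artérielle systolique</TranslatedText>
    </Description>
    <Question>
      <TranslatedText xml:lang="en">Systolic blood pressure (mm Hg)</TranslatedText>
      <TranslatedText xml:lang="de">Systolischer Blutdruck (mm Hg)</TranslatedText>
      <TranslatedText xml:lang="fr">Tension artérielle systolique (mm Hg)</TranslatedText>
    </Question>
  </ItemDef>
</MetaDataVersion>
```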
== Case: Inserting information ==
Similarly, consider the case where one wants to insert a form into a visit.
For example, the first MetaDataVersion may define the visit as follows:
Four forms have been defined to be used in the first visit, and their order is given.
Now suppose that the designer of the study has changed his/her mind about which forms need to be used in the first visit. Instead of form “FM.002” he/she wants another form, e.g. a form “FM.007”.
Also, a form "FM.008" should be placed before form "FM.004".
The way the replacement should be defined is:
So in order to avoid ambiguity, one needs to give a complete description of the visit, even though this may sometimes seem to be "overkill".
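The replacement and the insertion together could be sketched as below. It is assumed that the four original forms are FM.001 to FM.004 and that the visit has the OID “VISIT1”; the study OID, the MetaDataVersion OIDs, the names and the “Mandatory” values are hypothetical as well.

```xml
<!-- Illustrative fragment; original forms FM.001 to FM.004 are assumed -->
<MetaDataVersion OID="MV.002" Name="Version 2">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  <StudyEventDef OID="VISIT1" Name="Visit 1" Repeating="No" Type="Scheduled">
    <FormRef FormOID="FM.001" OrderNumber="1" Mandatory="Yes"/>
    <!-- FM.007 takes the place of FM.002 -->
    <FormRef FormOID="FM.007" OrderNumber="2" Mandatory="Yes"/>
    <FormRef FormOID="FM.003" OrderNumber="3" Mandatory="Yes"/>
    <!-- FM.008 is inserted before FM.004 -->
    <FormRef FormOID="FM.008" OrderNumber="4" Mandatory="Yes"/>
    <FormRef FormOID="FM.004" OrderNumber="5" Mandatory="Yes"/>
  </StudyEventDef>
</MetaDataVersion>
```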
== Case: Changing the order ==
In a similar way, the order of questions in a subform (ItemRef in ItemGroupDef), the order of subforms in a form (ItemGroupRef in FormDef), the order of forms in a visit (FormRef in StudyEventDef), and the order of the visits themselves (StudyEventRef in Protocol) can be changed.
Suppose we have the following definition:
Now, we want to change the order of the use of the third and fourth form.
It could be tempting to only give the changes:
but this is incorrect, as it suggests that forms FM.001 and FM.002 are being removed from the visit.
So the correct way of doing this is:
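A hedged sketch of such a complete redefinition, swapping the third and fourth form, follows. The visit OID “VISIT1”, the study OID, the MetaDataVersion OIDs and the attribute values are assumed; the form OIDs follow the numbering used in this section.

```xml
<!-- Illustrative fragment; VISIT1, ST.001 and MV.* are assumed -->
<MetaDataVersion OID="MV.002" Name="Version 2">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  <StudyEventDef OID="VISIT1" Name="Visit 1" Repeating="No" Type="Scheduled">
    <!-- FM.001 and FM.002 are repeated so that they stay in the visit -->
    <FormRef FormOID="FM.001" OrderNumber="1" Mandatory="Yes"/>
    <FormRef FormOID="FM.002" OrderNumber="2" Mandatory="Yes"/>
    <!-- The third and fourth form have traded places -->
    <FormRef FormOID="FM.004" OrderNumber="3" Mandatory="Yes"/>
    <FormRef FormOID="FM.003" OrderNumber="4" Mandatory="Yes"/>
  </StudyEventDef>
</MetaDataVersion>
```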
== Removing information ==
The specification clearly states that "**//Each subsequent version typically contains only modified or newly added metadata elements//**". This means that essentially, the 'Include' mechanism does not allow information to be removed from a study design.
This is not a problem, as long as the defined item is not referenced. As long as an “ItemDef” is not referenced in an “ItemRef”, essentially it is not used. The same is true for codelists (CodeList), subforms (ItemGroupDef), forms (FormDef), visits (StudyEventDef), but also for “ArchiveLayout”, “Presentation”, “ImputationMethod” (deprecated), “MethodDef”, “ConditionDef”[6].
So, in case we want to remove a question from a subform, subform from a form, form from a visit, we just leave its reference out of the parent definition. For example:
We now want to remove the third form from the visit. So we reference all forms again, except the form FM.003:
Explicitly assigning OrderNumber "3" to FM.004 emphasises that FM.003 has been removed, but it would also have been removed if we had assigned OrderNumber "4" to FM.004.
Note that this mechanism does not remove the **definition** of form FM.003. It only removes the **usage** of that form in visit "VISIT1".
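The removal described in this section might look like the following sketch. It assumes that the first version of "VISIT1" referenced the forms FM.001 to FM.004; the study OID, the MetaDataVersion OIDs, the names and the “Mandatory” values are also assumed.

```xml
<!-- Illustrative fragment; ST.001, MV.* and attribute values are assumed -->
<MetaDataVersion OID="MV.002" Name="Version 2">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.001"/>
  <StudyEventDef OID="VISIT1" Name="Visit 1" Repeating="No" Type="Scheduled">
    <FormRef FormOID="FM.001" OrderNumber="1" Mandatory="Yes"/>
    <FormRef FormOID="FM.002" OrderNumber="2" Mandatory="Yes"/>
    <!-- No FormRef for FM.003: the form is no longer used in this visit -->
    <FormRef FormOID="FM.004" OrderNumber="3" Mandatory="Yes"/>
  </StudyEventDef>
  <!-- The FormDef for FM.003 is still inherited from MV.001, just unused here -->
</MetaDataVersion>
```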
== Case: Defining different Arms in a study ==
The “Include” mechanism not only allows subsequent versions of the study design to be defined without redefining the whole study, it also allows parallel versions of the metadata to be created, without having to define questions, forms etc. twice.
An example of such parallel versions of the metadata is where the visits, forms etc. for one arm of the study are described in one MetaDataVersion element, and the visits, forms ... for another arm are described in another MetaDataVersion element.
In such a case, one can start from a metadata version which defines only those elements that are common to both arms (the base version), and then only describe what is different for both arms in separate versions of the metadata, which have an “Include” referencing the base metadata version.
[ image to come here ]
For example, in the base metadata version, one can define all the possible forms, subforms (ItemGroups), questions, codelists, etc., and then use that as a library to retrieve from when defining the visits for each of the different arms:
[ODM snippet here: base metadata version with forms, ItemGroups, questions and codelists]
In separate MetaDataVersions for “Arm A” and “Arm B”, we can now just “extract” the definitions from the base metadata version, by referencing them in our visit definitions (StudyEventDef). For example, for “Arm A”:
...
and the definitions for “Arm B”:
...
Note that, although both versions of the metadata are parallel to each other (i.e. neither of them includes the other), the StudyEventDefs must have different OIDs, as it is not allowed to have duplicate OIDs for the StudyEvents within the Study.
When taking a close look, one also sees that the form with OID “FM.012” is referenced in both arms, i.e. it is used in both arms, but in a different context.
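The overall arrangement can be sketched as one base version and two parallel versions that both include it. Only the form OID “FM.012” comes from the text; the study OID, the MetaDataVersion OIDs, the StudyEvent OIDs, the other form OIDs and all names are hypothetical.

```xml
<!-- Illustrative fragment; every OID except FM.012 is assumed -->
<MetaDataVersion OID="MV.BASE" Name="Base version">
  <FormDef OID="FM.011" Name="Form 11" Repeating="No"/>
  <FormDef OID="FM.012" Name="Form 12" Repeating="No"/>
  <FormDef OID="FM.013" Name="Form 13" Repeating="No"/>
  <!-- ItemGroupRefs, ItemGroupDefs, ItemDefs and CodeLists omitted -->
</MetaDataVersion>
<MetaDataVersion OID="MV.ARM_A" Name="Arm A">
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.BASE"/>
  <Protocol>
    <StudyEventRef StudyEventOID="ARM_A.VISIT1" OrderNumber="1" Mandatory="Yes"/>
  </Protocol>
  <StudyEventDef OID="ARM_A.VISIT1" Name="Arm A, visit 1" Repeating="No" Type="Scheduled">
    <FormRef FormOID="FM.011" OrderNumber="1" Mandatory="Yes"/>
    <FormRef FormOID="FM.012" OrderNumber="2" Mandatory="Yes"/>
  </StudyEventDef>
</MetaDataVersion>
<MetaDataVersion OID="MV.ARM_B" Name="Arm B">
  <!-- Parallel to MV.ARM_A: both include the base, neither includes the other -->
  <Include StudyOID="ST.001" MetaDataVersionOID="MV.BASE"/>
  <Protocol>
    <StudyEventRef StudyEventOID="ARM_B.VISIT1" OrderNumber="1" Mandatory="Yes"/>
  </Protocol>
  <!-- A different StudyEventDef OID than in Arm A, the same form FM.012 -->
  <StudyEventDef OID="ARM_B.VISIT1" Name="Arm B, visit 1" Repeating="No" Type="Scheduled">
    <FormRef FormOID="FM.012" OrderNumber="1" Mandatory="Yes"/>
    <FormRef FormOID="FM.013" OrderNumber="2" Mandatory="Yes"/>
  </StudyEventDef>
</MetaDataVersion>
```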
== Literature ==
CDISC ODM 1.3 Final Specification: [http://www.cdisc.org/models/odm/v1.3/index.html]