The XML4Pharma Application Server

Validation Rules in XQuery - the "Open Rules for CDISC Standards" initiative

Validation rules for CDISC standards need to be transparent: users have the right to see exactly how each rule has been implemented. The currently known implementations do not fulfill this requirement. Their rules are available only as text descriptions (which are, unfortunately, very often confusing), and it is unknown how they have been implemented.
Users of the current validation tools are confronted with a great many "false positives", and there is nothing they can do about it. They can only report the problem and hope the implementation is corrected "in the next release", which can be months or even 1-2 years later.

Also read our newest article "A critical view on the FDA and PMDA SDTM validation rules".

Isn't there a better way?

Of course there is! With modern technology, it is possible to publish rules in such a way that they are both human-readable and machine-executable. This means that the user can easily inspect exactly how the rule has been implemented. The rule can then be executed using software of the user's own choice, as the rule is independent of the software with which it is executed.

Such a modern technology is XQuery. It is a W3C standard, which means that it is open and free for everyone.
XQuery is very easy to learn - I teach it to my undergraduate students in just 1.5 hours.

Below you will find a set of files containing all the rules developed so far. We update these files very regularly. The date of the last update is always provided.

The "open rules" allow anyone to write their own validation software. How to do so is explained here.

When you inspect these files and the rules in them, you will soon find out that not all FDA and PMDA rules have been implemented. Rules that are inherently wrong, that are expectations rather than rules, or that have been formulated in a confusing way have not been implemented. Examples of such rules are FDAC0154 ("Missing value for --ORRESU, when --ORRES is provided" - even an SDTM beginner knows this rule is inherently wrong), rule FDAC084 ("Standard Units (--STRESU) must be consistent for all records with same Short Name of Measurement, Test or Examination (--TESTCD), Category (--CAT), Specimen Type (--SPEC) and Method of Test or Examination (--METHOD)") and many others.
Unfortunately the PMDA copied many of the FDA rules (even the wrong ones) 1:1.

Using the "Open Rules for CDISC Standards"

You can easily use these rules in your own software.
A good start, however, is to use these rules within the "Smart Dataset-XML Viewer", which is open source software. If the rules are not yet delivered with the distribution, just copy the above listed XML files to the directory "Validation_Rules_XQuery" and they are "ready-to-go". When there is an update/correction, just copy the new file into the same directory, and you can start using it immediately (no more "waiting until the next release").
A very good manual about how to use the "Open Rules for CDISC Standards" in the "Smart Dataset-XML Viewer" is also available.
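For those who prefer to script the copy step, here is a minimal sketch in Python. It assumes that the "Validation_Rules_XQuery" directory sits directly under the viewer's installation directory and that the downloaded rule files have the extension ".xml"; the function name and both directory arguments are hypothetical and only serve as illustration.

```python
import shutil
from pathlib import Path


def install_rule_files(download_dir, viewer_dir):
    """Copy every *.xml rule file from download_dir into the viewer's rules directory.

    Assumption: the rules directory is <viewer_dir>/Validation_Rules_XQuery.
    Returns the names of the copied files, sorted alphabetically.
    """
    target = Path(viewer_dir) / "Validation_Rules_XQuery"
    # Create the directory if the distribution did not ship with it.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for rule_file in sorted(Path(download_dir).glob("*.xml")):
        # An existing file with the same name is overwritten, which is
        # what is wanted when a rule file has been updated or corrected.
        shutil.copy2(rule_file, target / rule_file.name)
        copied.append(rule_file.name)
    return copied
```

Running the same function again after downloading a corrected file simply replaces the older copy.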

Example validation rules in XQuery explained in detail

Another detailed example can be found here. Additional information about passing parameters from within your own application to the XQuery can be found here.

Retrieving the XQuery rules through a RESTful web service

We have also implemented a Validation Rules RESTful web service, so that your application can itself check which rules have been updated/corrected, and automatically load the updated ones. As each rule has a "last-update" attribute, and this attribute is repeated in the validation messages, you will always know which version (by date) of the rule has been used.
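The update check described above could look roughly like the following Python sketch. It does not call the web service and makes no claim about its response format. It assumes that your application has already extracted, for each rule, the rule ID and the value of its "last-update" attribute, and that these dates are in ISO form (YYYY-MM-DD). The function name and the rule IDs in the usage example are invented for illustration.

```python
from datetime import date


def rules_to_refresh(local, remote):
    """Return the IDs of rules that should be (re)loaded.

    local  -- dict mapping rule ID to the last-update date of the cached copy
    remote -- dict mapping rule ID to the last-update date reported by the service
    Assumption: all dates are ISO strings such as "2016-05-01".
    """
    stale = []
    for rule_id, remote_date in remote.items():
        local_date = local.get(rule_id)
        if local_date is None:
            # Rule is new: no cached copy exists yet.
            stale.append(rule_id)
        elif date.fromisoformat(remote_date) > date.fromisoformat(local_date):
            # Rule was updated or corrected after the cached copy was made.
            stale.append(rule_id)
    return sorted(stale)


# Hypothetical rule IDs and dates, for illustration only.
cached = {"RULE-A": "2016-01-10", "RULE-B": "2016-03-01"}
published = {"RULE-A": "2016-04-02", "RULE-B": "2016-03-01", "RULE-C": "2016-05-01"}
refresh = rules_to_refresh(cached, published)  # ["RULE-A", "RULE-C"]
```

Only the rules returned by such a check need to be downloaded again; all others can be used from the local copy.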

XQuery and data formats

A last word about data formats. XQuery does not work with XPT files. In fact, essentially no modern software works with XPT files, except as a "legacy feature". XQuery, however, works very well with XML, so we developed the rules to work with the modern CDISC Dataset-XML format, which we hope will soon be accepted by the FDA as the transport format for SDTM, SEND and ADaM. If your SDTM/SEND/ADaM submission is in XPT (modern mapping tools such as SDTM-ETL use Dataset-XML and "degrade" the tables to XPT only in the very last step), you can easily transform your XPT files into the modern Dataset-XML format using one of the tools provided by CDISC.
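Because Dataset-XML is plain XML, any XML-aware tool can read it. As a rough illustration (written in Python rather than XQuery, and not one of the published rules), the sketch below lists the records of a dataset that lack a value for a given variable. The sample document is heavily simplified, the OIDs and values are invented, the function name is hypothetical, and the two namespace URIs are the ones assumed here for ODM 1.3 and Dataset-XML 1.0.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed here for ODM 1.3 and Dataset-XML 1.0.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
DATA_NS = "http://www.cdisc.org/ns/Dataset-XML/v1.0"

# Heavily simplified sample; the OIDs and values are invented for illustration.
SAMPLE = f"""<ODM xmlns="{ODM_NS}" xmlns:data="{DATA_NS}">
  <ClinicalData StudyOID="ST.DEMO" MetaDataVersionOID="MDV.DEMO">
    <ItemGroupData ItemGroupOID="IG.LB" data:ItemGroupDataSeq="1">
      <ItemData ItemOID="IT.LB.USUBJID" Value="DEMO-001"/>
      <ItemData ItemOID="IT.LB.LBTESTCD" Value="GLUC"/>
    </ItemGroupData>
    <ItemGroupData ItemGroupOID="IG.LB" data:ItemGroupDataSeq="2">
      <ItemData ItemOID="IT.LB.LBTESTCD" Value="ALB"/>
    </ItemGroupData>
  </ClinicalData>
</ODM>"""


def records_missing_item(xml_text, item_group_oid, item_oid):
    """Return the sequence numbers of records that have no non-empty value for item_oid."""
    root = ET.fromstring(xml_text)
    missing = []
    for record in root.iter(f"{{{ODM_NS}}}ItemGroupData"):
        if record.get("ItemGroupOID") != item_group_oid:
            continue  # record belongs to another dataset
        values = [
            item.get("Value", "")
            for item in record.findall(f"{{{ODM_NS}}}ItemData")
            if item.get("ItemOID") == item_oid
        ]
        # The record fails the check when the item is absent or blank.
        if not any(value.strip() for value in values):
            missing.append(record.get(f"{{{DATA_NS}}}ItemGroupDataSeq"))
    return missing
```

With the sample above, records_missing_item(SAMPLE, "IG.LB", "IT.LB.USUBJID") returns ["2"], as only the second record has no USUBJID. An XQuery rule expresses the same kind of check declaratively, which is what makes it both readable and executable.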

Further remarks:

Implementing XQuery in your own software

XQuery can easily be implemented in your own software programs. In many cases, it is just a few lines of code. Here are some links that give you good information about how to run XQuery in your own software:

You can also download a sample program from our "XQuery Implementation" page, where all important steps are explained.

Please also remember that the "Smart Dataset-XML Viewer" implements the XQuery validation rules for both SDTM and ADaM (SEND to come).

Running XQuery in combination with a native XML database

Many native XML databases have an XQuery graphical user interface that comes with the database itself:


Courtesy of XML4Pharma - last update: May 2016