Clinical trial data are diverse and are shaped by collection methods, trial indication, and research objectives. The complexity and volume of data collected during trials are increasing, driven by, among other factors, the wider use of adaptive trial designs, wearable devices, and real-world data, and by an evolving landscape of guidance documents and therapeutic area user guides.
Many regulatory authorities, such as the Food and Drug Administration, require trial data to be submitted in a standardized format, namely the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM).
However, mapping raw clinical trial data to the SDTM framework can be tedious, since raw data formats are diverse and may change during trial execution, resulting in repeated cycles of conversion and validation. The initial mapping of source variables to SDTM domains and variables is a key step in the SDTM conversion process.
This paper introduces JETConvert, Bioforum’s next-generation SDTM conversion platform, which uses a machine-learning approach to recommend target SDTM domains and variables for source variables. The paper also discusses the accuracy of the underlying models and presents refinement steps that improve the accuracy of model predictions.
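
To make the mapping task concrete, the following is a minimal sketch of what a domain and variable recommendation looks like. It is not JETConvert's method: a simple token-overlap (Jaccard) score stands in for the machine-learning models, and the target list, the simplified labels, and the names `SDTM_TARGETS`, `tokens`, and `recommend` are all hypothetical and chosen for illustration only.

```python
import re

# Tiny illustrative subset of SDTM targets: (domain, variable) -> simplified label.
# The labels are paraphrased for this sketch, not quoted from the standard.
SDTM_TARGETS = {
    ("DM", "SEX"): "sex of subject",
    ("DM", "BRTHDTC"): "date of birth",
    ("AE", "AETERM"): "reported term for the adverse event",
    ("AE", "AESTDTC"): "start date of adverse event",
    ("VS", "VSORRES"): "vital signs result in original units",
    ("VS", "VSDTC"): "date of vital signs measurement",
}


def tokens(text):
    """Split a label into a set of lowercase alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def recommend(source_label, top_n=3):
    """Rank SDTM (domain, variable) targets by Jaccard overlap with a source label."""
    src = tokens(source_label)
    scored = []
    for target, label in SDTM_TARGETS.items():
        tgt = tokens(label)
        union = src | tgt
        score = len(src & tgt) / len(union) if union else 0.0
        scored.append((score, target))
    # Highest score first; ties are broken alphabetically by (domain, variable).
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(target, round(score, 2)) for score, target in scored[:top_n]]


if __name__ == "__main__":
    for raw_label in ("Adverse event start date", "Date of birth", "Subject sex"):
        print(raw_label, "->", recommend(raw_label, top_n=2))
```

Each recommendation pairs a candidate target with a score, so a reviewer can accept the top suggestion or pick a lower-ranked one. A trained model would replace the overlap score, but the shape of the output (ranked domain and variable candidates per source variable) would stay the same.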