Guest Column | May 18, 2023

Understanding Data Collection And Management In Decentralized Clinical Trials (DCTs)

By Rashida Rampurawala, Manager - Clinical Data Management, GSK

Figure 1a: Data flow from multiple sources

Over the years, the pharmaceutical industry has evolved from collecting data on paper to using electronic methods. The COVID-19 pandemic, which frequently prevented patients from attending site visits, impacted recruitment, enrollment, and retention rates across global clinical trials. In some cases, clinical trials were temporarily suspended or postponed; even so, it was clear during this period that active clinical trials were being managed differently by research teams across the industry. As a result, the interest in and need for decentralized clinical trials (DCTs) grew, as did the need for decentralization-supporting capabilities.

Yet even before the pandemic, biopharmaceutical companies, CROs, and medical device manufacturers were already on the path toward rethinking the clinical development paradigm. They began executing experimental projects to move data collection into digital channels, often using telemedicine to replace in-person site visits. Looking back, these before-, during-, and after-COVID measures increased industry learning, eased concerns about potential risks, and built confidence in alternative clinical trial methods.

In particular, these pioneering efforts have impacted how we view decentralization, remote data collection, and data capture. However, it would be challenging to apply a one-size-fits-all decentralization paradigm to every clinical trial design, given each protocol's complexity, individuality, and key objectives. As a result, it is important to specify the risks up front and determine whether and how decentralization is appropriate for each protocol. From a data collection and management standpoint, there is plenty to consider in terms of managing data flow across multiple sources and systems.

Understanding Data Collection Within A DCT

The collection, management, and interpretation of data from new sources and in new forms is one of the greatest challenges DCTs pose for clinical teams. While wearable sensors can gather a wealth of relevant real-time patient data, the question arises of how to efficiently manage and decipher the data relevant to the clinical trial objectives. Equally important is the capacity to combine all of that data into a coherent whole that all parties can use to make key trial-related decisions. Companies are reassessing the tools they use to administer trials and are looking for options to integrate the data as well as manage services. To enable businesses to view all the information in one location, the technology must make handling all the downstream data simpler and faster. However, gathering all relevant and clean data under one umbrella comes with a host of challenges and requires meticulous planning.

Parsing Required, Unwanted, And Nice-To-Know Data

Decentralized trials produce a large volume of data in an array of data types, so ensuring the dependability and quality of that data is the topmost concern. This concern necessitates adaptable and integrated clinical trial systems that can manage data in various ways for different needs. For instance, the raw data stream from a patient-worn device probably contains far more data than is necessary to assess the device's effectiveness and safety. The required data from that stream should be easy to retrieve from the clinical trial system in a way that is useful and that complies with regulatory requirements. The unwanted data that is likely to be present in the raw stream must also be dealt with, so precise extraction as defined by the study team is crucial. Other data integrity concerns include inconsistencies, device errors, and data that goes missing when a patient removes, loses, or changes the device.

When unstructured data is collected via wearables or other devices within a DCT, it accumulates much faster than manually entered EDC data, but cleaning it takes longer. That said, clintech companies are diligently working on advances in data analytics and integration that screen the incoming stream and evaluate only relevant data using a risk-based approach. As it stands, the industry is still working to convince regulators that such data accurately represents each patient's experience and that it can support timely, data-driven decisions. For instance, when a patient's heart rate unexpectedly increases, investigators currently need to figure out manually why a sensor reading lies outside the normal range; there is no quick technique to resolve these types of issues.
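As a hedged sketch of what such a risk-based screen could look like, the snippet below flags heart-rate readings that fall outside a plausible range, are missing, or follow an unusually long gap, so a reviewer can investigate them rather than the system silently correcting anything. The column names, thresholds, and gap tolerance are illustrative assumptions, not a prescribed standard.

```python
import pandas as pd

EXPECTED_RANGE = (40, 180)       # plausible heart-rate bounds in bpm (assumption)
MAX_GAP = pd.Timedelta("30min")  # longest acceptable gap between readings (assumption)

def screen_heart_rate(raw: pd.DataFrame) -> pd.DataFrame:
    """Annotate readings with review flags; nothing is deleted or auto-corrected."""
    df = raw.sort_values("timestamp").copy()
    low, high = EXPECTED_RANGE
    df["out_of_range"] = df["heart_rate"].notna() & ~df["heart_rate"].between(low, high)
    df["missing_value"] = df["heart_rate"].isna()
    df["gap_before"] = df["timestamp"].diff() > MAX_GAP
    df["needs_review"] = df[["out_of_range", "missing_value", "gap_before"]].any(axis=1)
    return df

# Illustrative readings: one implausibly high value and one long gap between readings.
readings = pd.DataFrame({
    "subject_id": ["001", "001", "001"],
    "timestamp": pd.to_datetime(["2023-05-01 08:00", "2023-05-01 08:05", "2023-05-01 10:00"]),
    "heart_rate": [72, 210, 80],
})
print(screen_heart_rate(readings)[["timestamp", "heart_rate", "needs_review"]])
```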

The challenge will be to create new algorithms, or update existing ones, that can facilitate the intake of required data and filter out unwanted or corrupted data. Going a level deeper, another algorithm might compartmentalize the required data separately from the supporting, or "nice to know," data, giving clinical teams the opportunity to gain additional insights for current and future research.
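A minimal sketch of that compartmentalization idea follows; the field lists are hypothetical and would in practice come from the protocol and the data management plan. Each incoming device record is routed into required, supporting ("nice to know"), and discarded buckets.

```python
# Hypothetical field lists; in practice these come from the protocol and data management plan.
REQUIRED_FIELDS = {"subject_id", "timestamp", "heart_rate", "spo2"}
NICE_TO_KNOW_FIELDS = {"step_count", "sleep_minutes"}

def compartmentalize(record: dict) -> dict:
    """Split one raw device record into required, supporting, and discarded fields."""
    keep = REQUIRED_FIELDS | NICE_TO_KNOW_FIELDS
    return {
        "required": {k: v for k, v in record.items() if k in REQUIRED_FIELDS},
        "supporting": {k: v for k, v in record.items() if k in NICE_TO_KNOW_FIELDS},
        "discarded": {k: v for k, v in record.items() if k not in keep},
    }

raw_record = {
    "subject_id": "001", "timestamp": "2023-05-01T08:00:00",
    "heart_rate": 72, "spo2": 98, "step_count": 3400, "battery_level": 81,
}
print(compartmentalize(raw_record))
```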

Integrating Data Collection And Analysis Systems

As the industry evolves with new integration systems and tools, we need to introduce and integrate platforms that can handle and compartmentalize high-volume, high-dimensional data. The increased use of third-party data sources and the mercurial nature of some clinical data add another level of complexity. In brick-and-mortar clinical trials, the principal investigator, or a delegate, enters the data, and it is stored in the EDC. With DCTs, however, data storage is split among the EDC, third-party applications, and other sources, which are independently managed and reconciled against the EDC as part of the data validation process across the industry. Increasingly, key efficacy and safety data are coming from outside the eCRF, which further increases pressure on data integration strategies.
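As a simple sketch of what reconciliation against the EDC can look like when both sources share subject and visit keys, the example below merges a hypothetical vendor transfer with EDC entries and flags records that exist in only one source or whose values disagree beyond a tolerance. The data, column names, and tolerance are illustrative assumptions.

```python
import pandas as pd

# EDC data entered at the site and a third-party vendor transfer (illustrative values).
edc = pd.DataFrame({
    "subject_id": ["001", "002"], "visit": ["WK4", "WK4"], "weight_kg": [70.2, 81.0],
})
vendor = pd.DataFrame({
    "subject_id": ["001", "002", "003"], "visit": ["WK4", "WK4", "WK4"],
    "weight_kg": [70.2, 80.4, 65.1],
})

merged = edc.merge(vendor, on=["subject_id", "visit"], how="outer",
                   suffixes=("_edc", "_vendor"), indicator=True)
merged["issue"] = (
    (merged["_merge"] != "both")                                            # present in only one source
    | ((merged["weight_kg_edc"] - merged["weight_kg_vendor"]).abs() > 0.5)  # value discrepancy
)
print(merged[merged["issue"]])  # rows to raise as reconciliation queries
```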

Integrating systems comes with its own set of processes and documentation, including the testing and system validation needed prior to integration (end-to-end, or E2E, process management). All integrations will vary depending on how the decentralized trial is set up, for example, an integration from ECG machines to the EDC or from wearables into the EDC. Server issues, system upgrades, data migration, updates to data points, and changes to when data is collected following protocol amendments are all significant aspects that require deep thinking, pre-planning, risk identification, and mitigation. When setting up a system to manage patient data from remote visits, the clinical site may not have a querying interface, which leaves the project team dependent on issue logs to clarify data inconsistencies, request data reentry, and arrange additional data transfers. As this example shows, sponsors need an infrastructure that can compile data effectively and flexibly. When trials are entirely digital and paperless, sites must embrace eSource, which poses an additional problem: while an EDC has traditionally supplied the basis for clinical trial data and served as a standard against which other sources may be measured, data under these circumstances do not fit neatly into that model.
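Where no in-system querying interface exists, a structured issue log becomes the working substitute. The sketch below shows one hypothetical shape such a log entry could take; the fields, statuses, and actions are assumptions rather than a prescribed template.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class IssueLogEntry:
    issue_id: str
    source_system: str           # e.g., "wearable vendor export", "ECG transfer"
    subject_id: str
    description: str             # the inconsistency or clarification needed
    raised_on: date
    action: str = "query site"   # e.g., "query site", "request re-entry", "new transfer"
    status: str = "open"
    resolution: Optional[str] = None

issue_log = [
    IssueLogEntry("ISS-001", "wearable vendor export", "001",
                  "Heart rate missing for the week 4 visit window", date(2023, 5, 10)),
]
print([entry.issue_id for entry in issue_log if entry.status == "open"])
```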

Recognizing Limitations Within Vulnerable Patient Populations

Another bottleneck in applying decentralization to data collection depends heavily on the therapeutic area and the type of patient pool. For vulnerable patient pools like pediatrics and geriatrics, introducing technology for patient use will not produce desirable results in terms of data collection. For example, asking children to interact with wearables or portable devices like tablets is not preferable and is often not feasible. Similarly, in rare diseases, the results of each patient could have an impact on how the other results are perceived; pharmaceutical companies ought to be able to alter the treatment plan right away if the diagnostic results of one patient force a change. They become liable for failing to stop potentially dangerous treatments if there is a delay before the data can be collected, cleaned, and reviewed. If decentralized trials, and not just data collection, are to operate in real time, end-to-end data flows must be established. This will be necessary for trial flexibility, clinical trial agility, and ensuring the veracity of data, among other things.

Understanding Data Outflow And Compliance With Data Standards And Protection

Despite working toward a more standardized and compliant approach, the pharmaceutical industry still encounters challenges arising from diversity and complexity. Standardization is crucial when multiple external data types come from different vendors and are reported and integrated in different ways. Even with conventional clinical trial data management systems, there are challenges when it comes to data mapping and standardization. When a data manager is designing and building the EDC, the team needs to understand how the data will flow out and be reported so that it meets data standards. For example, we have seen multiple data mapping and data definition issues arise during Study Data Tabulation Model (SDTM) mapping, which provides a standard for organizing and formatting data to streamline collection, management, analysis, and reporting, especially when managing external data.
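For illustration, the sketch below maps a hypothetical vendor extract of heart-rate readings into an SDTM-style VS (vital signs) structure. The input column names and study identifier are assumptions; the target variables follow the SDTM VS domain conventions.

```python
import pandas as pd

# Hypothetical vendor extract of heart-rate readings.
vendor = pd.DataFrame({
    "subj": ["001", "001"],
    "reading_time": ["2023-05-01T08:00", "2023-05-01T08:05"],
    "hr_bpm": [72, 75],
})

# Map into an SDTM-style VS (vital signs) record layout.
vs = pd.DataFrame({
    "STUDYID": "ABC-123",                      # hypothetical study identifier
    "DOMAIN": "VS",
    "USUBJID": "ABC-123-" + vendor["subj"],    # unique subject identifier
    "VSTESTCD": "HR",                          # test short code: heart rate
    "VSTEST": "Heart Rate",
    "VSORRES": vendor["hr_bpm"].astype(str),   # result as originally received
    "VSORRESU": "beats/min",                   # original units
    "VSDTC": vendor["reading_time"],           # ISO 8601 date/time of measurement
})
vs["VSSEQ"] = vs.groupby("USUBJID").cumcount() + 1  # sequence number within subject
print(vs)
```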

Data Safety And Security Must Be Top Of Mind

With all the excitement over what technology can accomplish in data collection and management, it is easy to overlook the threats technology may pose. The data that pharmaceutical companies or sites collect, clean, and analyze must be protected at every stage of the journey, particularly in a DCT environment. During on-site visits, personnel rely on a tight web security strategy, including a firewall and personal logins, to ensure data security. In DCTs, however, sites and pharmaceutical companies face multiple limitations in establishing such protection. There is also a risk of collecting trial participants' personal data, which is restricted under data protection acts and patient privacy guidelines in most European countries. The journey of patient data from its original source to the trial database encompasses multiple integration points. Patients may use unsecured Wi-Fi; input or report data via mobile phones, tablets, and laptops that are not password protected; and collect trial data via external vendor applications on personal devices like smartwatches. Without data security procedures, pharmaceutical companies run the risk of being fined for breaching data protection laws, losing data, and compromising the integrity of study data. Hence, complete system validation, secure Wi-Fi connections, two-tier password protection on the devices used for the DCT, and robust firewall setups on site systems are some of the measures to be taken and monitored on a regular basis.

In May 2023, the FDA released draft guidance — Decentralized Clinical Trials for Drugs, Biological Products, and Devices: Draft Guidance for Industry, Investigators, and Other Stakeholders — giving primary attention to endpoint management and to the safeguarding and preservation of clinical records and data. Pioneers in the industry who plan to integrate DCTs into their ongoing research process must join discussions that could help shape regulations that will allow for secure yet easily accessible data collection in the future. DCT models, which function on a global scale, can only thrive if patients have faith in the security of their data and regulators are confident that these technologies can adhere to all data privacy regulations. Although 71% of countries have passed data protection and privacy legislation, according to the United Nations Conference on Trade and Development, steering through the diverse regulatory requirements is an intricate affair. Delaying action until a prominent data breach occurs could harm the reputation of DCTs as a dependable research model and trigger more stringent regulations that might impede innovation.

This does not mean we must return to physical evaluations and written study records. Rather, it means that we, as a sector, must thoroughly examine the technological framework, software, and protocol designs we rely on to guarantee the safety of data from collection through reporting and analysis. Modern methods will necessitate a cooperative endeavor among regulators, industry, and academia to set benchmarks for data collection and reporting within a DCT.

About The Author:

Rashida Rampurawala is a manager - clinical data management at GSK. She holds a master's in biomedical sciences from UEL in the UK and a PG diploma in business management from XLRI in India. She started her career in the UK and then moved back to India in 2011. She has 13-plus years of CDM experience, has presented at various conferences held by ACDM, SCDM, DIA, ISCR, and PHUSE, and has conducted RBM workshops at DIA and ISCR. She is a data visualization enthusiast and is part of the SCDM author group, which is updating the GCDMP chapters for CCDM certification.