Guest Column | October 4, 2018

R&D Data Sharing: Where We Are & Where We Need To Go

By Virginia Nido, Global Head, Industry Collaborations, Roche


The clinical research industry has been slow to use and repurpose data collected in clinical trials efficiently. Pharmaceutical companies still find it difficult, both culturally and technically, to access and share data, which can lead to lengthier product development, untapped study findings, and reduced collaboration between industry stakeholders. As trials become increasingly complex and costly, it is especially important to develop approaches that facilitate and encourage R&D data sharing. Doing so not only helps clinical trial sponsors create better-informed clinical development plans and run smaller, more efficient trials, but also allows patients to enroll in trials sooner, which in turn helps bring innovative new therapies to market faster.

While there is a clear need to advance data-sharing practices, certain issues continue to hold the industry back. First, stakeholders have maintained a mind-set that everything is intellectual property and almost all data is proprietary. Although some data should be considered proprietary, it is important to differentiate what can be safely shared to further scientific knowledge in a therapeutic area or about a mechanism of action. Second, the industry has yet to fine-tune technology that standardizes the way clinical data is collected. Even with mandatory standards in place, such as the Clinical Data Interchange Standards Consortium (CDISC) data standards, it remains difficult to pool data sets and to integrate claims and electronic health record (EHR) analysis to help inform clinical development plans or interpret trial results.

Despite these challenges, the biopharmaceutical industry has made promising progress, with a few well-established initiatives already influencing data-sharing processes. For example, Project DataSphere is a free digital library-laboratory that provides one location where the research community can share patient data from Phase 3 oncology clinical trials. Thus far, the platform has approximately 170 data sets, 30 data providers, 13,000 downloads, and 2,000 users, making it easier to develop effective solutions for cancer patients. and Vivli are two additional global data-sharing and analytics platforms designed to promote, coordinate, and facilitate the reuse of clinical data. Because of these efforts, industry stakeholders can more efficiently share, integrate, and access the data they need to improve trial results.

TransCelerate BioPharma Inc. is also designing solutions to promote data sharing, starting with the Placebo Standard of Care (PSoC) initiative. Available to all 19 member companies, the PSoC initiative enables sharing of de-identified data from the placebo and standard of care control arms of clinical trials. The initiative has helped produce valuable insights by improving trial design, speeding up study execution, and reducing participant burden. In fact, as a member of TransCelerate, Roche directly experienced the value of PSoC during a recent study that aimed to understand the implications of varying regional adverse event reporting rates and placebo responses for rheumatoid arthritis trials. The study's success depended largely on patient-level data borrowed from another member company. Having access to this data ultimately helped inform decisions about Roche’s Phase 3 recruitment strategies and contextualize results from a Phase 2 interim analysis.

In an effort to enhance progress toward more efficient data-sharing practices, TransCelerate recently launched its new technology platform, DataCelerate. The global platform will allow multiple de-identified research and development data types to be submitted, uploaded, converted, harmonized, and downloaded through an access-controlled, secured environment. Through this network, there is the potential to connect clean, converted clinical and preclinical information with one “data lake” solution. Through BioCelerate, the first data set, preclinical toxicology data, is live and accessible in DataCelerate. TransCelerate’s PSoC data repository will be migrated into DataCelerate, expanding the volume of data to include over 85,000 patients and more than 130 studies in nearly 20 therapeutic areas ranging from diabetes to rare conditions such as Duchenne muscular dystrophy. Looking forward, DataCelerate will explore how connections can be made between and among existing platforms.

However, what makes the platform truly unique is the collaborative environment in which the platform was built. The value of this system can only grow as data volume increases, data types expand, and disparate forms of data are able to connect within disease areas or drug classes, generating benefits for sponsors and patients. Access to this platform now and in the future can make a powerful, lasting impact on clinical research.

As mentioned previously, having access to data is critical, but standardizing that data is equally important. Because data standards provide consistency in the structure and meaning of data, they can increase both the quality of and confidence in the data. For instance, standardized data sets are easier to combine, and the combined data can then be used to explore safety and efficacy further and to test disease model hypotheses. Clinical data standards applied across sponsor companies also make it easier for health authorities to review the study data provided in regulatory submissions and to evaluate safety and efficacy across medicines submitted by various sponsor companies.

Furthermore, clinical trial information isn’t just collected and reported. There’s a lot that happens in between, and sponsors must have a clear view of potential gray areas. Health authorities must see that information can be traced backward and forward, like vertebrae on a spine. Use of agreed-upon data standards can help answer some questions from health authorities, such as:

  • How did you collect the data? What was the source? If it was an eSource, was it an electronic patient diary, an EKG machine, imaging, etc.?
  • What transformations did you apply post-collection? How did you represent the collected data in tabulation format?
  • How did you use the tabulation data to generate your analysis data sets?
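To make the idea of traceability concrete, the three questions above can be sketched as a small record that carries its own history from source to analysis. This is only an illustration under assumed names: `TraceStep`, `TracedValue`, and the pain-score example are hypothetical and do not come from CDISC or any sponsor's system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TraceStep:
    """One stage in the life of a data point (illustrative names only)."""
    stage: str        # "collection", "tabulation", or "analysis"
    description: str  # what happened at this stage
    source: str       # where the input to this stage came from


@dataclass
class TracedValue:
    """A value that keeps a record of every step applied to it."""
    name: str
    value: float
    history: List[TraceStep] = field(default_factory=list)

    def record(self, stage: str, description: str, source: str) -> "TracedValue":
        # Append a step so the value can later be traced forward or backward.
        self.history.append(TraceStep(stage, description, source))
        return self

    def trace_back(self) -> List[str]:
        """Return the stages from the most recent back to the original source."""
        return [step.stage for step in reversed(self.history)]


def build_example() -> TracedValue:
    # Hypothetical data point: a pain score entered in an electronic patient diary.
    score = TracedValue(name="pain_score", value=4.0)
    score.record("collection", "entered by the patient", source="electronic patient diary")
    score.record("tabulation", "mapped to a standard tabulation variable", source="collection")
    score.record("analysis", "used to derive change from baseline", source="tabulation")
    return score
```

Reading `history` in order answers the questions going forward (how the data was collected, transformed, and analyzed), while `trace_back()` walks from an analysis result to its source.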

Overall, implementing data standards increases the likelihood that the data will be understood, which will give health authorities more confidence in the data and ultimately enable them to make better decisions.

Moving forward, the data sharing journey will evolve as technology continues to innovate, but perhaps more important to the trajectory of data sharing in clinical research is the exploration and identification of what data can be shared across industries and how we can make that data more digestible and meaningful to the stakeholders we serve most — patients.

About The Author:

Virginia Nido is the global head of product development industry collaborations for Roche and Genentech. She serves on the Oversight Committee of TransCelerate BioPharma; the Executive and Steering Committees of the Clinical Trials Transformation Initiative (CTTI); and the Board of Trustees of the Association of Clinical Research Professionals (ACRP). She is also a Global Impact Partner of the Society for Clinical Research Sites (SCRS). She holds an MSEd from the University of Pennsylvania and a BA from Barnard College. With over 20 years in the industry, Nido is passionate about improving and transforming clinical trials through industry collaborations.